Creating statistical language models for spoken CAPTCHAs

  • US 8,949,126 B2
  • Filed: 04/21/2014
  • Issued: 02/03/2015
  • Est. Priority Date: 06/23/2008
  • Status: Active Grant
First Claim

1. A method of constructing and training a statistical language model (SLM) to identify machine utterances of text in a system for audio Completely Automated Public Turing Tests to Tell Computers and Humans Apart (CAPTCHA), comprising:

  • automatically preparing a plurality of candidate challenge items with a computing system, each of the candidate challenge items including one or more words or phrases selected from a document corpus;

  • causing selected ones of the plurality of candidate challenge items to be articulated by at least one machine text-to-speech (TTS) system as candidate articulations;

  • ranking the candidate articulations based on a human listener score attributed to such candidate articulations, which human listener score identifies at least whether a candidate articulation originated from a machine; and

  • training the SLM to recognize machine TTS articulations based on selecting candidate articulations according to said ranking, such that a subset of said plurality of candidate challenge items identified as originating from a machine are used as a seed set in spoken CAPTCHA.
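
The ranking and seed-set steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not state a score scale, a selection threshold, or a model type, so everything here is an assumption made for illustration: the human listener score is taken to be the fraction of listeners who judged an articulation machine-made, the 0.5 cutoff is arbitrary, bigram counts stand in for SLM training, and the names `Articulation`, `rank_articulations`, `select_seed_set`, and `bigram_counts` are hypothetical. The TTS step is omitted; the sample votes are made up.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Articulation:
    """A candidate challenge item after TTS rendering and listener rating (hypothetical type)."""
    text: str            # words or phrase taken from the document corpus
    machine_votes: int   # listeners who judged the audio machine-made
    total_votes: int     # listeners who rated the audio

    @property
    def machine_score(self) -> float:
        # Assumed scoring: share of listeners who identified a machine origin.
        return self.machine_votes / self.total_votes if self.total_votes else 0.0


def rank_articulations(items):
    """Order candidates from most to least clearly machine-made."""
    return sorted(items, key=lambda a: a.machine_score, reverse=True)


def select_seed_set(items, threshold=0.5):
    """Keep ranked candidates whose score meets the (assumed) threshold."""
    return [a for a in rank_articulations(items) if a.machine_score >= threshold]


def bigram_counts(seed_set):
    """Stand-in for SLM training: count word bigrams over the seed set."""
    counts = Counter()
    for item in seed_set:
        tokens = ["<s>"] + item.text.lower().split() + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Made-up example data, for illustration only.
candidates = [
    Articulation("open the door", machine_votes=9, total_votes=10),
    Articulation("blue river valley", machine_votes=2, total_votes=10),
    Articulation("open the window", machine_votes=7, total_votes=10),
]

seed = select_seed_set(candidates)
model = bigram_counts(seed)
```

With these sample votes, `seed` holds the two items that most listeners judged machine-made, and `model` holds the bigram counts drawn from them. A real system would replace the counts with a properly smoothed language model.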
