System and method for verifying origin of input through spoken language analysis

US 8,489,399 B2
Filed: 06/15/2009
Issued: 07/16/2013
Est. Priority Date: 06/23/2008
Status: Active Grant

2 Assignments

0 Petitions

Abstract

An audible based electronic challenge system is used to control access to a computing resource by using a test to identify an origin of a voice. The test is based on analyzing a spoken utterance to determine if it was articulated by an unauthorized human or a text to speech (TTS) system.

149 Citations

19 Claims

1. A method of controlling access to a computing system comprising:
- selecting first test text data to be articulated as a first speech utterance by a first entity providing input to the computing system based on measurable differences in articulation between a human speaker and a machine for the first test text data exceeding a target threshold;
  
  storing a voice print for the first entity at the computing system based on said first speech utterance being converted into recognized speech data;
  
  wherein said first entity can include either a human or a computer using a synthesized voice;
  
  receiving a second speech utterance by a second entity;
  
  processing said second recognized speech data with said computing system to determine whether said second speech utterance also originated from said first entity; and
  
  controlling whether said second entity is allowed to access an account and/or data based on comparing said voice print to said second recognized speech data.
- - 2. The method of claim 1, wherein said access is used for one or more of the following:
    - a) establishing an online account; and/or

      b) accessing an online account; and/or

      c) establishing a universal online ID; and/or

      d) accessing a universal online ID; and/or

      e) sending email; and/or

      f) accessing email; and/or

      g) posting on a message board; and/or

      h) posting on a web log; and/or

      i) posting on a social network site page;

      j) buying or selling on an auction site; and/or

      k) posting a recommendation for an item/service; and/or

      l) selecting an electronic ad.
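
As an illustration only, the flow recited in claim 1 can be sketched in Python. The claim does not say how the articulation difference is measured or how a voice print is compared, so the pre-measured `human_machine_gap` value, the feature vectors, the cosine similarity measure, and the `min_similarity` cutoff below are all assumptions introduced for this sketch, not details taken from the patent.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Candidate:
    text: str
    # Hypothetical pre-measured difference between human and machine articulation.
    human_machine_gap: float


def select_test_text(candidates, target_threshold):
    """Keep only texts whose human/machine articulation gap exceeds the threshold."""
    return [c.text for c in candidates if c.human_machine_gap > target_threshold]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed comparison)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def allow_access(voice_print, second_features, min_similarity=0.9):
    """Grant access only if the second utterance resembles the stored voice print."""
    return cosine_similarity(voice_print, second_features) >= min_similarity


candidates = [Candidate("easy phrase", 0.2), Candidate("hard phrase", 0.8)]
chosen = select_test_text(candidates, target_threshold=0.5)
```

Here `chosen` contains only the phrase whose gap exceeds the threshold, and `allow_access` stands in for the final access control step.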

3. A method of identifying a source of data input to a computing system using prosodic elements of speech comprising:
- a) presenting a challenge item to an entity, which challenge item is associated with a reference set of words and associated reference prosodic scores;
  
  b) receiving speech utterance from an entity related to said challenge item including an input set of words;
  
  c) processing said speech utterance with said computing system to compute input prosodic scores of said input set of words;
  
  d) comparing said input prosodic scores and said reference prosodic scores; and
  
  e) generating a determination of whether said speech utterance originated from a machine or a human based on step (d);
  
  wherein said challenge item is supplemented with visual cues, said visual cues being adapted to induce said reference prosodic scores.
- - 4. The method of claim 3 further including the step:
    - recognizing said input set of words to compute an additional prosodic score based on an identity of said input set of words, and comparing said additional prosodic words to a second reference prosodic score related to a content of said reference set of words.
  - 5. The method of claim 3, wherein said visual cues are selected from a database of visual cues determined by reference to a database of human vocalizations to most likely result in said reference prosodic scores.
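
Steps (c) through (e) of claim 3 compare input prosodic scores against reference prosodic scores. A minimal Python sketch of one way to do that follows. It assumes one numeric score per word, that the reference scores describe the expected human rendition, and that a mean absolute deviation under a hypothetical `tolerance` indicates a human. None of these choices is specified in the claim.

```python
def prosodic_distance(input_scores, reference_scores):
    """Mean absolute difference between per-word input and reference scores."""
    if not reference_scores or len(input_scores) != len(reference_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    total = sum(abs(i - r) for i, r in zip(input_scores, reference_scores))
    return total / len(reference_scores)


def classify_origin(input_scores, reference_scores, tolerance=0.15):
    """Label the utterance 'human' when it stays close to the reference scores."""
    distance = prosodic_distance(input_scores, reference_scores)
    return "human" if distance <= tolerance else "machine"


reference = [0.8, 0.2, 0.6]        # hypothetical reference prosodic scores
close_input = [0.75, 0.25, 0.6]    # small deviation from the reference
flat_input = [0.5, 0.5, 0.5]       # monotone delivery, large deviation
```

With these example values, `classify_origin(close_input, reference)` yields `"human"` and `classify_origin(flat_input, reference)` yields `"machine"`.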

6. A method of identifying a source of data input to a computing system using prosodic elements of speech comprising:
- a) presenting a challenge item to an entity, which challenge item is associated with a reference set of words and associated prosodic characteristics;
  
  b) receiving speech utterance from an entity related to said challenge item;
  
  wherein said reference set of words represents a selected set of one or more contiguous words which when vocalized have a measurable difference in prosodic characteristics between a reference human voice and a reference computer synthesized voice that exceeds a target threshold;
  
  c) processing said speech utterance with said computing system to compute first prosodic characteristics of said entity;
  
  d) generating a determination of whether said speech utterance originated from a machine or a human based on step (c).
- - 7. The method of claim 6 further including the steps:
    - estimating a first computer synthesized voice that best correlates to said entity; and
      
      selecting said challenge item based on an identity of said first entity so as to maximize a difference in prosodic characteristics.
  - 8. The method of claim 6, further including steps:
    - soliciting utterances from a plurality of separate computing machines to determine their respective prosodic characteristics; and
      
      storing said plurality of associated prosodic characteristics in a database of known computing entities.
  - 9. The method of claim 8, wherein multiple samples of individual challenge sentences are collected.
  - 10. The method of claim 6 wherein visual cues are added to induce said entity to vocalize said reference set of words using said reference human voice.
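
Claims 7 and 8 describe keeping a database of known machine voices and picking the challenge item that maximizes the prosodic difference for the best-matching machine. The Python sketch below is one possible reading. The single-number prosodic characteristic per item, the profile names `tts_x` and `tts_y`, and the squared-distance matching are hypothetical and serve only to make the idea concrete.

```python
# Hypothetical prosodic characteristic of a reference human voice, per challenge item.
HUMAN_REFERENCE = {"item_a": 0.9, "item_b": 0.4}

# Hypothetical database of known machine voices (claim 8), same items.
KNOWN_MACHINES = {
    "tts_x": {"item_a": 0.85, "item_b": 0.1},
    "tts_y": {"item_a": 0.3, "item_b": 0.35},
}


def closest_machine(observed, machines):
    """Return the name of the machine profile nearest to the observed values."""
    def squared_distance(profile):
        return sum((observed[item] - profile[item]) ** 2 for item in observed)

    return min(machines, key=lambda name: squared_distance(machines[name]))


def best_challenge(machine_name, human_reference, machines, target_threshold):
    """Pick the item with the largest human/machine gap, if it exceeds the threshold."""
    profile = machines[machine_name]
    gaps = {item: abs(human_reference[item] - profile[item]) for item in human_reference}
    item = max(gaps, key=gaps.get)
    return item if gaps[item] > target_threshold else None


observed = {"item_a": 0.8, "item_b": 0.15}
match = closest_machine(observed, KNOWN_MACHINES)
challenge = best_challenge(match, HUMAN_REFERENCE, KNOWN_MACHINES, target_threshold=0.2)
```

Returning `None` when no gap exceeds the threshold reflects the requirement in claim 6 that the selected words show a measurable difference above a target threshold.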

11. A method of implementing a CAPTCHA (Completely Automatic Public Turing Test To Tell Humans And Computers Apart) to identify a source of data input to a computing system comprising:
- a) using a set of human test subjects to identify whether a reference challenge item was vocalized by a human or a computer;
  
  b) training the computing system with samples of human voices and computer synthesized voices articulating a set of the reference challenge items;
  
  c) receiving a speech utterance from an entity related to one of said set of reference challenge items;
  
  d) determining with the trained computer system whether said speech utterance was vocalized by a machine or a human.
- - 12. The method of claim 11, wherein said reference challenge items are ranked and sorted according to a score provided by said human test subjects, and further including a step:
    - presenting said one of said set of reference challenge items based on said score.
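
Claims 11 and 12 combine human listener scoring of challenge items with a system trained on both human and synthesized samples. The following Python sketch shows the general shape under stated assumptions: the listener score is taken to be the fraction of correct judgments, and a nearest-centroid classifier over made-up two-dimensional feature vectors stands in for whatever trained model the patent contemplates.

```python
def rank_items(listener_votes):
    """Sort challenge items by the fraction of listeners who judged them correctly.

    listener_votes maps an item name to a list of booleans (True = correct judgment).
    """
    scores = {item: sum(votes) / len(votes) for item, votes in listener_votes.items()}
    return sorted(scores, key=scores.get, reverse=True)


def train_centroids(samples):
    """Average the feature vectors of each label; samples is a list of (label, vector)."""
    sums, counts = {}, {}
    for label, vector in samples:
        acc = sums.setdefault(label, [0.0] * len(vector))
        for i, value in enumerate(vector):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, vector):
    """Return the label whose centroid is closest to the given feature vector."""
    def squared_distance(centroid):
        return sum((c - v) ** 2 for c, v in zip(centroid, vector))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


votes = {"s1": [True, True, False], "s2": [True, True, True], "s3": [False, False, True]}
ranking = rank_items(votes)

training = [
    ("human", [0.9, 0.8]), ("human", [0.7, 0.6]),
    ("machine", [0.2, 0.1]), ("machine", [0.4, 0.3]),
]
model = train_centroids(training)
```

A deployment along the lines of claim 12 would present items from the top of `ranking` first, then pass features of the received utterance to `classify`.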

13. A method of implementing a CAPTCHA (Completely Automatic Public Turing Test To Tell Humans And Computers Apart) to identify a source of data input to a computing system comprising:
- a) training the computing system with samples of human voices articulating a set of reference challenge items;
  
  b) receiving a speech utterance from an entity related to one of said set of reference challenge items;
  
  c) determining with the trained computer system whether said speech utterance was vocalized by a machine or a human;
  
  wherein said computing system uses one or more speech models that are optimized for identifying humans using said set of reference challenge items; and
  
  wherein said set of reference challenge items represent a selected set of one more contiguous words which when articulated have a difference in acoustical characteristics between a reference human voice and a reference computer synthesized voice that exceeds a target threshold as measured by a reference group of human listeners, and at least some of said acoustical characteristics are used to train said one or more speech models.
- - 14. The method of claim 13, including a step:
    - using a set of human test subjects to identify whether a reference challenge item was vocalized by a human or a computer prior to using it in the training of the computing system.
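
Unlike claim 11, claim 13 trains the system on human samples only. One simple way to picture such a human-only model is a one-class check: fit statistics to human samples and accept an utterance if it falls inside the human range. The Python sketch below assumes a single scalar acoustic feature and a hypothetical z-score bound `max_z`. The patent refers to speech models generally and does not prescribe this approach.

```python
from statistics import mean, stdev


def fit_human_model(human_values):
    """Fit mean and sample standard deviation to human-only training values."""
    return mean(human_values), stdev(human_values)


def is_human(model, value, max_z=2.0):
    """Accept a value as human if it lies within max_z standard deviations."""
    mu, sigma = model
    if sigma == 0:
        return value == mu
    return abs(value - mu) / sigma <= max_z


# Hypothetical acoustic feature measured on human readings of one challenge item.
human_values = [0.50, 0.55, 0.45, 0.60, 0.40]
human_model = fit_human_model(human_values)
```

With these values, an input of 0.52 falls well inside the human range, while 0.9 lies several standard deviations away and would be treated as machine-originated.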

15. A challenge system for identifying a source of data input to a computing system comprising:
- one or more software routines implemented in a computer readable medium and adapted to cause the challenge system to;
  
  select first test text data to be articulated as a first speech utterance by a first entity providing input to the computing system, based on measurable differences in articulation between a human speaker and a machine for the first test text data exceeding a target threshold;
  
  store a voice print for the first entity at the computing system based on said first speech utterance being converted into recognized speech data;
  
  wherein said first entity can include either a human or a computer using a synthesized voice;
  
  receive a second speech utterance by a second entity;
  
  process said second recognized speech data with said computing system to determine whether said second speech utterance also originated from said first entity;
  
  control whether said second entity is allowed to access an account and/or data based on comparing said voice print to said second recognized speech data.

16. A challenge system for identifying a source of data input to a computing system using prosodic elements of speech comprising:
- one or more software routines implemented in a computer readable medium and adapted to cause the challenge system to;
  
  a) present a challenge item to an entity, which challenge item is associated with a reference set of words and associated reference prosodic scores;
  
  b) receive speech utterance from an entity related to said challenge item including an input set of words;
  
  c) process said speech utterance with said computing system to compute input prosodic scores of said input set of words;
  
  d) compare said input prosodic scores and said reference prosodic scores; and
  
  e) generate a determination of whether said speech utterance originated from a machine or a human based on step (d);
  
  wherein said challenge item is supplemented with visual cues, said visual cues being adapted to induce said reference prosodic scores.

17. A challenge system for identifying a source of data input to a computing system using prosodic elements of speech comprising:
- one or more software routines implemented in a computer readable medium and adapted to cause the challenge system to;
  
  a) present a challenge item to an entity, which challenge item is associated with a reference set of words and associated prosodic characteristics;
  
  b) receive speech utterance from an entity related to said challenge item;
  
  wherein said reference set of words represents a selected set of one or more contiguous words which when vocalized have a measurable difference in prosodic characteristics between a reference human voice and a reference computer synthesized voice that exceeds a target threshold;
  
  c) process said speech utterance with said computing system to compute first prosodic characteristics of said entity;
  
  d) generate a determination of whether said speech utterance originated from a machine or a human based on step (c).

18. A system for implementing a CAPTCHA (Completely Automatic Public Turing Test To Tell Humans And Computers Apart) to identify a source of data input to a computing system comprising:
- one or more software routines implemented in a computer readable medium and adapted to cause the challenge system to;
  
  a) use a set of human test subjects to identify whether a reference challenge item was vocalized by a human or a computer;
  
  b) train the computing system with samples of human voices and computer synthesized voices articulating a set of the reference challenge items;
  
  c) receive a speech utterance from an entity related to one of said set of reference challenge items;
  
  d) determine with the trained computer system whether said speech utterance was vocalized by a machine or a human.

19. A system for implementing a CAPTCHA (Completely Automatic Public Turing Test To Tell Humans And Computers Apart) to identify a source of data input to a computing system comprising:
- one or more software routines implemented in a computer readable medium and adapted to cause the challenge system to;
  
  a) train the computing system with samples of human voices articulating a set of reference challenge items;
  
  b) receive a speech utterance from an entity related to one of said set of reference challenge items;
  
  c) determine with the trained computer system whether said speech utterance was vocalized by a machine or a human;
  
  wherein said computing system uses one or more speech models that are optimized for identifying humans using said set of reference challenge items wherein said set of reference challenge items represent a selected set of one more contiguous words which when articulated have a difference in acoustical characteristics between a reference human voice and a reference computer synthesized voice that exceeds a target threshold as measured by a reference group of human listeners, and at least some of said acoustical characteristics are used to train said one or more speech models.

Current Assignee: Knapp Investment Company Limited
Original Assignee: John Nicholas and Kristin Gross Trust
Inventors: Gross, John Nicholas
Primary Examiner(s): COLUCCI, MICHAEL C
Application Number: US12/484,837
Publication Number: US 20090319274A1
Time in Patent Office: 1,492 Days
Field of Search: 704/246, 705/18
US Class Current: 704/260

CPC Class Codes
G06F 21/32   using biometric data, e.g. ...
G10L 13/027   Concept to speech synthesis...
G10L 13/08   Text analysis or generation...
G10L 15/02   Feature extraction for spee...
G10L 15/063   Training
G10L 15/22   Procedures used during a sp...
G10L 17/00   Speaker identification or v...
G10L 17/02   Preprocessing operations, e...
G10L 17/04   Training, enrolment or mode...
G10L 17/06   Decision making techniques;...
G10L 17/22   Interactive procedures; Man...
G10L 17/26   Recognition of special voic...
H04L 63/102   Entity profiles
H04L 63/123   received data contents, e.g...
H04M 2203/2027   Live party detection
