System and method for generating challenge utterances for speaker verification

US 9,318,114 B2
Filed: 11/24/2010
Issued: 04/19/2016
Est. Priority Date: 11/24/2010
Status: Active Grant

Abstract

Disclosed herein are systems, methods, and non-transitory computer-readable storage media relating to speaker verification. In one aspect, a system receives a first user identity from a second user, and, based on the identity, accesses voice characteristics. The system randomly generates a challenge sentence according to a rule and/or grammar, based on the voice characteristics, and prompts the second user to speak the challenge sentence. The system verifies that the second user is the first user if the spoken challenge sentence matches the voice characteristics. In an enrollment aspect, the system constructs an enrollment phrase that covers a minimum threshold of unique speech sounds based on speaker-distinctive phonemes, phoneme clusters, and prosody. The user then utters the enrollment phrase, and the system extracts voice characteristics for the user from the uttered enrollment phrase. The system generates a user profile, based on the voice characteristics, for generating random challenge sentences according to a grammar.
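
The enrollment aspect of the abstract (a phrase that covers a minimum threshold of unique speech sounds) can be illustrated with a small sketch. The abstract does not say how the phrase is constructed; the greedy coverage strategy, the function name `build_enrollment_phrase`, and the toy lexicon with its phoneme labels are all assumptions made for illustration only.

```python
# Toy pronunciation lexicon: word -> phoneme labels.
# The words and labels are illustrative assumptions, not taken from the patent.
LEXICON = {
    "blue": ["B", "L", "UW"],
    "river": ["R", "IH", "V", "ER"],
    "shines": ["SH", "AY", "N", "Z"],
    "today": ["T", "AH", "D", "EY"],
    "zebra": ["Z", "IY", "B", "R", "AH"],
}


def build_enrollment_phrase(lexicon, min_unique_phonemes):
    """Greedily pick words until enough distinct phonemes are covered.

    Returns the chosen words and the set of phonemes they cover.
    Greedy selection is an assumed strategy, not the patented method.
    """
    covered = set()
    phrase = []
    remaining = dict(lexicon)
    while len(covered) < min_unique_phonemes and remaining:
        # Choose the word that adds the most new phonemes; iterating over
        # sorted keys makes ties resolve alphabetically.
        best = max(sorted(remaining),
                   key=lambda w: len(set(remaining[w]) - covered))
        gain = set(remaining.pop(best)) - covered
        if not gain:
            break  # no remaining word adds anything new
        covered |= gain
        phrase.append(best)
    return phrase, covered


phrase, covered = build_enrollment_phrase(LEXICON, min_unique_phonemes=8)
print(" ".join(phrase), sorted(covered))
```

A real system would presumably also weigh phoneme clusters and prosody, as the abstract mentions; the sketch counts single phonemes only.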

47 Citations

20 Claims

1. A method comprising:
- receiving a username associated with an asserted identity from a user;
  
  based on the username, accessing a user profile comprising voice characteristics;
  
  identifying, using the voice characteristics, a plurality of asserted identity-specific more valuable phonemes determined to be more valuable than second phonemes for verifying the asserted identity;
  
  generating a challenge sentence, based on the voice characteristics, wherein the challenge sentence is generated randomly according to one of a rule and a grammar, and wherein the challenge sentence comprises the plurality of asserted identity-specific more valuable phonemes;
  
  prompting the user to speak the challenge sentence to yield a spoken challenge sentence;
  
  comparing voice characteristics of the spoken challenge sentence to the voice characteristics of the user profile to yield an asserted identity voice score;
  
  comparing the voice characteristics of the spoken challenge sentence with voice characteristics of a set of imposter identities to yield imposter identity voice scores; and
  
  when the claimed identity voice score is within a threshold specific to the user profile, authenticating the user as the asserted identity.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10):
  - 2. The method of claim 1, wherein the challenge sentence is generated to maximize speaker discriminatory ability of the spoken challenge sentence while minimizing a length of the spoken challenge sentence.
  - 3. The method of claim 1, wherein prompting the user to speak the challenge sentence occurs via one of a text display and a text-to-speech voice.
  - 4. The method of claim 1, wherein the authenticating of the user is performed as part of a multi-platform automatic speech recognition engine.
  - 5. The method of claim 1, wherein the authenticating of the user is performed by a cloud-based speech engine shared over multiple speech service applications.
  - 6. The method of claim 1, wherein the voice characteristics are personal level voice characteristics.
  - 7. The method of claim 1, wherein the voice characteristics are general level voice characteristics.
  - 8. The method of claim 1, wherein the challenge sentence is generated according to the grammar to sound semantically correct without conveying meaningful semantic information.
  - 9. The method of claim 1, wherein the challenge sentence is constrained to a maximum length based on an average user memory span.
  - 10. The method of claim 1, wherein generating the challenge sentence further comprises:
    - identifying a slot in a sentence framework;
      
      retrieving a word for the slot based on a personalized lexicon for the user; and
      
      inserting the word into the slot as part of the challenge sentence.
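
The generation step of claim 1, combined with the slot filling of claim 10, can be sketched as follows. This is a minimal illustration under stated assumptions: the sentence framework, the per-slot personalized lexicon, the phoneme labels, and the retry-until-covered loop are hypothetical and are not taken from the specification.

```python
import random

# Hypothetical sentence framework: literal words plus named slots in braces.
FRAMEWORK = ["the", "{adjective}", "{noun}", "{verb}", "quietly"]

# Hypothetical personalized lexicon: slot -> word -> phonemes in that word.
PERSONAL_LEXICON = {
    "adjective": {"shiny": {"SH", "AY", "N", "IY"}, "blue": {"B", "L", "UW"}},
    "noun": {"river": {"R", "IH", "V", "ER"},
             "zebra": {"Z", "IY", "B", "R", "AH"}},
    "verb": {"sings": {"S", "IH", "NG", "Z"},
             "jumps": {"JH", "AH", "M", "P", "S"}},
}


def generate_challenge(framework, lexicon, valuable, rng, max_tries=1000):
    """Randomly fill each slot, retrying until the slot words cover
    every phoneme in `valuable`.

    Only slot words are counted toward coverage; the literal framework
    words are ignored in this sketch.
    """
    for _ in range(max_tries):
        words, covered = [], set()
        for token in framework:
            if token.startswith("{") and token.endswith("}"):
                slot = token[1:-1]
                word = rng.choice(sorted(lexicon[slot]))
                covered |= lexicon[slot][word]
                words.append(word)
            else:
                words.append(token)
        if valuable <= covered:
            return " ".join(words)
    raise ValueError("no sentence covering the requested phonemes was found")


# Assume "SH" and "Z" were identified as the more valuable phonemes
# for the asserted identity.
print(generate_challenge(FRAMEWORK, PERSONAL_LEXICON, {"SH", "Z"},
                         random.Random(0)))
```

Passing a seeded `random.Random` keeps the sketch reproducible for testing; a deployed system would want an unpredictable source so that a recorded utterance cannot be replayed against a later challenge.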

11. A system comprising:
- a processor; and
  
  a computer-readable storage medium having instructions stored which, when executed by the processor, result in the processor performing operations comprising:
  
  receiving a username associated with an asserted identity from a user;
  
  based on the username, accessing a user profile comprising voice characteristics;
  
  identifying, using the voice characteristics, a plurality of asserted identity-specific more valuable phonemes determined to be more valuable than second phonemes for verifying the asserted identity;
  
  generating a challenge sentence, based on the voice characteristics, wherein the challenge sentence is generated randomly according to one of a rule and a grammar, and wherein the challenge sentence comprises the plurality of asserted identity-specific more valuable phonemes;
  
  prompting the user to speak the challenge sentence to yield a spoken challenge sentence;
  
  comparing voice characteristics of the spoken challenge sentence to the voice characteristics of the user profile to yield an asserted identity voice score;
  
  comparing the voice characteristics of the spoken challenge sentence with voice characteristics of a set of imposter identities to yield imposter identity voice scores; and
  
  when the asserted identity voice score is within a threshold specific to the user profile, authenticating the user as the asserted identity.
- Dependent claims (12, 13, 14, 15, 16, 17, 18, 19):
  - 12. The system of claim 11, wherein the challenge sentence is generated to maximize speaker discriminatory ability of the spoken challenge sentence while minimizing a length of the spoken challenge sentence.
  - 13. The system of claim 11, wherein prompting the user to speak the challenge sentence occurs via one of a text display and a text-to-speech voice.
  - 14. The system of claim 11, wherein the authenticating of the user is performed as part of a multi-platform automatic speech recognition engine.
  - 15. The system of claim 11, wherein the authenticating of the user is performed by a cloud-based speech engine shared over multiple speech service applications.
  - 16. The system of claim 11, wherein the voice characteristics are personal level voice characteristics.
  - 17. The system of claim 11, wherein the voice characteristics are general level voice characteristics.
  - 18. The system of claim 11, wherein the challenge sentence is generated according to the grammar to sound semantically correct without conveying meaningful semantic information.
  - 19. The system of claim 11, wherein the challenge sentence is constrained to a maximum length based on an average user memory span.
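
Claims 12 and 19 (like claims 2 and 9) describe a trade-off between discriminatory ability and sentence length, plus a cap on length. One possible reading is a scoring rule over candidate sentences. The per-phoneme value weights, the linear length penalty, the word-count cap, and the name `pick_challenge` below are assumptions for illustration; the claims do not specify a formula.

```python
# Hypothetical per-user phoneme values: higher means more discriminative.
PHONEME_VALUE = {"SH": 1.0, "Z": 0.8, "R": 0.3}

# Hypothetical candidate sentences with their words and phonemes.
CANDIDATES = [
    {"words": ["blue", "river"],
     "phonemes": ["B", "L", "UW", "R", "IH", "V", "ER"]},
    {"words": ["shiny", "zebra"],
     "phonemes": ["SH", "AY", "N", "IY", "Z", "IY", "B", "R", "AH"]},
    {"words": ["the", "shiny", "blue", "zebra", "sings", "by", "the", "river"],
     "phonemes": ["SH", "Z", "R", "B", "L", "UW", "S", "IH", "NG"]},
]


def pick_challenge(candidates, phoneme_value, max_words, length_penalty=0.1):
    """Return the candidate with the best value-minus-length score.

    Candidates longer than `max_words` (a stand-in for a memory-span
    limit) are dropped before scoring.
    """
    def score(candidate):
        value = sum(phoneme_value.get(p, 0.0)
                    for p in set(candidate["phonemes"]))
        return value - length_penalty * len(candidate["words"])

    eligible = [c for c in candidates if len(c["words"]) <= max_words]
    if not eligible:
        raise ValueError("no candidate fits within the length limit")
    return max(eligible, key=score)


best = pick_challenge(CANDIDATES, PHONEME_VALUE, max_words=7)
print(" ".join(best["words"]))
```

With these toy numbers the eight-word candidate is excluded by the cap, and the two-word candidate containing the higher-valued phonemes scores best.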

20. A computer-readable storage device having instructions stored which, when executed by a computing device, result in the computing device performing operations comprising:
- receiving a username associated with an asserted identity from a user;
  
  based on the username, accessing a user profile comprising voice characteristics;
  
  identifying, using the voice characteristics, a plurality of asserted identity-specific more valuable phonemes determined to be more valuable than second phonemes for verifying the asserted identity;
  
  generating a challenge sentence, based on the voice characteristics, wherein the challenge sentence is generated randomly according to one of a rule and a grammar, and wherein the challenge sentence comprises the plurality of asserted identity-specific more valuable phonemes;
  
  prompting the user to speak the challenge sentence to yield a spoken challenge sentence;
  
  comparing voice characteristics of the spoken challenge sentence to the voice characteristics of the user profile to yield an asserted identity voice score;
  
  comparing the voice characteristics of the spoken challenge sentence with voice characteristics of a set of imposter identities to yield imposter identity voice scores; and
  
  when the asserted identity voice score is within a threshold specific to the user profile, authenticating the user as the asserted identity.
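
The comparison and authentication steps shared by claims 1, 11, and 20 can be sketched as below. The claims do not state how voice characteristics are represented or how the imposter scores enter the decision. This sketch assumes fixed-length feature vectors, cosine similarity as the voice score, and normalization of the asserted-identity score against the imposter scores before applying the per-user threshold. All of those choices, and the names `cosine` and `verify`, are illustrative assumptions.

```python
import math
from statistics import mean, pstdev


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(spoken, profile, imposters, user_threshold):
    """Score the spoken challenge against the user profile and a set of
    imposter profiles, then decide.

    Assumption: "within a threshold" is read as "the imposter-normalized
    score is at or above the user's threshold".
    """
    asserted_score = cosine(spoken, profile)
    imposter_scores = [cosine(spoken, imp) for imp in imposters]
    # Guard against a zero spread when all imposter scores are equal.
    spread = pstdev(imposter_scores) or 1e-9
    normalized = (asserted_score - mean(imposter_scores)) / spread
    return normalized >= user_threshold, asserted_score, imposter_scores


# Made-up three-dimensional "voice characteristics" for illustration.
profile = [0.9, 0.1, 0.2]
imposters = [[0.0, 1.0, 0.0], [0.1, 0.2, 1.0], [0.5, 0.5, 0.5]]

accepted, _, _ = verify([1.0, 0.0, 0.2], profile, imposters, user_threshold=2.0)
rejected, _, _ = verify([0.0, 1.0, 0.1], profile, imposters, user_threshold=2.0)
print(accepted, rejected)
```

Normalizing against imposter scores makes the threshold less sensitive to utterances that score high or low against every model, which is one plausible reason to compute imposter scores at all.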

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Zeljkovic, Ilija; Mishra, Taniya; Stent, Amanda; Syrdal, Ann K.; Wilpon, Jay
Primary Examiner(s): He, Jialong
Application Number: US12/954,094
Publication Number: US 20120130714A1
Time in Patent Office: 1,973 Days
Field of Search: 704/246
US Class Current: 1/1
CPC Class Codes:
G10L 15/02   Feature extraction for spee...
G10L 15/08   Speech classification or se...
G10L 17/00   Speaker identification or v...
G10L 17/04   Training, enrolment or mode...
G10L 17/24   the user being prompted to ...
G10L 17/26   Recognition of special voic...
G10L 2015/025   Phonemes, fenemes or fenone...
