Non-interactive enrollment in speech recognition

US 6,163,768 A
Filed: 06/15/1998
Issued: 12/19/2000
Est. Priority Date: 06/15/1998
Status: Expired due to Term

Abstract

A computer enrolls a user in a speech recognition system by obtaining data representing a user's speech, the speech including multiple user utterances and generally corresponding to an enrollment text, and analyzing acoustic content of data corresponding to a user utterance. The computer determines, based on the analysis, whether the user utterance matches a portion of the enrollment text. If so, the computer uses the acoustic content of the user utterance to update acoustic models corresponding to the portion of the enrollment text. The computer may determine that the user utterance matches a portion of the enrollment text even when the user has skipped or repeated words of the enrollment text.
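
The flow described in the abstract can be pictured with a rough sketch. It rests on several assumptions that are not in the patent text: utterances are represented as already-hypothesized word lists rather than acoustic data, `toy_matcher` is a hypothetical stand-in for the acoustic analysis (it tolerates skipped words only), and "updating acoustic models" is reduced to recording which utterances would serve as training material for each word.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Portion = Tuple[int, int]  # hypothetical: (start, end) word range in the text
Matcher = Callable[[List[str], Sequence[str]], Optional[Portion]]


def toy_matcher(utt: List[str], text: Sequence[str]) -> Optional[Portion]:
    """Hypothetical matcher: the utterance words must appear in the text in
    order, but words of the text may be skipped between them."""
    pos, first = 0, None
    for word in utt:
        try:
            hit = text.index(word, pos)
        except ValueError:
            return None  # a spoken word is absent: treat as no match
        if first is None:
            first = hit
        pos = hit + 1
    return None if first is None else (first, pos)


def enroll(
    utterances: List[List[str]],
    enrollment_words: Sequence[str],
    matches_portion: Matcher,
) -> Tuple[Dict[str, List[int]], List[int]]:
    """Collect, per word, the indices of utterances usable for adaptation.
    Utterances that match no portion of the text are ignored."""
    training: Dict[str, List[int]] = defaultdict(list)
    ignored: List[int] = []
    for idx, utt in enumerate(utterances):
        portion = matches_portion(utt, enrollment_words)
        if portion is None:
            ignored.append(idx)
            continue
        start, end = portion
        for word in enrollment_words[start:end]:
            if word in utt:  # skipped words contribute no training data
                training[word].append(idx)
    return dict(training), ignored


text = "the quick brown fox jumps over the lazy dog".split()
utterances = [["the", "quick", "fox"], ["hello", "world"],
              ["jumps", "over", "lazy", "dog"]]
training, ignored = enroll(utterances, text, toy_matcher)
```

In this toy run the second utterance is ignored, and the skipped word "brown" receives no training data from the first utterance.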

31 Claims

1. A computer-implemented method for enrolling a user in a speech recognition system, comprising:
- obtaining data representing a user's speech, the speech including multiple user utterances and generally corresponding to an enrollment text;
  
  analyzing acoustic content of data corresponding to a user utterance;
  
  determining, based on the analysis, whether the user utterance matches a portion of the enrollment text; and
  
  if the user utterance matches a portion of the enrollment text, using the acoustic content of the user utterance to update acoustic models corresponding to the portion of the enrollment text, wherein determining whether the user utterance matches a portion of the enrollment text comprises a determination that the user utterance matches when the user has skipped at least one word of the portion of the enrollment text.
  - 2. The method of claim 1, wherein obtaining data comprises obtaining data recorded using a recording device physically separate from the computer implementing the method.
  - 3. The method of claim 2, wherein the recording device comprises a digital recording device, and obtaining data comprises receiving a file from the digital recording device.
  - 4. The method of claim 2, further comprising displaying the enrollment text on a computer display for a user to read aloud, and recording the user's reading of the enrollment text on the recording device.
  - 5. The method of claim 2, further comprising providing the enrollment text on a printed page for a user to read aloud, and recording the user's reading of the enrollment text on the recording device.
  - 6. The method of claim 2, wherein obtaining data comprises receiving signals generated by playing back the user's speech using the recording device.
  - 7. The method of claim 6, wherein the recording device comprises an analog recording device.
  - 8. The method of claim 1, further comprising designating an active portion of the enrollment text, wherein analyzing acoustic content of data corresponding to a user utterance comprises analyzing the data relative to the active portion of the enrollment text.
  - 9. The method of claim 8, wherein analyzing the data relative to the active portion of the enrollment text comprises attempting to match the data to models for words included in the active portion of the enrollment text.
  - 10. The method of claim 8, wherein analyzing the data relative to the active portion of the enrollment text comprises using an enrollment grammar corresponding to the active portion of the enrollment text.
  - 11. The method of claim 8, further comprising identifying a position of a previously analyzed utterance in the enrollment text, wherein designating an active portion of the enrollment text comprises designating an active portion based on the identified position.
  - 12. The method of claim 11, wherein designating the active portion comprises designating a portion that includes text preceding and following the identified position.
  - 13. The method of claim 12, wherein designating the active portion comprises designating a portion that includes a paragraph including the identified position, a paragraph preceding the identified position, and a paragraph following the identified position.
  - 14. The method of claim 1, wherein determining whether the user utterance matches a portion of the enrollment text comprises using an enrollment grammar corresponding to the enrollment text.
  - 15. The method of claim 14, wherein determining whether the user utterance matches a portion of the enrollment text further comprises using a rejection grammar.
  - 16. The method of claim 15, wherein the rejection grammar comprises a phoneme grammar.
  - 17. The method of claim 16, wherein the rejection grammar models an utterance using a set of phonemes that is smaller than a set of phonemes used by the enrollment grammar.
  - 18. The method of claim 1, further comprising ignoring the user utterance upon determining that the user utterance does not match a portion of the enrollment text.
  - 19. The method of claim 1, wherein the at least one word of the enrollment text comprises more than one word of the enrollment text.
  - 20. The method of claim 1, wherein the determining that the user utterance matches comprises a determination that the user utterance matches despite one or more words in the user utterance not found in the portion of the enrollment text.
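
Claims 11 to 13 describe choosing the active portion from the position of the previously analyzed utterance. A minimal sketch of the paragraph-window reading in claim 13 follows. It assumes the enrollment text is already split into a list of paragraphs and that the identified position is a paragraph index; the function name `active_portion` is hypothetical, and how edges of the text are handled is a choice made here, not something the claims state.

```python
from typing import List


def active_portion(paragraphs: List[str], position: int) -> List[str]:
    """Return the paragraph at `position` together with the paragraph
    before it and the paragraph after it, clipped at the text boundaries."""
    if not 0 <= position < len(paragraphs):
        raise IndexError("position is outside the enrollment text")
    start = max(position - 1, 0)
    end = min(position + 2, len(paragraphs))
    return paragraphs[start:end]


paragraphs = ["p0", "p1", "p2", "p3"]
window = active_portion(paragraphs, 2)
```

Restricting analysis to such a window keeps the set of candidate words small while still allowing a reader to jump back or ahead by one paragraph.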

21. Computer software, residing on a computer-readable storage medium, comprising instructions for causing a computer to:
- obtain data representing a user's speech, the speech including multiple user utterances and generally corresponding to an enrollment text;
  
  analyze acoustic content of data corresponding to a user utterance;
  
  determine, based on the analysis, whether the user utterance matches a portion of the enrollment text; and
  
  use the acoustic content of the user utterance to update acoustic models corresponding to a portion of the enrollment text that matches the user utterance;
  
  wherein the instructions configure the computer to determine that the user utterance matches when the user has skipped at least one word of the portion of the enrollment text.
  - 22. The method of claim 21, wherein the at least one word of the enrollment text comprises more than one word of the enrollment text.
  - 23. The method of claim 21, wherein the determining that the user utterance matches comprises a determination that the user utterance matches despite one or more words in the user utterance not found in the portion of the enrollment text.

24. A speech recognition system for enrolling a user, comprising:
- an input device for receiving speech signals; and
  
  a processor configured to:
  
  obtain data representing a user's speech, the speech including multiple user utterances and generally corresponding to an enrollment text;
  
  analyze acoustic content of data corresponding to a user utterance;
  
  determine, based on the analysis, whether the user utterance matches a portion of the enrollment text;
  
  use the acoustic content of the user utterance to update acoustic models corresponding to a portion of the enrollment text that matches the user utterance; and
  
  determine that the user utterance matches when the user has skipped at least one word of the portion of the enrollment text.
  - 25. The method of claim 24, wherein the at least one word of the enrollment text comprises more than one word of the enrollment text.
  - 26. The method of claim 24, wherein the determining that the user utterance matches comprises a determination that the user utterance matches despite one or more words in the user utterance not found in the portion of the enrollment text.

27. A computer-implemented method for enrolling a user in a speech recognition system, comprising:
- obtaining data representing a user's speech from a recording device physically separate from the computer implementing the method, the speech including multiple user utterances and generally corresponding to an enrollment text;
  
  analyzing acoustic content of the obtained data corresponding to a user utterance to identify a sequence of words in the user utterance;
  
  determining, using the sequence of words, whether the user utterance matches a portion of the enrollment text; and
  
  if the user utterance matches a portion of the enrollment text, using the acoustic content of the user utterance to update acoustic models corresponding to the portion of the enrollment text, wherein determining whether the user utterance matches a portion of the enrollment text comprises a determination that the user utterance matches when the sequence of words in the user utterance is different than a sequence of words in the portion of the enrollment text.
  - 28. The method of claim 27, wherein the determination that the user utterance matches comprises a determination that words found both in the user utterance and in the portion of the enrollment text occur in the same order.
  - 29. The method of claim 27, wherein the sequence of words in the user utterance comprises at least one word not present in the portion of the enrollment text.
  - 30. The method of claim 27, wherein the words in the portion of the enrollment text comprise at least one word not present in the sequence of words in the user utterance.
  - 31. The method of claim 27, further comprising selecting the portion of the enrollment text from the enrollment text.
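
The ordering condition in claim 28, combined with the extra and missing words allowed by claims 29 and 30, can be illustrated with a small word-level check. This is one possible reading offered as a sketch, not the patented procedure: it assumes both the utterance and the text portion are available as word lists, the function name `common_words_in_same_order` is hypothetical, and requiring at least one shared word is an added assumption.

```python
from typing import List


def common_words_in_same_order(utterance: List[str], portion: List[str]) -> bool:
    """True when the words shared by the utterance and the text portion
    appear in the same order in both, ignoring words unique to either side.

    Limitation of this sketch: a word repeated on only one side changes
    the shared sequence and therefore fails the check."""
    utterance_words = set(utterance)
    portion_words = set(portion)
    shared_in_utterance = [w for w in utterance if w in portion_words]
    shared_in_portion = [w for w in portion if w in utterance_words]
    # Assumption: with no shared words at all there is nothing to match.
    return bool(shared_in_utterance) and shared_in_utterance == shared_in_portion


portion = "the quick brown fox".split()
skipped = common_words_in_same_order("the quick fox".split(), portion)
inserted = common_words_in_same_order("the very quick brown fox".split(), portion)
reordered = common_words_in_same_order("fox brown quick the".split(), portion)
```

Here `skipped` and `inserted` are both `True`, since the shared words keep their order, while `reordered` is `False`.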

Current Assignee
Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee
Dragon Systems, Inc. (Microsoft Corporation)
Inventors
Gould, Joel M.; Gold, Allan; Albina, Toffee A.; Sherwood, Stefan; Parmenter, David Wilsberg
Primary Examiner(s)
Hudspeth, David R.
Assistant Examiner(s)
Azad, Abul K.

Application Number
US09/094,609
Time in Patent Office
918 Days
Field of Search
704/254, 704/255, 704/256, 704/257, 704/243, 704/244, 704/235, 704/251, 704/252, 704/260
US Class Current
704/235
CPC Class Codes
G10L 15/063 Training
G10L 2015/0638 Interactive procedures
