Training speaker-dependent, phrase-based speech grammars using an unsupervised automated technique

US 20050261901A1
Filed: 05/19/2004
Published: 11/24/2005
Est. Priority Date: 05/19/2004
Status: Active Grant

Abstract

The present invention can include a method for tuning grammar option weights of a phrase-based, automatic speech recognition (ASR) grammar, where the grammar option weights affect which entries within the grammar are matched to spoken utterances. The tuning can occur in an unsupervised fashion, meaning no special training session or manual transcription of data from an ASR session is needed. The method can include the step of selecting a phrase-based grammar to use in a communication session with a user, wherein different phrase-based grammars can be selected for different users. Feedback of ASR phrase processing operations can be recorded during the communication session. Each ASR phrase processing operation can match a spoken utterance against at least one entry within the selected phrase-based grammar. At least one of the grammar option weights can be automatically adjusted based upon the feedback to improve accuracy of the phrase-based grammar.
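The feedback loop described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `PhraseGrammar` class, the additive update, the step size of 0.1, the weight floor of 0.1, and the sample phrases.

```python
from dataclasses import dataclass, field


@dataclass
class PhraseGrammar:
    """Hypothetical speaker-dependent grammar: phrase -> grammar option weight."""
    weights: dict = field(default_factory=dict)

    def adjust(self, phrase: str, success: bool, step: float = 0.1) -> None:
        # Positive feedback raises the weight, negative feedback lowers it.
        # The floor keeps an entry from becoming impossible to match
        # (the floor is an assumption of this sketch).
        current = self.weights.get(phrase, 1.0)
        delta = step if success else -step
        self.weights[phrase] = max(0.1, current + delta)


def tune(grammar: PhraseGrammar, feedback_log) -> PhraseGrammar:
    """Apply recorded (phrase, success) feedback with no manual transcription."""
    for phrase, success in feedback_log:
        grammar.adjust(phrase, success)
    return grammar


# Feedback recorded during one hypothetical communication session.
grammar = PhraseGrammar({"call home": 1.0, "call Rome": 1.0})
tune(grammar, [("call home", True), ("call Rome", False), ("call home", True)])
```

After this session the weight for "call home" is higher than the weight for "call Rome", so the grammar favors the entry the speaker actually used.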

20 Claims

1. A method for tuning grammar option weights of a phrase-based, automatic speech recognition (ASR) grammar in an unsupervised fashion comprising the steps of:
- recording feedback of ASR phrase processing operations during a communication session, wherein each ASR phrase processing operation matches a spoken utterance against at least one entry within a speaker dependent, phrase-based grammar, said grammar having a plurality of grammar option weights, wherein the grammar option weights affect which entries are matched to the spoken utterances;
  
  for each of the ASR phrase processing operations, determining whether the phrase processing operation was successfully performed based upon the feedback; and
  
  for each of the ASR phrase processing operations, automatically adjusting at least one of the grammar option weights based upon results of the determining step to improve accuracy of the grammar.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 12, 13):
  - 2. The method of claim 1, said method further comprising the step of:
    - selecting one of a plurality of grammars as the grammar based upon an identity of a user that provides the utterances, wherein the selected grammar is utilized exclusively for ASR processing operations involving the user.
  - 3. The method of claim 2, wherein each of said adjusting steps occurs proximate in time to, and responsive to, the ending of the communication session.
  - 4. The method of claim 1, said method further comprising the steps of:
    - identifying vocal characteristics for a user that provides the utterances; and
      
      selecting one of a plurality of grammars as the grammar based upon the vocal characteristics, wherein the selected grammar is utilized by a plurality of different users, each user having the identified vocal characteristics.
  - 5. The method of claim 4, wherein said method is performed periodically in batch, where a batch adjusts grammar option weights for the grammar using feedback recorded during a plurality of communication sessions.
  - 6. The method of claim 1, said adjusting step further comprising:
    - when feedback for the ASR phrase processing operation is positive, adjusting the grammar option weight to increase a likelihood of matching entries in the grammar that are associated with the grammar option weight; and
      
      when feedback for the ASR phrase processing operation is negative, adjusting the grammar option weight to decrease a likelihood of matching entries in the grammar that are associated with the grammar option weight.
  - 7. The method of claim 6, wherein the feedback for at least one ASR phrase processing operation includes at least a portion of an n-best list of phrases, wherein said adjusting step adjusts a plurality of grammar option weights.
  - 8. The method of claim 7, wherein each entry in the n-best list is associated with a score, said method further comprising the steps of:
    - statistically analyzing the scores associated with ordered entries in the n-best list to determine a break point between entries; and
      
      for each entry up to the break point, adjusting a grammar option weight associated with the entry.
  - 9. The method of claim 1, wherein the grammar option weights are context-dependent.
  - 12. The machine-readable storage of claim 11, further causing the machine to perform the steps of:
    - identifying when one of the individual utterances has been incorrectly matched based upon the feedback; and
      
      responsive to said identifying step, adjusting at least one parameter within the identified phrase-based grammar so that the likelihood score associated with the topmost entry in the n-best list is decreased when the ASR computer program next processes an utterance similar to the incorrectly identified utterance in a session involving the identified phrase-based grammar.
  - 13. The machine-readable storage of claim 12, further causing the machine to perform the steps of:
    - determining at least one entry in the n-best list having a likelihood score that is statistically close to the likelihood score associated with the topmost entry; and
      
      responsive to said determining step, adjusting at least one parameter within the identified phrase-based grammar so that each likelihood score associated with each entry determined to be statistically close to the topmost entry is decreased when the ASR computer program next processes an utterance similar to the incorrectly identified utterance in a session involving the identified phrase-based grammar.
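The break-point step of claim 8 can be sketched as follows. The claim does not name a particular statistic, so this sketch assumes the break point is the largest drop between consecutive scores in an n-best list sorted from best to worst. The function names, step size, and sample phrases are hypothetical.

```python
def find_break_point(scores):
    """Return how many leading entries come before the largest score drop.

    Assumes `scores` is ordered from best to worst.
    """
    if len(scores) < 2:
        return len(scores)
    gaps = [scores[i] - scores[i + 1] for i in range(len(scores) - 1)]
    return gaps.index(max(gaps)) + 1


def adjust_up_to_break(weights, n_best, success, step=0.1):
    """Adjust the weight of every n-best entry that precedes the break point."""
    cut = find_break_point([score for _, score in n_best])
    delta = step if success else -step
    for phrase, _ in n_best[:cut]:
        weights[phrase] = weights.get(phrase, 1.0) + delta
    return weights


# Hypothetical n-best list of (phrase, score) pairs for one utterance.
n_best = [("call home", 0.91), ("call Rome", 0.88),
          ("call Jerome", 0.52), ("hall foam", 0.49)]
weights = adjust_up_to_break({}, n_best, success=False)
```

Here the largest drop falls between the second and third entries, so only the weights for "call home" and "call Rome" are lowered after the negative feedback.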

10. A machine-readable storage having stored thereon an automatic speech recognition (ASR) computer program having a plurality of code sections, said code sections executable by a machine for causing the machine to perform the steps of:
- identifying a phrase-based grammar to use in a communication session with a user, wherein different phrase-based grammars are used for different users;
  
  recording feedback of ASR phrase processing operations during the communication session, wherein each ASR phrase processing operation matches a spoken utterance against at least one entry within the identified phrase-based grammar, said phrase-based grammar having a plurality of grammar option weights, wherein the grammar option weights affect which entries are matched to the spoken utterances; and
  
  automatically adjusting at least one of the grammar option weights based upon the feedback to improve accuracy of the identified phrase-based grammar.
- Dependent claims (11, 14, 15):
  - 11. The machine-readable storage of claim 10, wherein the feedback includes at least part of an n-best list of ASR matched entries associated with individual utterances processed during the communication session, each ASR matched entry having an associated likelihood score.
  - 14. The machine-readable storage of claim 11, further causing the machine to perform the steps of:
    - identifying when one of the individual utterances has been correctly matched based upon the feedback; and
      
      responsive to said identifying step, adjusting a parameter within the identified phrase-based grammar so that the likelihood score associated with the topmost entry in the n-best list is increased when the ASR computer program next processes an utterance similar to the correctly identified phrase in a session involving the identified phrase-based grammar.
  - 15. The machine-readable storage of claim 14, further causing the machine to perform the steps of:
    - determining at least one entry in the n-best list having a likelihood score that is statistically close to the likelihood score associated with the topmost entry; and
      
      responsive to said determining step, adjusting at least one parameter within the identified phrase-based grammar so that each likelihood score associated with each entry determined to be statistically close to the topmost entry is increased when the ASR computer program next processes an utterance similar to the correctly identified utterance in a session involving the identified phrase-based grammar.
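Claims 12 through 15 adjust the topmost n-best entry together with entries whose likelihood scores are "statistically close" to it. One possible reading is sketched below, assuming that "close" means within one population standard deviation of the top score. That threshold, the function names, the additive parameter update, and the sample phrases are assumptions of this sketch and are not specified in the claims.

```python
from statistics import pstdev


def close_to_top(n_best, k=1.0):
    """Return phrases scoring within k standard deviations of the top entry.

    Assumes `n_best` is a list of (phrase, likelihood) pairs, best first.
    The topmost entry is always included.
    """
    if not n_best:
        return []
    scores = [score for _, score in n_best]
    margin = k * pstdev(scores)
    top = scores[0]
    return [phrase for phrase, score in n_best if top - score <= margin]


def apply_feedback(params, n_best, correct, step=0.05):
    """Raise parameters after a correct match, lower them after an incorrect one."""
    sign = 1.0 if correct else -1.0
    for phrase in close_to_top(n_best):
        params[phrase] = params.get(phrase, 1.0) + sign * step
    return params


# Hypothetical n-best list for an utterance that was matched incorrectly.
n_best = [("check balance", 0.80), ("check ballots", 0.78), ("chess balance", 0.40)]
params = apply_feedback({}, n_best, correct=False)
```

In this example the first two entries fall inside the margin and have their parameters lowered, while "chess balance" is left untouched.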

16. An automatic speech recognition (ASR) system comprising:
- an identification unit configured to match a speaker to a speaker-dependent ASR grammar;
  
  an information collection unit configured to gather feedback in real time concerning whether a plurality of utterances have been correctly processed by the ASR grammar during an ASR session involving the speaker; and
  
  a logic unit configured to utilize said feedback to tune the ASR grammar, wherein when an utterance has been correctly processed, at least one parameter in the ASR grammar is adjusted to increase a likelihood that the ASR system processes phrases in a similar fashion in future ASR operations involving the ASR grammar, and when an utterance has been incorrectly processed, at least one parameter in the ASR grammar is adjusted to decrease a likelihood that the ASR system processes phrases in a similar fashion in future ASR operations involving the ASR grammar.
- Dependent claims (17, 18, 19, 20):
  - 17. The system of claim 16, wherein the feedback gathered by the information collection unit for each ASR processed phrase comprises:
    - a plurality of possible matching entries determined by the ASR system; and
      
      for each possible matching entry, a likelihood score that indicates the likelihood of the associated possible matching phrase being an accurate textual representation of an utterance.
  - 18. The system of claim 16, wherein the logic unit adjusts the ASR grammar to affect a plurality of possible matching entries responsive to a single ASR processed utterance.
  - 19. The system of claim 16, wherein each entry in the ASR grammar has at least one associated grammar option weight, wherein the logic unit adjusts the grammar option weights.
  - 20. The system of claim 19, wherein a plurality of different grammar option weights can be associated with a single entry in the ASR grammar, each of the different grammar option weights corresponding to a particular context.
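The context-dependent weights of claims 9 and 20 can be pictured as a table keyed by both grammar entry and context. The sketch below is only one possible layout. The `ContextWeights` class, the default weight of 1.0, the step size, and the context names are hypothetical.

```python
from collections import defaultdict


class ContextWeights:
    """Hypothetical store holding one weight per (entry, context) pair."""

    def __init__(self, default=1.0):
        # Unseen (entry, context) pairs start at the default weight.
        self._weights = defaultdict(lambda: default)

    def get(self, entry, context):
        return self._weights[(entry, context)]

    def adjust(self, entry, context, correct, step=0.1):
        # Only the weight for this specific context changes, so the same
        # entry can be favored in one context and penalized in another.
        self._weights[(entry, context)] += step if correct else -step


store = ContextWeights()
store.adjust("transfer", "banking_menu", correct=True)
store.adjust("transfer", "call_routing", correct=False)
```

After these two updates the entry "transfer" carries a higher weight in the hypothetical "banking_menu" context than in "call_routing", while any other context still sees the default.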

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Davis, Brent L.; Jaiswal, Peeyush; Wang, Fang
Granted Patent: US 7,778,830 B2
US Class Current: 704/235
CPC Class Codes:
G10L 15/183 using context dependencies,...
G10L 15/19 Grammatical context, e.g. d...
