Automatic grammar tuning using statistical language model generation

US 8,346,555 B2
Filed: 08/22/2006
Issued: 01/01/2013
Est. Priority Date: 08/22/2006
Status: Active Grant

Abstract

The present invention discloses a speech processing solution that utilizes an original speech recognition grammar in a speech recognition system to perform speech recognition operations for multiple recognition instances. Instance data associated with the recognition operations can be stored. A replacement grammar can be automatically generated from the stored instance data, where the replacement grammar is a statistical language model grammar. The original speech recognition grammar, which can be a grammar-based language model grammar or a statistical language model grammar, can be selectively replaced with the replacement grammar. For example, when the tested performance of the replacement grammar is better than that of the original grammar, the replacement grammar can replace the original grammar.

24 Claims

1. A speech processing method comprising acts of:
- utilizing an original speech recognition grammar in a speech recognition system to perform first speech recognition operations for a plurality of recognition instances, the original speech recognition grammar being a grammar-based language model grammar, the first speech recognition operations comprising using the original speech recognition grammar to process first audio data that represents speech utterances;
  
  storing instance data generated based on the first speech recognition operations performed using the original speech recognition grammar;
  
  automatically generating a replacement grammar from the stored instance data, comprising determining, based at least in part on the stored instance data, a number of times at which at least one word or phrase was recognized in the first speech recognition operations, wherein the replacement grammar is a statistical language model grammar;
  
  selectively replacing the original speech recognition grammar in the speech recognition system with the replacement grammar; and
  
  utilizing the replacement grammar to perform second speech recognition operations comprising processing second audio data;
  
  generating additional instance data based on the second speech recognition operations;
  
  tuning the replacement grammar based on the additional instance data.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
  - 2. The method of claim 1, wherein the original speech recognition grammar is written in a grammar format specification language selected from a group of languages consisting of a NUANCE Grammar Specification Language (GSL), a Speech Recognition Grammar Specification (SRGS) compliant language, and a JAVA Speech Grammar Format (JSGF) compliant language.
  - 3. The method of claim 1, further comprising an act of:
    - comparing a performance of the original speech recognition grammar against the replacement grammar, wherein the act of selectively replacing the original speech recognition grammar is contingent upon results of the act of comparing, and wherein replacement of the original speech recognition grammar selectively occurs when the performance of the replacement grammar favorably compares to the performance of the original speech recognition grammar.
  - 4. The method of claim 3, wherein the replacement of the original speech recognition grammar is performed automatically and dynamically when the performance of the replacement grammar favorably compares to the performance of the original speech recognition grammar.
  - 5. The method of claim 3, further comprising an act of:
    - presenting an administrator of the speech recognition system with an option to replace the original speech recognition grammar with the replacement grammar, wherein the replacement of the original speech recognition grammar is contingent upon response provided for the option.
  - 6. The method of claim 1, wherein the speech recognition system comprises a plurality of grammars, which include said original speech recognition grammar.
  - 7. The method of claim 6, wherein the original speech recognition grammar and the replacement grammar are context dependent grammars.
  - 8. The method of claim 6, wherein the original speech recognition grammar and the replacement grammar are speaker dependent grammars.
  - 9. The method of claim 1, wherein the original speech recognition grammar and the replacement grammar are context-independent grammars.
  - 10. The method of claim 1, wherein said acts of claim 1 are performed by at least one machine in accordance with at least one computer program having a plurality of code sections that are executable by the at least one machine.
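
The counting step in claim 1 (determining how many times each word or phrase was recognized) can be illustrated with a minimal sketch. It assumes the stored instance data is simply a list of recognized text strings and that the statistical language model is a plain relative-frequency unigram model; the claims fix neither choice, and the function names `count_ngrams` and `build_unigram_model` are hypothetical.

```python
from collections import Counter


def count_ngrams(recognized_utterances, max_n=2):
    """Count every word and phrase (up to max_n words) in the logged results."""
    counts = Counter()
    for text in recognized_utterances:
        words = text.lower().split()
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    return counts


def build_unigram_model(counts):
    """Turn single-word counts into relative-frequency probabilities."""
    unigrams = {w: c for w, c in counts.items() if " " not in w}
    total = sum(unigrams.values())
    return {w: c / total for w, c in unigrams.items()}


# Hypothetical instance data: text recognized with the original grammar.
log = ["check balance", "check my balance", "transfer funds"]
counts = count_ngrams(log)
model = build_unigram_model(counts)
```

A production statistical language model would normally add smoothing and higher-order n-grams; the sketch only shows how recognition counts can drive the model's weights.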

11. A method comprising acts of:
- performing a first plurality of speech-to-text operations using an original speech recognition grammar in a speech recognition system, wherein the original speech recognition grammar is a grammar-based language model grammar, and wherein the first plurality of speech-to-text operations comprise using the original speech recognition grammar to process first audio data that represents speech utterances;
  
  recording recognition instance data generated based on the first plurality of speech-to-text operations performed using the original speech recognition grammar;
  
  automatically creating a set of words and phrases from the recorded recognition instance data;
  
  automatically generating a replacement grammar from the set of words and phrases, comprising determining, based at least in part on the recorded recognition instance data, a number of times at which at least one word or phrase of the set of words and phrases was recognized in the first plurality of speech-to-text operations, wherein the replacement grammar is a statistical language model grammar;
  
  generating additional instance data based on the second speech recognition operations;
  
  tuning the replacement grammar based on the additional instance data.
- Dependent claims (12, 13)
  - 12. The method of claim 11, further comprising an act of:
    - automatically weighing the words and phrases based at least in part on the determined number of times at which the at least one word or phrase was recognized in the first plurality of speech-to-text operations.
  - 13. The method of claim 11, further comprising acts of:
    - executing replacement speech-to-text operations based upon performance testing input using the replacement grammar;
      
      generating replacement performance metrics based upon the replacement speech-to-text operations;
      
      generating original performance metrics based upon the performance testing input using the original speech recognition grammar; and
      
      comparing the replacement performance metrics and the original performance metrics, wherein the act of replacing is contingent upon results of the act of comparing, wherein the act of replacing selectively occurs when the replacement performance metrics favorably compare to the original performance metrics.
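
The comparison in claim 13 can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the performance metric is taken to be exact-match accuracy over a labeled test set, each recognizer is modeled as a plain function from an audio identifier to text, and `accuracy`, `should_replace`, and `min_gain` are hypothetical names.

```python
def accuracy(recognize, test_cases):
    """Fraction of test inputs whose recognized text matches the expected text."""
    correct = sum(1 for audio, expected in test_cases if recognize(audio) == expected)
    return correct / len(test_cases)


def should_replace(original, replacement, test_cases, min_gain=0.0):
    """Return True when the replacement beats the original by more than min_gain."""
    original_score = accuracy(original, test_cases)
    replacement_score = accuracy(replacement, test_cases)
    return replacement_score > original_score + min_gain


# Hypothetical performance testing input: (audio id, expected transcript).
test_cases = [("a1", "check balance"), ("a2", "transfer funds"), ("a3", "pay bill")]

# Stand-ins for recognizers running with each grammar.
original_output = {"a1": "check balance", "a2": "transfer", "a3": "pay"}
replacement_output = {"a1": "check balance", "a2": "transfer funds", "a3": "pay"}


def original_recognizer(audio):
    return original_output[audio]


def replacement_recognizer(audio):
    return replacement_output[audio]


replace = should_replace(original_recognizer, replacement_recognizer, test_cases)
```

The `min_gain` margin is one way to read "favorably compare": setting it above zero avoids swapping grammars on a negligible improvement.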

14. A speech recognition system comprising:
- a language model processor configured to utilize an original speech recognition grammar in performing first speech recognition operations comprising using the original speech recognition grammar to process first audio data that represents speech utterances, the original speech recognition grammar being a grammar-based language model grammar;
  
  a log data store configured to store speech instance data generated based on the first speech recognition operations performed using the original speech recognition grammar;
  
  a statistical language model generator configured to automatically generate a replacement grammar from the speech instance data at least in part by determining, based at least in part on the speech instance data, a number of times at which at least one word or phrase was recognized in the first speech recognition operations; and
  
  a grammar swapper configured to selectively replace the original speech recognition grammar with the speech replacement grammar, wherein the language model processor is further configured to utilize the replacement grammar to perform second speech recognition operations comprising processing second audio data;
  
  generating additional instance data based on the second speech recognition operations;
  
  tuning the replacement grammar based on the additional instance data.
- Dependent claims (15, 16, 17)
  - 15. The system of claim 14, wherein the original speech recognition grammar is written in a grammar format specification language selected from a group of languages consisting of a NUANCE Grammar Specification Language (GSL), a Speech Recognition Grammar Specification (SRGS) compliant language, and a JAVA Speech Grammar Format (JSGF) compliant language.
  - 16. The system of claim 14, wherein the original speech recognition grammar and the replacement grammar are context dependent grammars.
  - 17. The system of claim 14, further comprising:
    - a performance analyzer configured to compare a performance of the original speech recognition grammar with a performance of the replacement grammar, wherein actions taken by the grammar swapper are contingent upon results of comparisons performed by the performance analyzer.
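
The components named in claims 14 and 17 (log data store, statistical language model generator, performance analyzer, grammar swapper) could be wired together roughly as below. The class names mirror the claim language, but every method, the dictionary representation of a grammar, and the toy scoring function are assumptions made for illustration only.

```python
from collections import Counter


class LogDataStore:
    """Stores recognized text produced while the original grammar is active."""

    def __init__(self):
        self.instances = []

    def store(self, recognized_text):
        self.instances.append(recognized_text)


class StatisticalLanguageModelGenerator:
    """Builds word probabilities from how often each word was recognized."""

    def generate(self, log):
        counts = Counter(word for text in log.instances for word in text.split())
        total = sum(counts.values())
        return {word: n / total for word, n in counts.items()}


class PerformanceAnalyzer:
    """Compares two grammars using a caller-supplied scoring function."""

    def __init__(self, score):
        self.score = score

    def favors_replacement(self, original, replacement):
        return self.score(replacement) > self.score(original)


class GrammarSwapper:
    """Holds the active grammar and swaps it only when the analyzer agrees."""

    def __init__(self, original):
        self.active = original

    def maybe_swap(self, replacement, analyzer):
        if analyzer.favors_replacement(self.active, replacement):
            self.active = replacement
        return self.active


log = LogDataStore()
for text in ["pay bill", "pay rent", "check balance"]:
    log.store(text)

replacement = StatisticalLanguageModelGenerator().generate(log)
original = {"pay": 0.5, "bill": 0.5}      # placeholder for the original grammar
analyzer = PerformanceAnalyzer(score=len)  # toy score: vocabulary size
swapper = GrammarSwapper(original)
active = swapper.maybe_swap(replacement, analyzer)
```

In a real deployment the scoring function would run recognition over test audio instead of measuring vocabulary size; it is kept trivial here so the control flow between the components stays visible.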

18. At least one computer readable recording non-transitory medium having encoded thereon instructions that, when executed by at least one processor, perform a speech processing method comprising acts of:
- utilizing an original speech recognition grammar in a speech recognition system to perform first speech recognition operations for a plurality of recognition instances, the original speech recognition grammar being a grammar-based language model grammar, the first speech recognition operations comprising using the original speech recognition grammar to process first audio data that represents speech utterances;
  
  storing instance data generated based on first speech recognition operations performed using the original speech recognition grammar;
  
  automatically generating a replacement grammar from the stored instance data, comprising determining, based at least in part on the stored instance data, a number of times at which at least one word or phrase was recognized in the first speech recognition operations, wherein the replacement grammar is a statistical language model grammar;
  
  selectively replacing the original speech recognition grammar in the speech recognition system with the replacement grammar; and
  
  utilizing the replacement grammar to perform second speech recognition operations comprising processing second audio data;
  
  generating additional instance data based on the second speech recognition operations;
  
  tuning the replacement grammar based on the additional instance data.
- Dependent claims (19, 20, 21)
  - 19. The at least one computer readable recording non-transitory medium of claim 18, wherein the act of automatically generating a replacement grammar comprises acts of:
    - automatically creating a set of words and phrases from the stored instance data; and
      
      automatically generating the replacement grammar from the set of words and phrases.
  - 20. The at least one computer readable recording non-transitory medium of claim 19, further comprising an act of:
    - automatically weighing the words and phrases based at least in part on the determined number of times at which the at least one word or phrase was recognized in the first speech recognition operations.
  - 21. The at least one computer readable recording non-transitory medium of claim 19, further comprising acts of:
    - executing replacement speech recognition operations based upon performance testing input using the replacement grammar;
      
      generating replacement performance metrics based upon the replacement speech recognition operations;
      
      generating original performance metrics based upon the performance testing input using the original speech recognition grammar; and
      
      comparing the replacement performance metrics and the original performance metrics, wherein the act of selectively replacing the original speech recognition grammar is contingent upon results of the act of comparing, wherein replacement of the original speech recognition grammar selectively occurs when the replacement performance metrics favorably compare to the original performance metrics.
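
The claims do not say how the replacement grammar is tuned with the additional instance data. One plausible reading, sketched here purely as an assumption, is to fold the newly logged recognitions into the existing word counts and recompute the probabilities; `tune` and `to_probabilities` are hypothetical helper names.

```python
from collections import Counter


def tune(counts, additional_utterances):
    """Return new counts that include words from the additional instance data."""
    updated = Counter(counts)  # copy, so the earlier counts stay untouched
    for text in additional_utterances:
        updated.update(text.lower().split())
    return updated


def to_probabilities(counts):
    """Convert raw counts into relative-frequency probabilities."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


# Counts gathered while the original grammar was in use (hypothetical values).
first_counts = Counter({"check": 2, "balance": 2, "transfer": 1})

# Text recognized later, while the replacement grammar was active.
tuned_counts = tune(first_counts, ["transfer funds", "transfer money"])
tuned_model = to_probabilities(tuned_counts)
```

Keeping raw counts alongside the probabilities makes each tuning pass a cheap incremental update instead of a rebuild from the full log.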

22. A speech processing method comprising acts of:
- utilizing an original speech recognition grammar in a speech recognition system to perform speech recognition operations for a plurality of recognition instances, the original speech recognition grammar being a grammar-based language model grammar, the speech recognition operations comprising using the original speech recognition grammar to process audio data that represents speech utterances and was not used in generating the original speech recognition grammar;
  
  storing instance data generated based on the speech recognition operations performed using the original speech recognition grammar;
  
  automatically generating a replacement grammar from the stored instance data, comprising determining, based at least in part on the stored instance data, a number of times at which at least one word or phrase was recognized in the speech recognition operations, wherein the replacement grammar is a statistical language model grammar;
  
  and selectively replacing the original speech recognition grammar in the speech recognition system with the replacement grammar;
  
  generating additional instance data based on the second speech recognition operations;
  
  tuning the replacement grammar based on the additional instance data.

23. A speech recognition system comprising:
- a language model processor configured to utilize an original speech recognition grammar in performing speech recognition operations comprising using the original speech recognition grammar to process audio data that represents speech utterances and was not used in generating the original speech recognition grammar, the original speech recognition grammar being a grammar-based language model grammar;
  
  a log data store configured to store speech instance data generated based on the speech recognition operations performed using the original speech recognition grammar;
  
  a statistical language model generator configured to automatically generate a replacement grammar from the speech instance data at least in part by determining, based at least in part on the speech instance data, a number of times at which at least one word or phrase was recognized in the speech recognition operations; and
  
  a grammar swapper configured to selectively replace the original speech recognition grammar with the speech replacement grammar;
  
  generating additional instance data based on the second speech recognition operations;
  
  tuning the replacement grammar based on the additional instance data.

24. At least one computer readable recording non-transitory medium having encoded thereon instructions that, when executed by at least one processor, perform a speech processing method comprising acts of:
- utilizing an original speech recognition grammar in a speech recognition system to perform speech recognition operations for a plurality of recognition instances, the original speech recognition grammar being a grammar-based language model grammar, the speech recognition operations comprising using the original speech recognition grammar to process audio data that represents speech utterances and was not used in generating the original speech recognition grammar;
  
  storing instance data generated based on the speech recognition operations performed using the original speech recognition grammar;
  
  automatically generating a replacement grammar from the stored instance data, comprising determining, based at least in part on the stored instance data, a number of times at which at least one word or phrase was recognized in the speech recognition operations, wherein the replacement grammar is a statistical language model grammar;
  
  and selectively replacing the original speech recognition grammar in the speech recognition system with the replacement grammar;
  
  generating additional instance data based on the second speech recognition operations;
  
  tuning the replacement grammar based on the additional instance data.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Metz, Brent D.
Primary Examiner(s): Smits, Talivaldis Ivars
Assistant Examiner(s): Kazeminezhad, Farzad
Application Number: US11/466,223
Publication Number: US 20080052076A1
Time in Patent Office: 2,324 Days
Field of Search: 704/9, 704/240, 704/257, 704/235, 704/244, 379/88.22
US Class Current: 704/257
CPC Class Codes:
G10L 15/183 using context dependencies,...
G10L 15/197 Probabilistic grammars, e.g...
