Creation and use of application-generic class-based statistical language models for automatic speech recognition

US 8,135,578 B2
Filed: 08/24/2007
Issued: 03/13/2012
Est. Priority Date: 08/24/2007
Status: Active Grant

1 Assignment

0 Petitions

Abstract

A method of creating an application-generic class-based SLM includes, for each of a plurality of speech applications, parsing a corpus of utterance transcriptions to produce a first output set, in which expressions identified in the corpus are replaced with corresponding grammar tags from a grammar that is specific to the application. The method further includes, for each of the plurality of speech applications, replacing each of the grammar tags in the first output set with a class identifier of an application-generic class, to produce a second output set. The method further includes processing the resulting second output sets with a statistical language model (SLM) trainer to generate an application-generic class-based SLM.
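
The parsing and relabeling steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the grammar table `BANKING_GRAMMAR`, the tag names, the sample phrases, and the class identifiers `<COMMAND>` and `<COLLECTION>` are all hypothetical, and simple phrase matching stands in for a real grammar parser.

```python
import re

# Hypothetical application-specific grammar: tag -> (grammar type, phrases).
# Tag names, phrases, and the two-type split are illustrative assumptions.
BANKING_GRAMMAR = {
    "<bank_main_menu>": ("command", ["check balance", "transfer funds"]),
    "<bank_account_number>": ("collection", ["one two three four"]),
}

# Hypothetical application-generic class identifiers, one per grammar type.
GENERIC_CLASS = {"command": "<COMMAND>", "collection": "<COLLECTION>"}


def parse(corpus, grammar):
    """Replace expressions found in each transcription with grammar tags."""
    # Try longer phrases first so a short phrase cannot split a longer one.
    pairs = sorted(
        ((phrase, tag) for tag, (_, phrases) in grammar.items() for phrase in phrases),
        key=lambda pair: len(pair[0]),
        reverse=True,
    )
    first_output = []
    for utterance in corpus:
        for phrase, tag in pairs:
            utterance = re.sub(rf"\b{re.escape(phrase)}\b", tag, utterance)
        first_output.append(utterance)
    return first_output


def relabel(first_output, grammar):
    """Replace each application-specific tag with its generic class identifier."""
    second_output = []
    for utterance in first_output:
        for tag, (grammar_type, _) in grammar.items():
            utterance = utterance.replace(tag, GENERIC_CLASS[grammar_type])
        second_output.append(utterance)
    return second_output


corpus = ["i want to check balance please", "my account is one two three four"]
first = parse(corpus, BANKING_GRAMMAR)    # application-specific grammar tags
second = relabel(first, BANKING_GRAMMAR)  # application-generic class identifiers
```

Running the same two functions over the corpus of each speech application, each with its own grammar table, would yield one second output set per application, all sharing the same small set of class identifiers.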

35 Citations

22 Claims

1. A method comprising:
- accessing a corpus of terms by a parser using a processor in each of a plurality of speech applications;
  
  parsing, using said parser and said processor, said corpus of terms in each speech application to produce a plurality of first output sets, in which expressions identified in the corpus are replaced with corresponding grammar tags from a grammar that is specific to the application, wherein said grammar tags are selected from among command grammar tags and collection grammar tags;
  
  accessing said plurality of first output sets by a class-relabeler and said processor;
  
  replacing by the class-relabeler and said processor, for each of the plurality of speech applications, each of the grammar tags in the plurality of first output sets with a class identifier of an application-generic class, to produce a plurality of second output sets;
  
  accessing said plurality of second output sets by a token selector and said processor;
  
  processing collectively, by said token selector and said processor, the plurality of second output sets or data derived from the output sets with a statistical language model (SLM) trainer; and
  
  generating, using said processor, an application-generic class-based SLM using a set of results from said SLM trainer.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
- - 2. A method as recited in claim 1, wherein the application-generic class-based SLM includes one or more of said class identifiers.
  - 3. A method as recited in claim 2, further comprising:
    - creating, using said processor, an application-specific class-based SLM for a target speech application by replacing each said class identifier in the application-generic class-based SLM with a pointer to an application-specific grammar for the target speech application.
  - 4. A method as recited in claim 3, wherein said application-specific grammar is a class of the SLM.
  - 5. A method as recited in claim 1, wherein said parsing comprises:
    - for each identified expression, identifying, using said processor, a type of grammar to which the expression corresponds; and
      
      selecting a grammar tag to replace the expression based on the identified type of grammar.
  - 6. A method as recited in claim 5, wherein said identifying a type of grammar comprises determining, using said processor, whether the expression corresponds to a command grammar or a collection grammar.
  - 7. A method as recited in claim 6, wherein said replacing each of the grammar tags in the first output set with a class identifier of an application-generic class comprises:
    - replacing, using said processor, a grammar tag with a first class identifier if the grammar tag is determined to correspond to a first type of grammar; and
      
      replacing, using said processor, a grammar tag with a second class identifier if the grammar tag is determined to correspond to a second type of grammar.
  - 8. A method as recited in claim 7, wherein the first type of grammar is a command grammar and the second type of grammar is a collection grammar.
  - 9. A method as recited in claim 1, further comprising:
    - prior to said processing, performing on the second output sets collectively at least one operation from the set of operations consisting of:
      
      balancing, using said processor, between the second output sets according to a size of the corpus of the corresponding speech applications;
      
      filtering, using said processor, the second output sets to remove expressions that are not present in the corpus of at least a predetermined subset of the plurality of speech applications; and
      
      assigning, using said processor, weights to tokens in the second output sets.
  - 10. A method as recited in claim 1, further comprising:
    - executing, using said processor, an automatic speech recognition (ASR) process to recognize speech represented in a stored set of audio data associated with a target speech application, by using an application-specific grammar for the target speech application in combination with the application-generic class-based SLM, to generate a set of recognition results.
  - 11. A method as recited in claim 10, further comprising:
    - processing, using said processor, at least a portion of the set of recognition results with an SLM trainer to generate a word-based SLM for use in ASR for the target speech application.
  - 12. A method as recited in claim 11, further comprising:
    - using the word-based SLM to perform ASR for the target speech application.
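
The three token-selection operations recited in claim 9 (balancing, filtering, and weighting) can be sketched together as follows. This is an illustrative assumption about how they might be combined, not the method fixed by the claims: the function name `select_tokens`, the `min_apps` threshold, the utterance-level granularity, and the choice of weighting each count by the inverse of its corpus size are all hypothetical.

```python
from collections import Counter


def select_tokens(second_output_sets, min_apps=2):
    """Pool relabeled output sets with balancing, filtering, and weighting.

    second_output_sets maps an application name to its list of relabeled
    utterances. Returns a Counter of weighted utterance counts.
    """
    # Filtering: keep only utterances seen in at least `min_apps` applications.
    apps_per_utterance = Counter()
    for utterances in second_output_sets.values():
        apps_per_utterance.update(set(utterances))
    shared = {u for u, n in apps_per_utterance.items() if n >= min_apps}

    # Balancing and weighting: scale each count by 1 / corpus size so that a
    # large application does not dominate the pooled training data.
    pooled = Counter()
    for utterances in second_output_sets.values():
        if not utterances:
            continue  # an empty corpus contributes nothing
        scale = 1.0 / len(utterances)
        for utterance, count in Counter(utterances).items():
            if utterance in shared:
                pooled[utterance] += count * scale
    return pooled


sets = {
    "banking": [
        "<COMMAND> please",
        "<COMMAND> please",
        "i need <COLLECTION>",
        "what is my routing number",
    ],
    "travel": ["<COMMAND> please", "i need <COLLECTION>"],
}
pooled = select_tokens(sets)
```

In this toy input, "what is my routing number" occurs in only one application and is dropped, while the two shared utterances receive weights that reflect their relative frequency within each corpus rather than raw counts.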

13. A method of creating a statistical language model (SLM) for automatic speech recognition (ASR), the method comprising:
- for each of a plurality of speech applications, parsing a corpus of utterance transcriptions, with a parser and a processor, from the application to produce a first output set, in which expressions identified in the corpus are replaced with corresponding grammar tags from a grammar that is specific to the application, wherein said grammar tags are selected from among command grammar tags and collection grammar tags, wherein said parsing includes:
  
  for each identified expression, identifying, with said parser and said processor, a type of grammar to which the expression corresponds, including determining whether the expression corresponds to a first grammar or a second grammar, and selecting, with said parser, a grammar tag to replace the expression based on the identified type of grammar;
  
  for each of the plurality of speech applications, replacing, with a class-relabeler and said processor, each of the grammar tags in the first output set with a class identifier of an application-generic class, to produce a second output set, including:
  
  replacing, with said class-relabeler and said processor, the grammar tag with a first class identifier if the grammar tag is determined to correspond to a grammar of the first type, and replacing, with said class-relabeler and said processor, the grammar tag with a second class identifier if the grammar tag is determined to correspond to a grammar of the second type;
  
  filtering, with a token selector and said processor, the second output sets collectively based on an algorithm to produce a third output set; and
  
  processing the third output set with an SLM trainer and said processor; and
  
  generating an application-generic class-based SLM for ASR using a set of results from said SLM trainer and said processor, wherein the application-generic class-based SLM includes one or more of said class identifiers.
- Dependent Claims (14, 15, 16, 17, 18, 19)
- - 14. A method as recited in claim 13, wherein the first type of grammar is a command grammar and the second type of grammar is a collection grammar.
  - 15. A method as recited in claim 14, wherein said filtering comprises at least one operation from the set of operations consisting of:
    - balancing, using said processor, between the second output sets according to a size of the corpus of the corresponding speech applications;
      
      filtering, using said processor, the second output sets to remove expressions that are not present in the corpus of at least a predetermined subset of the plurality of speech applications; and
      
      assigning, using said processor, weights to tokens in the second output sets.
  - 16. A method as recited in claim 13, further comprising:
    - creating, using said processor, an application-specific class-based SLM for ASR for a target application by replacing each said class identifier in the application-generic class-based SLM with a reference to an application-specific grammar for the target application.
  - 17. A method as recited in claim 16, wherein said application-specific grammar is a class of the SLM.
  - 18. A method as recited in claim 16, further comprising generating a word-based SLM for use in ASR, for the target speech application, by:
    - executing, using said processor, an ASR process to recognize speech represented in a stored set of audio data associated with the target speech application, by using said application-specific class-based SLM to generate a set of recognition results; and
      
      processing, using said processor, at least a portion of the set of recognition results with an SLM trainer to generate the word-based SLM for use in ASR, for the target speech application.
  - 19. A method as recited in claim 18, further comprising:
    - using the word-based SLM to perform ASR for the target speech application.
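
The claims do not specify the SLM trainer. As one possible stand-in, the sketch below estimates an unsmoothed bigram model by relative frequency over a filtered, weighted output set, treating each class identifier as an ordinary token. The function name `train_bigram_slm`, the sentence markers `<s>` and `</s>`, and the sample weights are assumptions for illustration; a production trainer would add smoothing and higher-order n-grams.

```python
from collections import Counter, defaultdict


def train_bigram_slm(weighted_utterances):
    """Estimate bigram probabilities P(next | previous) by relative frequency."""
    bigram_counts = defaultdict(Counter)
    for utterance, weight in weighted_utterances.items():
        # Hypothetical start and end markers delimit each utterance.
        tokens = ["<s>"] + utterance.split() + ["</s>"]
        for previous, current in zip(tokens, tokens[1:]):
            bigram_counts[previous][current] += weight

    slm = {}
    for previous, followers in bigram_counts.items():
        total = sum(followers.values())
        slm[previous] = {token: count / total for token, count in followers.items()}
    return slm


# Hypothetical third output set: relabeled utterances with selection weights.
third_output = {
    "<COMMAND> please": 1.0,
    "i need <COLLECTION>": 0.75,
    "i need <COMMAND>": 0.25,
}
slm = train_bigram_slm(third_output)
```

Because the class identifiers survive training as tokens, the resulting model carries probabilities such as P(`<COLLECTION>` | need) that do not depend on any single application's vocabulary.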

20. A method comprising:
- accessing, using a processor, a corpus of terms by a parser in each of a plurality of speech applications;
  
  parsing, using said parser and said processor, said corpus of terms in each speech application to produce a plurality of first output sets, in which expressions identified in the corpus are replaced with corresponding grammar tags from a grammar that is specific to the application, wherein said grammar tags are selected from among command grammar tags and collection grammar tags;
  
  accessing, using said processor, said plurality of first output sets by a class-relabeler;
  
  replacing, using said processor, by the class-relabeler, for each of the plurality of speech applications, each of the grammar tags in the plurality of first output sets with a class identifier of an application-generic class, to produce a plurality of second output sets;
  
  accessing, using said processor, said plurality of second output sets by a token selector; and
  
  processing, using said processor, collectively, by said token selector, the plurality of second output sets or data derived from the output sets with a statistical language model (SLM) trainer;
  
  generating, using said processor, an application-generic class-based SLM using a set of results from said SLM trainer; and
  
  creating, using said processor, an application-specific SLM for use in automatic speech recognition for a target speech application, by incorporating into the application-generic class-based SLM an application-specific grammar for the target speech application.
- Dependent Claims (21, 22)
- - 21. A method as recited in claim 20, wherein the application-specific grammar is a class of the SLM.
  - 22. A method as recited in claim 21, wherein creating an application-specific SLM comprises replacing, using said processor, a generic class identifier in the application-generic class-based SLM with a reference to an application-specific grammar for the target speech application.
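
The specialization step in claims 20 to 22, in which generic class identifiers are replaced with references to a target application's grammars, can be sketched as a simple substitution over a trained model. The nested-dictionary model layout, the function name `specialize`, and the `ref:` reference strings are hypothetical; the claims do not prescribe a model format or a reference syntax.

```python
def specialize(generic_slm, grammar_refs):
    """Return a copy of the SLM with class identifiers replaced by grammar references."""

    def swap(token):
        # Tokens that are not class identifiers pass through unchanged.
        return grammar_refs.get(token, token)

    return {
        swap(previous): {swap(token): p for token, p in followers.items()}
        for previous, followers in generic_slm.items()
    }


# Hypothetical application-generic bigram model: previous token -> {next: probability}.
generic_slm = {
    "<s>": {"<COMMAND>": 0.5, "i": 0.5},
    "need": {"<COLLECTION>": 0.75, "<COMMAND>": 0.25},
    "<COMMAND>": {"please": 0.8, "</s>": 0.2},
}

# Hypothetical grammar references for a target travel application.
travel_refs = {
    "<COMMAND>": "ref:travel_commands",
    "<COLLECTION>": "ref:travel_cities",
}
travel_slm = specialize(generic_slm, travel_refs)
```

The generic model is left unmodified, so the same model could be specialized again for a different target application by supplying another reference table.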

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Hébert, Matthieu
Primary Examiner(s): COLUCCI, MICHAEL C
Application Number: US11/845,015
Publication Number: US 20090055184A1
Time in Patent Office: 1,663 Days
Field of Search: 704/9, 704/1, 704/10, 704/231, 704/244, 704/257, 704/270.1
US Class Current: 704/9
CPC Class Codes:
G06F 40/205   Parsing
G10L 15/1815   Semantic context, e.g. disa...
G10L 15/183   using context dependencies,...
G10L 15/193   Formal grammars, e.g. finit...
G10L 15/197   Probabilistic grammars, e.g...
G10L 2015/228   of application context
