Creation and Use of Application-Generic Class-Based Statistical Language Models for Automatic Speech Recognition
Abstract
A method of creating an application-generic class-based SLM includes, for each of a plurality of speech applications, parsing a corpus of utterance transcriptions to produce a first output set, in which expressions identified in the corpus are replaced with corresponding grammar tags from a grammar that is specific to the application. The method further includes, for each of the plurality of speech applications, replacing each of the grammar tags in the first output set with a class identifier of an application-generic class, to produce a second output set. The method further includes processing the resulting second output sets with a statistical language model (SLM) trainer to generate an application-generic class-based SLM.
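The three steps in the abstract (tagging with application-specific grammars, mapping tags to generic classes, and pooled training) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the two toy applications, the regex-based "grammars", the tag and class names, and the maximum-likelihood bigram counter that stands in for an SLM trainer.

```python
import re
from collections import Counter

# Hypothetical application-specific grammars: tag -> regex for the expressions it covers.
APP_GRAMMARS = {
    "airline": {"AIRLINE_CITY": r"\b(?:boston|denver|chicago)\b"},
    "banking": {"BANK_AMOUNT": r"\b(?:ten|twenty|fifty) dollars\b"},
}

# Hypothetical mapping from application-specific tags to application-generic classes.
TAG_TO_CLASS = {"AIRLINE_CITY": "_CITY_", "BANK_AMOUNT": "_AMOUNT_"}


def tag_corpus(corpus, grammar):
    """First output set: replace matched expressions with grammar tags."""
    tagged = []
    for utterance in corpus:
        for tag, pattern in grammar.items():
            utterance = re.sub(pattern, tag, utterance)
        tagged.append(utterance)
    return tagged


def generalize(tagged, tag_to_class):
    """Second output set: replace each grammar tag with a generic class identifier."""
    return [
        " ".join(tag_to_class.get(tok, tok) for tok in utt.split())
        for utt in tagged
    ]


def train_bigram_slm(sentences):
    """Toy stand-in for an SLM trainer: maximum-likelihood bigram probabilities."""
    bigrams, contexts = Counter(), Counter()
    for sentence in sentences:
        toks = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(toks, toks[1:]):
            bigrams[(prev, nxt)] += 1
            contexts[prev] += 1
    return {bg: n / contexts[bg[0]] for bg, n in bigrams.items()}


corpora = {
    "airline": ["i want to fly to boston", "i want to fly to denver"],
    "banking": ["i want to transfer fifty dollars"],
}

# Pool the second output sets of all applications before training.
pooled = []
for app, corpus in corpora.items():
    first_output = tag_corpus(corpus, APP_GRAMMARS[app])
    pooled.extend(generalize(first_output, TAG_TO_CLASS))

generic_slm = train_bigram_slm(pooled)
```

Because training runs on the pooled, class-tagged text, the resulting model contains entries such as `("to", "_CITY_")` rather than entries for individual city names, which is what makes it reusable across applications.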
25 Claims
1. A method comprising:
for each of a plurality of speech applications, parsing a corpus of terms to produce a first output set, in which expressions identified in the corpus are replaced with corresponding grammar tags from a grammar that is specific to the application;
for each of the plurality of speech applications, replacing each of the grammar tags in the first output set with a class identifier of an application-generic class, to produce a second output set; and
processing collectively the second output sets or data derived from the output sets with a statistical language model (SLM) trainer to generate an application-generic class-based SLM.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A method of creating a statistical language model (SLM) for automatic speech recognition (ASR), the method comprising:
for each of a plurality of speech applications, parsing a corpus of utterance transcriptions from the application to produce a first output set, in which expressions identified in the corpus are replaced with corresponding grammar tags from a grammar that is specific to the application, wherein said parsing includes, for each identified expression, identifying a type of grammar to which the expression corresponds, including determining whether the expression corresponds to a first grammar or a second grammar, and selecting a grammar tag to replace the expression based on the identified type of grammar;
for each of the plurality of speech applications, replacing each of the grammar tags in the first output set with a class identifier of an application-generic class, to produce a second output set, including replacing the grammar tag with a first class identifier if the grammar tag is determined to correspond to a grammar of the first type, and replacing the grammar tag with a second class identifier if the grammar tag is determined to correspond to a grammar of the second type;
filtering the second output sets collectively based on an algorithm to produce a third output set; and
processing the third output set with an SLM trainer to generate an application-generic class-based SLM for ASR, wherein the application-generic class-based SLM includes one or more of said class identifiers.
Dependent claims: 14, 15, 16, 17, 18, 19
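Claim 13 adds two things to the basic pipeline: the class identifier is chosen from the type of grammar that matched, and the pooled second output sets are filtered before training. The sketch below is one possible reading, not the claimed implementation. The grammar types, tag names, and class names are hypothetical, and because the claim does not say which filtering algorithm is used, a simple minimum-frequency filter is assumed purely for illustration.

```python
import re
from collections import Counter

# Hypothetical grammars for one application, each labelled with a grammar type.
GRAMMARS = [
    # (application-specific tag, grammar type, pattern)
    ("PIZZA_YESNO", "confirmation", re.compile(r"\b(?:yes|yeah|no|nope)\b")),
    ("PIZZA_QTY", "number", re.compile(r"\b(?:one|two|three)\b")),
]

# One generic class identifier per grammar type (illustrative names).
TYPE_TO_CLASS = {"confirmation": "_CONFIRM_", "number": "_NUMBER_"}
TAG_TO_TYPE = {tag: gtype for tag, gtype, _ in GRAMMARS}


def parse(utterance):
    """First output set: the tag is selected by whichever grammar matched."""
    for tag, _gtype, pattern in GRAMMARS:
        utterance = pattern.sub(tag, utterance)
    return utterance


def to_generic(tagged):
    """Second output set: the class identifier depends on the tag's grammar type."""
    out = []
    for tok in tagged.split():
        gtype = TAG_TO_TYPE.get(tok)
        out.append(TYPE_TO_CLASS[gtype] if gtype else tok)
    return " ".join(out)


def filter_pooled(sentences, min_count=2):
    """Third output set. ASSUMPTION: keep sentences seen at least min_count times."""
    counts = Counter(sentences)
    return [s for s in sentences if counts[s] >= min_count]


transcriptions = ["yes two pizzas", "no three pizzas", "nope", "uh what"]
second_output = [to_generic(parse(u)) for u in transcriptions]
third_output = filter_pooled(second_output)
```

The third output set would then be passed to an SLM trainer in the same way as the pooled text in any class-based training setup.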
20. A method comprising:
creating an application-generic class-based statistical language model (SLM); and
creating an application-specific SLM for use in automatic speech recognition for a target speech application, by incorporating into the application-generic class-based SLM an application-specific grammar for the target speech application.
Dependent claims: 21, 22
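One way to picture "incorporating" an application-specific grammar into a class-based SLM is to expand each class identifier into the grammar's entries. The sketch below assumes a bigram model stored as a dictionary, treats each grammar entry as a single token, and spreads a class's probability uniformly over its entries. All of these are simplifying assumptions for illustration; the claim does not specify them, and a real recognizer might instead reference the grammar at run time rather than expanding it.

```python
def expand(token, class_grammars):
    """Return (word, share) pairs: grammar entries for a class, else the token itself."""
    members = class_grammars.get(token)
    if members is None:
        return [(token, 1.0)]
    # ASSUMPTION: uniform distribution over the grammar's entries.
    return [(member, 1.0 / len(members)) for member in members]


def specialize(class_slm, class_grammars):
    """Build word-level bigram probabilities from a class-based bigram table."""
    word_slm = {}
    for (prev, nxt), prob in class_slm.items():
        # A class in the history position is replicated for every entry.
        for prev_word, _ in expand(prev, class_grammars):
            # A class in the predicted position splits its probability mass.
            for next_word, share in expand(nxt, class_grammars):
                key = (prev_word, next_word)
                word_slm[key] = word_slm.get(key, 0.0) + prob * share
    return word_slm


# Hypothetical generic model and target-application grammar.
generic_slm = {
    ("to", "_CITY_"): 0.4,
    ("to", "fly"): 0.6,
    ("_CITY_", "</s>"): 1.0,
}
travel_grammar = {"_CITY_": ["paris", "rome"]}

app_slm = specialize(generic_slm, travel_grammar)
```

In this toy case the probabilities following "to" still sum to one after expansion, because the mass assigned to `_CITY_` is only redistributed among the grammar's entries.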
23. A method comprising:
inputting a set of audio data associated with a target speech application;
executing an automatic speech recognition (ASR) process to recognize speech represented in the set of audio data, by using an application-generic class-based statistical language model (SLM) in combination with an application-specific grammar for the target speech application, to generate a set of recognition results; and
processing at least a portion of the set of recognition results with an SLM trainer to generate a word-based SLM for the target speech application.
Dependent claims: 24
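The flow of claim 23 (recognize, keep a portion of the results, train a word-based model) can be sketched as follows. The `recognize` function is a hypothetical stub that returns canned text and confidence scores in place of a real ASR pass, and selecting "a portion" by a confidence threshold is an assumption, since the claim does not say how the portion is chosen. The "trainer" here only collects word bigram counts.

```python
from collections import Counter

# Canned results standing in for ASR output: audio id -> (text, confidence).
_CANNED_RESULTS = {
    "a1": ("i want to fly to paris", 0.93),
    "a2": ("i want two fly", 0.41),
    "a3": ("fly to rome", 0.88),
}


def recognize(audio_id):
    """Hypothetical stub for ASR using the generic class-based SLM plus a grammar."""
    return _CANNED_RESULTS[audio_id]


def select_portion(results, min_confidence=0.8):
    """ASSUMPTION: keep only results at or above a confidence threshold."""
    return [text for text, confidence in results if confidence >= min_confidence]


def count_word_bigrams(sentences):
    """Toy word-based trainer: raw bigram counts over the selected results."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        counts.update(zip(words, words[1:]))
    return counts


results = [recognize(audio_id) for audio_id in ["a1", "a2", "a3"]]
kept = select_portion(results)
word_counts = count_word_bigrams(kept)
```

Unlike the generic model, the counts here are over actual words such as "paris" and "rome", reflecting the target application's own vocabulary.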
25. An automatic speech recognition system comprising:
an application-generic class-based statistical language model (SLM); and
an automatic speech recognizer to recognize input speech represented in a set of audio data associated with a speech application, by using the application-generic class-based statistical language model (SLM) in combination with an application-specific grammar for the speech application, to generate a set of recognition results, wherein the application-specific grammar is a class, and wherein the application-generic class-based SLM includes a generic class identifier to indicate to the automatic speech recognizer where to apply the application-specific grammar in the SLM.
Specification