Adaptation of language models and context free grammar in speech recognition
Abstract
An architecture is disclosed for minimizing an empirical error rate by discriminative adaptation of a statistical language model in a dictation and/or dialog application. The architecture allows assignment of an improved weighting value to each term or phrase to reduce empirical error. Empirical errors are minimized, whether or not a user provides correction results, based on criteria for discriminatively adapting the user language model (LM)/context-free grammar (CFG) to the target. Moreover, algorithms are provided for the training and adaptation processes of LM/CFG parameters for criteria optimization.
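As an illustration of the idea in the abstract, the following is a minimal sketch of an empirical error rate accumulated over training samples and of a per-term weight update that reduces it. It is not the algorithm disclosed in the patent: the update rule is a generic perceptron-style step swapped in for illustration, and the names `lm_score`, `recognize`, `empirical_error_rate`, `adapt`, the sample data, and the acoustic scores are all assumptions.

```python
from collections import Counter

# Hypothetical toy model: the "language model" is a dict of per-term weights.
# Each sample is (candidates, reference); each candidate is (words, acoustic_score).

def lm_score(weights, words):
    """Sum of per-term weights; terms without a weight contribute 0."""
    return sum(weights.get(w, 0.0) for w in words)

def recognize(weights, candidates):
    """Return the word sequence with the highest acoustic + LM score."""
    return max(candidates, key=lambda c: c[1] + lm_score(weights, c[0]))[0]

def empirical_error_rate(weights, samples):
    """Fraction of samples whose recognized result differs from the reference."""
    errors = sum(recognize(weights, cands) != ref for cands, ref in samples)
    return errors / len(samples)

def adapt(weights, samples, step=1.0, epochs=5):
    """Perceptron-style update (an assumption, not the patent's method):
    raise weights of terms in the corrected result, lower weights of terms
    in the misrecognized result."""
    weights = dict(weights)
    for _ in range(epochs):
        for cands, ref in samples:
            hyp = recognize(weights, cands)
            if hyp == ref:
                continue
            delta = Counter(ref)
            delta.subtract(Counter(hyp))
            for term, d in delta.items():
                weights[term] = weights.get(term, 0.0) + step * d
    return weights

samples = [
    ([(("wreck", "a", "nice", "beach"), -1.0), (("recognize", "speech"), -1.2)],
     ("recognize", "speech")),
    ([(("their", "model"), -0.5), (("there", "model"), -0.4)],
     ("their", "model")),
]

before = empirical_error_rate({}, samples)
adapted = adapt({}, samples)
after = empirical_error_rate(adapted, samples)
```

On this toy data both samples are misrecognized before adaptation and both are recognized correctly afterwards, which is the sense in which the weights are adapted to minimize the error accumulated over the training set.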
18 Claims
1. A computer-implemented system that facilitates speech recognition, comprising:
- a recognition component using a computer system for generating a recognized result based on an input phrase, the recognition component using a statistical language model (SLM) having an empirical error rate defined as an error rate accumulated over a set of training samples collected from at least one of actual speech input or generated pseudo samples from existing models, in order to reflect an ability of the SLM to differentiate different terms during recognition;
- an interaction component for interacting with the recognized result to create a corrected result;
- an adaptation component for receiving the recognized result and the corrected result and discriminatively adapting the SLM to the corrected result based on criteria to minimize the empirical error rate defined over a training corpus as an objective function, wherein the adaptation component facilitates discriminative adaptation and training of context-free grammars (CFG) to optimize the criteria; and
- a processor that executes computer-executable instructions associated with at least one of the recognition component, the interaction component, or the adaptation component.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A computer-implemented method of processing speech, comprising acts of:
- processing input terms using a computer system into first-time recognized results using a recognizer and statistical language model having an empirical error rate defined as an error rate accumulated over a set of training samples collected from at least one of actual speech input or generated pseudo samples from existing models, in order to reflect an ability of the statistical language model to differentiate different terms during recognition;
- generating user-corrected results based on the first-time recognized results;
- optimizing CFG weights to minimize the empirical error rate;
- discriminatively adapting the statistical language model to the user-corrected results based on criteria to minimize the empirical error rate defined over a training corpus as an objective function;
- generating new language model scores based on the user-corrected results and the first-time recognized results;
- inputting the new language model scores to a recognizer to process additional input terms; and
- utilizing a processor that executes instructions stored in memory to perform at least one of the acts of processing, generating, optimizing, discriminatively adapting, or inputting.
Dependent claims: 11, 12, 13, 14, 15, 16, 17
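The flow of acts in claim 10 (recognize, collect a user correction, generate new language model scores, feed them back to the recognizer) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions: the bigram score table, the fixed-step update, and the names `Recognizer`, `bigrams`, and `new_lm_scores` are hypothetical and are not taken from the patent, and the CFG weight optimization act is not modeled.

```python
def bigrams(words):
    """Bigrams of a word sequence, padded with sentence-boundary markers."""
    padded = ("<s>",) + tuple(words) + ("</s>",)
    return list(zip(padded, padded[1:]))

class Recognizer:
    """Hypothetical recognizer: picks the n-best entry with the highest
    acoustic score plus summed bigram LM scores (unseen bigrams score 0)."""

    def __init__(self, lm_scores=None):
        self.lm_scores = dict(lm_scores or {})

    def total(self, words, acoustic):
        return acoustic + sum(self.lm_scores.get(b, 0.0) for b in bigrams(words))

    def recognize(self, nbest):
        return max(nbest, key=lambda h: self.total(*h))[0]

def new_lm_scores(lm_scores, recognized, corrected, step=0.5):
    """Assumed update rule: reward bigrams of the user-corrected result and
    penalize bigrams of the first-time recognized result."""
    updated = dict(lm_scores)
    if tuple(recognized) == tuple(corrected):
        return updated
    for b in bigrams(corrected):
        updated[b] = updated.get(b, 0.0) + step
    for b in bigrams(recognized):
        updated[b] = updated.get(b, 0.0) - step
    return updated

# First-time recognition of an input with two competing hypotheses.
rec = Recognizer()
nbest = [(("send", "male"), -1.0), (("send", "mail"), -1.1)]
first = rec.recognize(nbest)

# The user corrects the result; new scores are fed back to the recognizer.
corrected = ("send", "mail")
rec.lm_scores = new_lm_scores(rec.lm_scores, first, corrected)

# Additional input terms are processed with the adapted scores.
second = rec.recognize(nbest)
nbest2 = [(("read", "male"), -1.0), (("read", "mail"), -1.3)]
third = rec.recognize(nbest2)
```

In this toy run the correction also changes the outcome for the additional input, because the adapted score for the bigram ending in "mail" is shared across utterances.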
18. A computer-implemented system, comprising:
- computer-implemented means for processing input terms into first-time recognized results using a statistical language model having an empirical error rate defined as an error rate accumulated over a set of training samples collected from at least one of actual speech input or generated pseudo samples from existing models, in order to reflect an ability of the statistical language model to differentiate different terms during recognition;
- computer-implemented means for generating user-corrected results based on the first-time recognized results;
- computer-implemented means for discriminatively training or adapting CFG parameters in a dialog application;
- computer-implemented means for discriminatively adapting the statistical language model to the user-corrected results based on criteria to minimize the empirical error rate defined over a training corpus as an objective function;
- computer-implemented means for generating new language model scores based on the user-corrected results and the first-time recognized results;
- computer-implemented means for inputting the new language model scores back to a recognizer to process additional input terms; and
- processor means that executes computer-executable instructions associated with at least one of the means for processing, generating, discriminatively adapting, or inputting.
Specification