In-the-field adaptation of a large vocabulary automatic speech recognizer (ASR)
Abstract
A technique for improving the recognition accuracy of a speech recognizer includes deploying the speech recognizer, wherein live input data is received by the recognizer as an input for a given speaker-independent adaptation algorithm associated with the speech recognizer. The algorithm enhances the accuracy of the speech recognizer without human supervision. This technique is particularly suitable for adapting a large vocabulary ASR engine.
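The deploy, collect, adapt, and redeploy cycle described in the abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: the `Recognizer` class, its `recognize` and `adapt` methods, the `run_in_field` helper, and the batch size are hypothetical names introduced here. The toy "model" is just a word-frequency table that is updated from the recognizer's own output, so no human transcription is involved.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Recognizer:
    """Toy stand-in for an ASR engine (hypothetical, for illustration only)."""
    word_counts: Counter = field(default_factory=Counter)
    version: int = 0

    def recognize(self, utterance: str) -> str:
        # A real engine would decode audio; here the "live input" is already text.
        return utterance.lower().strip()

    def adapt(self, hypotheses: List[str]) -> "Recognizer":
        # Unsupervised: the adaptation data are the recognizer's own hypotheses.
        updated = Counter(self.word_counts)
        for hyp in hypotheses:
            updated.update(hyp.split())
        return Recognizer(word_counts=updated, version=self.version + 1)


def run_in_field(recognizer, live_inputs, batch_size=2):
    """Collect live inputs, adapt after each full batch, and redeploy."""
    collected = []
    for utterance in live_inputs:
        collected.append(recognizer.recognize(utterance))
        if len(collected) == batch_size:
            # Replace the deployed engine with its adapted successor.
            recognizer = recognizer.adapt(collected)
            collected = []
    return recognizer


adapted = run_in_field(
    Recognizer(), ["Call home", "call the office", "Check balance"]
)
```

In this sketch the third utterance is recognized but not yet used for adaptation, because a full batch has not accumulated; a deployment could equally adapt after every utterance or over a longer collection period.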
16 Claims
1. A method of improving the recognition accuracy of a speech recognizer comprising the steps of:
deploying the speech recognizer in an environment to receive live input data;
receiving the live input data and an original speech signal;
without supervision, selecting at least one adaptation algorithm from a plurality of adaptation algorithms, and applying the selected adaptation algorithm to the received live input data, said live input data and original speech signal being in the form of speech data required for executing the adaptation algorithm, as it is being recognized to improve at least one application-specific feature for the recognition accuracy of the speech recognizer; and
redeploying the adapted speech recognizer in the target environment.
2. The method as described in claim 1 wherein the live input data includes digitally-encoded speech waveform samples.
3. The method as described in claim 1 wherein the live input data includes a processed version of given speech waveform samples, wherein the processed version is not capable of being recognized by a human listener yet is sufficient for use as input to the given adaptation algorithm.
4. The method as described in claim 1 wherein the live input data and associated recognition responses are collected over a given time period.
5. The method as described in claim 1 wherein the adaptation algorithm is based on an acoustic model.
6. The method as described in claim 5 wherein the acoustic model is a Hidden Markov Model.
7. The method as described in claim 1 wherein the adaptation algorithm is based on a language model.
8. The method as described in claim 7 wherein the language model comprises Word Bigram Statistics.
9. The method as described in claim 1 wherein the adaptation algorithm is based on a pronunciation model.
10. The method as described in claim 9 wherein the pronunciation model is encoded in a phonetic transcription lexicon.
11. The method as described in claim 1 wherein the adaptation algorithm is based on search parameters of a recognition algorithm of the speech recognizer.
12. The method as described in claim 11 wherein the speaker-independent adaptation algorithm is selected from the group of models consisting essentially of acoustic models, language models, pronunciation models, search parameters, and combinations thereof.
13. The method as described in claim 1 wherein the adaptation algorithm is based on a combination of models selected from the group consisting essentially of acoustic models, language models, pronunciation models, and search parameters of a recognition algorithm of the speech recognizer.
14. The method as described in claim 1 wherein the adaptation is applied as live input data is collected and recognition responses to that live input data are generated.
15. A method of improving the recognition accuracy of a speech recognizer deployed in an environment to receive live input data, comprising the steps of:
receiving live input data and an original speech signal; and
without supervision, selecting at least one adaptation algorithm from a plurality of adaptation algorithms, and applying a given speaker-independent adaptation algorithm to the received live input data, said live input data and original speech signal being in a form of speech data required for executing the adaptation algorithm, as it is being recognized to improve the recognition accuracy of the speech recognizer.
16. The method of claim 1, wherein the at least one application-specific feature is selected from the group consisting of channel characteristics, dialects, pronunciation idiosyncrasies and speaking style.
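Claims 7 and 8 name a language model comprising word bigram statistics as one possible adaptation target. The sketch below shows one way such statistics might be updated from recognized in-field utterances without supervision. The claims do not specify an update rule, so the weighted count merging used here is an assumption, as are the function names `bigram_counts`, `adapt_bigrams`, and `bigram_prob`, the `field_weight` value, and the example sentences.

```python
from collections import Counter, defaultdict


def bigram_counts(sentences):
    """Count word bigrams, with <s> and </s> marking sentence boundaries."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def adapt_bigrams(base, field_counts, field_weight=2.0):
    """Merge counts; field_weight (an assumed value) scales in-field evidence."""
    merged = defaultdict(Counter)
    for source, weight in ((base, 1.0), (field_counts, field_weight)):
        for prev, followers in source.items():
            for nxt, n in followers.items():
                merged[prev][nxt] += weight * n
    return merged


def bigram_prob(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 if prev is unseen."""
    total = sum(counts[prev].values()) if prev in counts else 0
    return counts[prev][nxt] / total if total else 0.0


# Base model from generic text; field data from the recognizer's own output.
base = bigram_counts(["check my balance", "check my mail"])
field = bigram_counts(["check my account", "check my account"])
adapted = adapt_bigrams(base, field)

before = bigram_prob(base, "my", "account")
after = bigram_prob(adapted, "my", "account")
```

Here the word pair "my account" is unseen in the base statistics but gains probability mass after adaptation, which is the kind of application-specific shift the claims describe. A practical system would also smooth the estimates and guard against reinforcing recognition errors.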
Specification