Adaptation of speech models in speech recognition
Abstract
A computer-based automatic speech recognition (ASR) system generates a sequence of text material used to train the ASR system. The system compares the sequence of text material to inputs corresponding to a user's speech utterances of that text material in order to update the speech models (e.g., phoneme templates) used during normal ASR processing. The ASR system is able to generate a user-dependent sequence of text material for adapting the speech models, where at least some of the text material is based on the evaluation of previous user utterances. In this way, the system can be trained more efficiently by concentrating on particular speech models that are more problematic than others for the particular user (or group of users).
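The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the prompt corpus, the per-phoneme score scale (1.0 meaning a close match to the speech model), the class and function names, and the averaging rule are all assumptions made for illustration. The sketch covers only how evaluation results might steer the choice of the next training texts; the step that actually modifies the speech models is omitted.

```python
from collections import defaultdict

# Hypothetical prompt corpus: each training text mapped to the phonemes it exercises.
PROMPTS = {
    "the thin thief": ["th", "ih", "n", "f"],
    "red lorry": ["r", "eh", "l", "ao"],
    "she sells shells": ["sh", "s", "eh", "l"],
    "three free throws": ["th", "r", "f", "ow"],
}


class PronunciationEvaluator:
    """Keeps a running mean match score per phoneme (assumed scale: 0.0 to 1.0)."""

    def __init__(self):
        self._total = defaultdict(float)
        self._count = defaultdict(int)

    def record(self, phoneme, score):
        """Store the score of one user utterance of this phoneme."""
        self._total[phoneme] += score
        self._count[phoneme] += 1

    def mean_score(self, phoneme, default=1.0):
        """Mean score so far; phonemes never heard are assumed unproblematic."""
        n = self._count[phoneme]
        return self._total[phoneme] / n if n else default


def select_prompts(evaluator, prompts, k=2):
    """Return the k texts whose phonemes have scored worst so far."""

    def difficulty(text):
        phonemes = prompts[text]
        return sum(1.0 - evaluator.mean_score(p) for p in phonemes) / len(phonemes)

    # Highest average difficulty first.
    return sorted(prompts, key=difficulty, reverse=True)[:k]


evaluator = PronunciationEvaluator()
evaluator.record("th", 0.3)  # this user's "th" matches its model poorly
evaluator.record("r", 0.5)
next_prompts = select_prompts(evaluator, PROMPTS, k=2)
```

With these example scores, the texts that exercise "th" and "r" are ranked ahead of the others, so the next training round concentrates on the models that are problematic for this user.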
20 Claims
1. A computer system comprising:
(a) a database of speech models;
(b) a speech recognition (SR) engine adapted to compare user utterances to the database of speech models to recognize the user utterances;
(c) an adaptation module adapted to modify the database of speech models based on a set of user utterances corresponding to a set of known inputs;
(d) a pronunciation evaluation module adapted to characterize user utterances relative to corresponding speech models in the database; and
(e) a sequence generator adapted to generate the set of known inputs used by the adaptation module to modify the database of speech models, wherein the sequence generator automatically selects at least a subset of the known inputs based on the characterization of previous user utterances by the pronunciation evaluation module.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer-based method for training a computer application having a speech recognition (SR) engine adapted to compare user utterances to a database of speech models to recognize the user utterances, the method comprising:
generating a set of known inputs;
modifying the database of speech models based on a set of user utterances corresponding to the set of known inputs; and
characterizing user utterances relative to corresponding speech models in the database, wherein at least a subset of the known inputs are automatically selected based on the characterization of previous user utterances.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A machine-readable medium, having encoded thereon program code, wherein, when the program code is executed by a machine, the machine implements a method for training a computer application having a speech recognition (SR) engine adapted to compare user utterances to a database of speech models to recognize the user utterances, the method comprising:
generating a set of known inputs;
modifying the database of speech models based on a set of user utterances corresponding to the set of known inputs; and
evaluating the user utterances, wherein at least a subset of the known inputs are automatically selected based on the evaluation of previous user utterances.
Dependent claims: 16, 17, 18, 19, 20
Specification