Retraining and updating speech models for speech recognition
Abstract
A technique is provided for updating speech models for speech recognition by identifying, from a class of users, speech data for a predetermined set of utterances that differ from a set of stored speech models by at least a predetermined amount. The identified speech data for similar utterances from the class of users is collected and used to correct the set of stored speech models. As a result, the corrected speech models are a closer match to the utterances than the stored speech models were. The set of speech models is subsequently updated with the corrected speech models to provide improved speech recognition of utterances from the class of users. For example, the corrected speech models may be processed and stored at a central database and returned, via a suitable communications channel (e.g., the Internet), to individual user sites to update the speech recognition apparatus at those sites.
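The identify, collect, correct, and update steps summarized in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and does not come from the patent: each speech model is reduced to a single feature vector, the difference measure is Euclidean distance, and the correction simply replaces a model with the mean of the collected utterances. The name `update_models` and its parameters are hypothetical.

```python
import math
from collections import defaultdict
from statistics import fmean


def update_models(models, utterances, threshold):
    """Return a corrected copy of `models`.

    models:     dict mapping an utterance label to a feature vector
                (assumed stand-in for a stored speech model)
    utterances: list of (label, feature vector) pairs from one class of users
    threshold:  the "predetermined amount" of difference (assumed Euclidean)
    """
    # Identify and collect: keep only utterances that differ from the
    # stored model by at least the threshold, grouped by label.
    collected = defaultdict(list)
    for label, features in utterances:
        if math.dist(features, models[label]) >= threshold:
            collected[label].append(features)

    # Correct: replace each affected model with the per-dimension mean of
    # the collected utterances (an assumed correction rule).
    corrected = dict(models)
    for label, samples in collected.items():
        corrected[label] = [fmean(dim) for dim in zip(*samples)]

    # Update: the caller uses the returned dict for later recognition.
    return corrected
```

The input models are left unmodified, so the caller decides when the corrected set replaces the stored set.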
159 Citations
44 Claims
1. A method of updating speech models for speech recognition, comprising the steps of:
identifying speech data for a predetermined set of utterances from a class of users, said utterances differing from a predetermined set of stored speech models by at least a predetermined amount;
collecting said identified speech data for similar utterances from said class of users;
correcting said predetermined set of stored speech models as a function of the collected speech data so that the corrected speech models are an improved match to said utterances than said predetermined set of stored speech models; and
updating said predetermined set of speech models with said corrected speech models for subsequent speech recognition of utterances from said class of users.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method of building speech models for recognizing speech of users of a particular class, comprising the steps of:
registering users in accordance with predetermined criteria that characterize the speech of said particular class of users;
collecting a set of registration utterances from a user;
determining a best match of each said utterance to a stored speech model;
collecting utterances from users of said particular class that differ from said stored, best match speech model by at least a predetermined amount; and
retraining said stored speech model to reduce to less than said predetermined amount, the difference between the retrained speech model and said identified utterances from said users of said particular class.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 25, 26
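The best-match and retraining steps of claim 9 can be sketched as follows. This is only an illustration under stated assumptions: models and utterances are plain feature vectors, difference is Euclidean distance, and the difference between a model and the identified utterances is measured against their centroid. The functions `best_match` and `retrain`, and the `rate` and `max_steps` parameters, are hypothetical names that the claim does not mention.

```python
import math


def best_match(utterance, models):
    """Return the name of the stored model closest to the utterance."""
    return min(models, key=lambda name: math.dist(utterance, models[name]))


def retrain(model, outliers, threshold, rate=0.5, max_steps=100):
    """Move `model` toward the centroid of the outlying utterances until
    the remaining difference is below `threshold` (or max_steps is hit)."""
    # Centroid of the utterances that differed by at least the threshold.
    centroid = [sum(dim) / len(outliers) for dim in zip(*outliers)]
    current = list(model)
    for _ in range(max_steps):
        if math.dist(current, centroid) < threshold:
            break
        # Step a fraction `rate` of the way toward the centroid.
        current = [c + rate * (t - c) for c, t in zip(current, centroid)]
    return current
```

Stepping gradually, rather than jumping straight to the centroid, is a design choice of this sketch; it keeps the retrained model as close to the original as the threshold allows.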
23. A method of creating speech models for speech recognition, comprising the steps of:
registering users in accordance with predetermined criteria that characterize the speech of a particular class of users;
generating digital representations of utterances from said users;
collecting from said particular class of users those digital representations of similar utterances that differ by at least a predetermined amount from a set of stored speech models that are determined to be a best match to said utterances, and collecting corrections to said set of stored speech models that reduce the differences between an utterance and said set of models to a minimum;
building a set of updated speech models based on said collected corrections when the number of utterances that differ from said stored best match set of speech models by at least said predetermined amount, exceeds a threshold; and
using said set of updated speech models as said stored set of speech models for further speech recognition.
Dependent claims: 27, 31, 35, 39, 43
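Claim 23 rebuilds the models only once the number of poorly matched utterances exceeds a threshold. A minimal sketch of that count-triggered rebuild is given below. It assumes vector-valued models, Euclidean distance, and a correction defined as the offset that would move the best-match model exactly onto the utterance; none of these specifics are stated in the claim. The class `CorrectionCollector` and its attribute names are hypothetical.

```python
import math


class CorrectionCollector:
    def __init__(self, models, diff_threshold, count_threshold):
        self.models = dict(models)              # stored speech models
        self.diff_threshold = diff_threshold    # "predetermined amount"
        self.count_threshold = count_threshold  # rebuild trigger
        self.corrections = {}                   # model name -> list of offsets

    def observe(self, utterance):
        """Record a correction if the utterance is a poor match.
        Returns True when this observation triggered a rebuild."""
        name = min(self.models,
                   key=lambda n: math.dist(utterance, self.models[n]))
        if math.dist(utterance, self.models[name]) >= self.diff_threshold:
            # Offset that reduces the difference for this utterance to zero.
            delta = [u - m for u, m in zip(utterance, self.models[name])]
            self.corrections.setdefault(name, []).append(delta)
        pending = sum(len(v) for v in self.corrections.values())
        if pending > self.count_threshold:
            self._rebuild()
            return True
        return False

    def _rebuild(self):
        # Apply the mean collected offset to each affected model, then
        # start collecting afresh against the updated models.
        for name, deltas in self.corrections.items():
            mean_delta = [sum(d) / len(deltas) for d in zip(*deltas)]
            self.models[name] = [m + d
                                 for m, d in zip(self.models[name], mean_delta)]
        self.corrections.clear()
```

After a rebuild, `self.models` serves as the stored set for further recognition, mirroring the final step of the claim.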
24. A system for updating speech models for speech recognition, comprising:
plural user processors each programmed to:
identify acoustic subword data for a predetermined set of utterances from a class of users, said utterances differing from a predetermined set of stored speech models by at least a predetermined amount;
collect said identified acoustic subword data for similar utterances from said class of users; and
correct said predetermined set of stored speech models as a function of the collected acoustic subword data so that the corrected speech models are a closer match to said utterances than said predetermined set of stored speech models; and
a central processor, programmed to update said predetermined set of speech models at user processors with said corrected speech models for subsequent speech recognition of utterances from said class of users.
Dependent claims: 28, 29, 30, 33, 34, 36, 37, 38, 40, 41, 42, 44
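The division of work in claim 24 between plural user processors and a central processor can be sketched in a single process as shown below. This is an illustrative assumption, not the patented system: acoustic subword data is represented as a plain feature vector, difference is Euclidean distance, each site corrects a model by averaging its collected data, and the central processor averages the per-site corrections before distributing them. The classes `UserProcessor` and `CentralProcessor` and their methods are hypothetical, and no real network transport is modeled.

```python
import math


class UserProcessor:
    """One user site: identifies, collects, and corrects locally."""

    def __init__(self, models, threshold):
        self.models = dict(models)
        self.threshold = threshold
        self.collected = {}  # label -> list of outlying feature vectors

    def hear(self, label, features):
        # Identify and collect data that differs by at least the threshold.
        if math.dist(features, self.models[label]) >= self.threshold:
            self.collected.setdefault(label, []).append(features)

    def corrected_models(self):
        # Correct: per-dimension mean of the collected data for each label.
        return {label: [sum(dim) / len(samples) for dim in zip(*samples)]
                for label, samples in self.collected.items()}


class CentralProcessor:
    """Merges site corrections and updates every user processor."""

    def __init__(self, sites):
        self.sites = sites

    def update_all(self):
        proposals = {}
        for site in self.sites:
            for label, model in site.corrected_models().items():
                proposals.setdefault(label, []).append(model)
        # Merge by averaging the corrected models proposed by the sites.
        merged = {label: [sum(dim) / len(ms) for dim in zip(*ms)]
                  for label, ms in proposals.items()}
        for site in self.sites:
            site.models.update(merged)
            site.collected.clear()
        return merged
```

In a deployment along the lines the abstract describes, the call from `CentralProcessor` to each site would travel over a communications channel such as the Internet; here it is a direct method call for brevity.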
32. A system for building speech models for recognizing speech of users of a particular class, comprising:
plural user processors, each programmed to:
sense an utterance from a user;
determine a best match of said utterance to a stored speech model; and
collect data from users of said particular class utterance that differ from said stored best match speech model by at least a predetermined amount;
a central processor programmed and coupled to the plural processors for:
registering users in accordance with predetermined criteria that characterize the speech of said particular class of users; and
retraining said speech model stored at a user processor to reduce to less than said predetermined amount the difference between the retrained speech model and said identified utterances from said users of said particular class.
Specification