Recognition unit model training based on competing word and word string models
Abstract
A system for pattern-based speech recognition, e.g., a hidden Markov model (HMM) based speech recognizer using Viterbi scoring, is disclosed. The invention applies the principle of minimum recognition error rate through discriminative training. Various issues related to the special structure of HMMs are presented, and parameter update expressions for HMMs are provided.
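To make the phrase "Viterbi scoring" concrete, here is a minimal sketch for a discrete-observation HMM, working in the log domain. The function name, argument layout, and toy probabilities are illustrative assumptions and are not taken from the patent. The score returned is the log-likelihood of the single best state path, which is what a Viterbi-based recognizer would typically compare across competing models.

```python
import math

def viterbi_log_score(obs, log_init, log_trans, log_emit):
    """Return the log-likelihood of the single best state path for `obs`.

    obs       : list of observation symbol indices
    log_init  : log_init[s]     = log P(first state is s)
    log_trans : log_trans[r][s] = log P(next state is s | current state is r)
    log_emit  : log_emit[s][o]  = log P(symbol o | state s)
    """
    n_states = len(log_init)
    # Best log score of any path ending in each state after the first frame.
    best = [log_init[s] + log_emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        # Extend every best path by one frame, keeping only the best predecessor.
        best = [
            max(best[r] + log_trans[r][s] for r in range(n_states)) + log_emit[s][o]
            for s in range(n_states)
        ]
    return max(best)

# Toy two-state model (hypothetical numbers, for illustration only).
log_init = [math.log(0.6), math.log(0.4)]
log_trans = [[math.log(0.7), math.log(0.3)], [math.log(0.4), math.log(0.6)]]
log_emit = [[math.log(0.9), math.log(0.1)], [math.log(0.2), math.log(0.8)]]
print(viterbi_log_score([0, 1, 1], log_init, log_trans, log_emit))
```

Working with log probabilities avoids numerical underflow on long utterances, at the cost of replacing products with sums.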
23 Claims
1. A method of making a speech recognizer recognition unit model database based on one or more known speech signals and a set of current recognizer recognition unit models, the method comprising the steps of:
receiving a known speech signal;
generating a first recognizer scoring signal based on the known speech signal and a current recognition unit model for that signal;
generating one or more other recognizer scoring signals, each such scoring signal based on the known speech signal and another current recognition unit model;
generating a misrecognition signal based on the first and other recognizer scoring signals;
based on a value of a predetermined loss function when applied to the misrecognition signal and the known speech signal, modifying one or more of the current recognition unit models to decrease the likelihood of misrecognizing an unknown speech signal; and
storing one or more modified recognition unit models in memory.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14.

15. A speech recognizer trainer for providing a speech recognizer database based on one or more known speech signals and a set of current recognition unit models, the trainer comprising:
means for generating a first recognizer scoring signal based on the known speech signal and a current recognition unit model for that signal;
means, coupled to the means for generating a first recognizer scoring signal, for generating one or more other recognizer scoring signals, each such scoring signal based on the known speech signal and another current recognition unit model;
means, coupled to the means for generating a first and other recognizer scoring signals, for generating a misrecognition signal based on the first and other recognizer scoring signals;
means, coupled to the means for generating a misrecognition signal, for modifying one or more of the recognition unit models, based on a value of a predetermined loss function when applied to the misrecognition signal and the known speech signal, to decrease the likelihood of misrecognizing an unknown speech signal; and
means, coupled to the means for modifying, for storing one or more modified recognition unit models.
Dependent claims: 16, 17, 18, 19, 20, 21, 22.

23. A speech recognition system comprising:
a. a feature extractor for receiving an unknown speech signal and identifying features characterizing the signal;
b. a first memory means for storing current recognition unit models;
c. a second memory means for storing known speech training samples;
d. a scoring comparator, coupled to the feature extractor and the first memory means, for comparing a plurality of current recognition unit models with one or more features of the unknown speech signal to determine a comparison score for each such model;
e. a score processor, coupled to the scoring comparator, for selecting the highest comparison score and recognizing speech based on the highest score; and
f. a trainer, coupled to the first and second memory means, the trainer comprising:
  i. means for generating a first recognizer scoring signal based on a known speech signal and a current recognition unit model for that signal;
  ii. means, coupled to the means for generating a first recognizer scoring signal, for generating one or more other recognizer scoring signals, each such scoring signal based on the known speech signal and another current recognition unit model;
  iii. means, coupled to the means for generating a first and other recognizer scoring signals, for generating a misrecognition signal based on the first and other recognizer scoring signals;
  iv. means, coupled to the means for generating a misrecognition signal, for modifying one or more of the current recognition unit models, based on a value of a predetermined loss function when applied to the misrecognition signal and the known speech signal, to decrease the likelihood of misrecognizing an unknown speech signal; and
  v. means, coupled to the means for modifying, for storing one or more modified recognition unit models in the first memory means.
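The training steps recited in the claims (score the known signal against its own model and against competing models, form a misrecognition signal, apply a loss function, modify the models, store them) can be sketched as one gradient step. The claims do not fix the form of the scoring, misrecognition, or loss functions, so everything below is an assumption chosen for illustration: each recognition unit model is reduced to a mean vector, the score is a negative squared distance standing in for an HMM log-likelihood, the misrecognition measure is a smoothed comparison of the correct score against its rivals, and the loss is a sigmoid. All function names and constants are hypothetical.

```python
import math

def score(x, mean):
    """Scoring signal: negative squared distance (a stand-in for an HMM log-likelihood)."""
    return -sum((xi - mi) ** 2 for xi, mi in zip(x, mean))

def misrecognition(scores, correct, eta=2.0):
    """d = -g_correct + (1/eta) * log(average over rivals of exp(eta * g_j))."""
    rivals = [g for j, g in enumerate(scores) if j != correct]
    top = max(rivals)  # subtract the max for a numerically stable log-sum-exp
    lse = top + math.log(sum(math.exp(eta * (g - top)) for g in rivals) / len(rivals)) / eta
    return -scores[correct] + lse

def loss(d, gamma=1.0):
    """Assumed loss: a sigmoid, i.e. a smooth approximation of the 0-1 error count."""
    return 1.0 / (1.0 + math.exp(-gamma * d))

def train_step(models, x, correct, lr=0.1, eta=2.0, gamma=1.0):
    """Return (modified models, loss value) after one gradient step on the loss."""
    scores = [score(x, m) for m in models]
    l = loss(misrecognition(scores, correct, eta), gamma)
    dl_dd = gamma * l * (1.0 - l)  # derivative of the sigmoid loss w.r.t. d
    rivals = [j for j in range(len(models)) if j != correct]
    top = max(scores[j] for j in rivals)
    weights = {j: math.exp(eta * (scores[j] - top)) for j in rivals}
    total = sum(weights.values())
    updated = []
    for j, m in enumerate(models):
        # dd/dg_j is -1 for the correct model and a softmax weight for each rival.
        dd_dg = -1.0 if j == correct else weights[j] / total
        # dg_j/dmean_j = 2 * (x - mean_j); step against the loss gradient.
        updated.append([mi - lr * dl_dd * dd_dg * 2.0 * (xi - mi) for xi, mi in zip(x, m)])
    return updated, l

def recognize(models, x):
    """Score processor: pick the model with the highest comparison score."""
    return max(range(len(models)), key=lambda j: score(x, models[j]))

# Toy data: the known signal belongs to unit 0 but is initially closer to unit 1.
models = [[0.0, 0.0], [1.0, 1.0]]
x, label = [0.6, 0.6], 0
for _ in range(50):
    models, _ = train_step(models, x, label)  # "storing" = keeping the returned list
print(recognize(models, x))
```

In this sketch the correct model is pulled toward the training sample and each competing model is pushed away in proportion to how strongly it competes, which is one way to "decrease the likelihood of misrecognizing" a signal. A real system would apply the same idea to HMM parameters rather than bare mean vectors.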
Specification