Acoustic model creation method as well as acoustic model creation apparatus and speech recognition apparatus
Abstract
To provide an acoustic model that can absorb the fluctuation of a phonemic environment over an interval longer than a syllable while keeping the number of model parameters small, a phoneme-connected syllable HMM/syllable-connected HMM set is generated as follows. A phoneme-connected syllable HMM set corresponding to individual syllables is generated by combining phoneme HMMs. A preliminary experiment is conducted using the phoneme-connected syllable HMM set and training speech data. Any misrecognized syllable and the syllable preceding it are checked using the results of the preliminary experiment and syllable label data. The combination of the correct answer syllable for the misrecognized syllable and the syllable preceding the misrecognized syllable is extracted as a syllable connection. A syllable-connected HMM corresponding to this syllable connection is added into the phoneme-connected syllable HMM set. The resulting phoneme-connected syllable HMM/syllable-connected HMM set is trained using the training speech data and the syllable label data.
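The extraction step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `extract_syllable_connections`, the example syllables, and the assumption that the recognition result is already aligned one-to-one with the syllable label data (so only substitution errors occur) are all hypothetical. The preceding syllable is taken from the label data here, which is also an assumption.

```python
def extract_syllable_connections(label, recognized):
    """Return (preceding syllable, correct answer syllable) pairs.

    Hypothetical helper. Assumes `label` and `recognized` are aligned
    one-to-one, so only substitution errors are handled.
    """
    if len(label) != len(recognized):
        raise ValueError("sequences must be aligned to the same length")
    connections = []
    # Start at 1: an error on the first syllable has no preceding syllable.
    for i in range(1, len(label)):
        if recognized[i] != label[i]:
            # Pair the correct answer syllable with the syllable before it
            # (taken from the label data, an assumption of this sketch).
            connections.append((label[i - 1], label[i]))
    return connections


# Illustrative data only: "ni" was misrecognized as "mi".
label = ["ko", "n", "ni", "chi", "wa"]
recognized = ["ko", "n", "mi", "chi", "wa"]
connections = extract_syllable_connections(label, recognized)
print(connections)
```

Each returned pair names one syllable connection for which a syllable-connected HMM could be added.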
79 Citations
14 Claims
1. An acoustic model creation method to create a syllabic HMM (Hidden Markov Model) which is an acoustic model, comprising:
generating a phoneme HMM set which includes phoneme HMMs corresponding to individual phonemes;
combining the phoneme HMMs of the phoneme HMM set so as to generate an initial phoneme-connected syllable HMM set which includes initial phoneme-connected syllable HMMs corresponding to individual syllables;
training the initial phoneme-connected syllable HMM set, thereby generating a phoneme-connected syllable HMM set being the acoustic model; and
conducting a preliminary experiment for the phoneme-connected syllable HMM set by using training speech data, any misrecognized syllable and a syllable connected to the misrecognized syllable being checked using results of the preliminary experiment and syllable label data prepared in correspondence with the training speech data, a combination between a correct answer syllable for the misrecognized syllable and a syllable connected to the misrecognized syllable being extracted as a syllable connection, a syllable-connected HMM corresponding to the syllable connection being added into the phoneme-connected syllable HMM set so as to generate an initial phoneme-connected syllable HMM/syllable-connected HMM set, and then the initial phoneme-connected syllable HMM/syllable-connected HMM set being trained using the training speech data and the syllable label data, thereby generating a phoneme-connected syllable HMM/syllable-connected HMM set being the acoustic model,
the numbers of times of misrecognition of such syllable connections in the preliminary experiment results being counted, and a syllable-connected HMM corresponding to any syllable connection whose number of times of misrecognition is at least a preset number, among the syllable connections extracted using the preliminary experiment results, being made a candidate for addition into the phoneme-connected syllable HMM set.
Dependent claims: 2, 3, 4, 5, 6, 13
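The counting condition at the end of claim 1 can be sketched as follows. This is only an illustration under stated assumptions: `select_candidates`, `preset_number`, and the example pairs are hypothetical names and data, and each syllable connection is represented as a plain (connected syllable, correct answer syllable) tuple.

```python
from collections import Counter


def select_candidates(connections_per_utterance, preset_number):
    """Return syllable connections misrecognized at least `preset_number` times.

    Hypothetical helper: each inner list holds the syllable connections
    extracted from one utterance of the preliminary experiment.
    """
    counts = Counter()
    for connections in connections_per_utterance:
        counts.update(connections)
    # Keep only connections that reach the preset number of misrecognitions.
    return sorted(conn for conn, n in counts.items() if n >= preset_number)


# Illustrative data only.
extracted = [
    [("n", "ni")],
    [("n", "ni"), ("a", "ri")],
    [("n", "ni"), ("a", "ri")],
    [("to", "u")],
]
candidates = select_candidates(extracted, preset_number=2)
print(candidates)
```

Raising `preset_number` admits fewer syllable-connected HMMs, which is one way to keep the parameter count of the final set small.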
7. An acoustic model creation apparatus to create a syllable HMM (Hidden Markov Model) which is an acoustic model, comprising:
an initial phoneme-connected syllable HMM set generation device to combine phoneme HMMs trained in correspondence with individual phonemes, so as to generate an initial phoneme-connected syllable HMM set which includes initial phoneme-connected syllable HMMs corresponding to individual syllables; and
an HMM retraining device to retrain the initial phoneme-connected syllable HMM set so as to generate a phoneme-connected syllable HMM set being the acoustic model;
a preliminary experiment device to conduct a preliminary experiment which uses training speech data, for a phoneme-connected syllable HMM set;
a misrecognized-syllabic-part extraction device to check any misrecognized syllable and a syllable connected to the misrecognized syllable by using results of the preliminary experiment obtained by the preliminary experiment device and syllable label data prepared in correspondence with the training speech data, and to extract as a syllable connection, a combination between a correct answer syllable for the misrecognized syllable and a syllable connected to the misrecognized syllable;
an initial phoneme-connected syllable HMM/syllable-connected HMM set generation device to add a syllable-connected HMM which corresponds to the syllable connection extracted by the misrecognized-syllabic-part extraction device, into the phoneme-connected syllable HMM set, thereby generating an initial phoneme-connected syllable HMM/syllable-connected HMM set;
the HMM retraining device to retrain the initial phoneme-connected syllable HMM/syllable-connected HMM set generated by the initial phoneme-connected syllable HMM/syllable-connected HMM set generation device, by using the training speech data and the syllable label data, thereby generating a phoneme-connected syllable HMM/syllable-connected HMM set being the acoustic model; and
characterized in that the misrecognized-syllabic-part extraction device counts the numbers of times of misrecognition of the syllable connections in the preliminary experiment results, and that a syllable-connected HMM corresponding to any syllable connection whose number of times of misrecognition is at least a preset number, among the syllable connections extracted using the preliminary experiment results, is made a candidate for addition into the phoneme-connected syllable HMM set.
Dependent claims: 8, 9, 10, 11, 12, 14
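What the initial phoneme-connected syllable HMM set generation device does can be pictured with a small Python sketch. It assumes, for illustration only, that each phoneme HMM is a left-to-right chain of states and that a syllable HMM is formed by concatenating the chains of its phonemes. The state labels (`"k_1"` and so on), the three-state topology, and the names `phoneme_hmms`, `syllable_to_phonemes`, and `build_syllable_hmm_set` are hypothetical placeholders, not taken from the patent.

```python
def build_syllable_hmm_set(phoneme_hmms, syllable_to_phonemes):
    """Concatenate phoneme state chains into one chain per syllable.

    Hypothetical sketch: states are placeholder labels standing in for
    trained output distributions.
    """
    syllable_hmms = {}
    for syllable, phonemes in syllable_to_phonemes.items():
        states = []
        for phoneme in phonemes:
            # Append this phoneme's states after those of earlier phonemes.
            states.extend(phoneme_hmms[phoneme])
        syllable_hmms[syllable] = states
    return syllable_hmms


# Illustrative three-state phoneme HMMs (an assumed topology).
phoneme_hmms = {
    "k": ["k_1", "k_2", "k_3"],
    "a": ["a_1", "a_2", "a_3"],
}
syllable_to_phonemes = {"ka": ["k", "a"], "a": ["a"]}
syllable_hmms = build_syllable_hmm_set(phoneme_hmms, syllable_to_phonemes)
print(syllable_hmms["ka"])
```

In this picture, retraining would then update each syllable's states on the training speech data, so the copies of a phoneme used in different syllables are free to diverge.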
Specification