Speaker-independent model generation apparatus and speech recognition apparatus each equipped with means for splitting state having maximum increase in likelihood
Abstract
A speaker-independent model generation apparatus and a speech recognition apparatus are provided that require less memory capacity in the processing unit and less computation time than their conventional counterparts. A single-Gaussian HMM is generated with the Baum-Welch training algorithm based on spoken speech data from a plurality of specific speakers. A search is then made for the state having the maximum increase in likelihood when one state is split in the contextual or temporal domain. That state is split in the contextual or temporal domain corresponding to the maximum increase in likelihood. Thereafter, a single-Gaussian HMM is again generated with the Baum-Welch training algorithm, and these steps are iterated until the states within the single-Gaussian HMM can no longer be split or until a predetermined number of splits is reached. A speaker-independent HMM is thereby generated. Speech is then recognized with reference to the generated speaker-independent HMM.
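The iteration described in the abstract can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the patented implementation: observations are 1-D, each state is a dictionary from a hypothetical context label to its samples, Baum-Welch retraining is replaced by a hard-assignment maximum-likelihood refit, only contextual splits are tried (temporal splits are omitted), and candidate splits are limited to cuts of the contexts ordered by mean. The names `gaussian_loglik`, `best_split`, `grow_model`, and `VAR_FLOOR` are illustrative and do not come from the patent.

```python
import math
from statistics import fmean

VAR_FLOOR = 1e-3  # assumed variance floor so that log(var) stays finite


def gaussian_loglik(samples):
    """Log-likelihood of 1-D samples under a single Gaussian fitted to them."""
    n = len(samples)
    mu = fmean(samples)
    ss = sum((x - mu) ** 2 for x in samples)
    var = max(ss / n, VAR_FLOOR)
    return -0.5 * n * math.log(2 * math.pi * var) - ss / (2 * var)


def pooled(state):
    """All samples of a state, regardless of context label."""
    return [x for xs in state.values() for x in xs]


def best_split(state):
    """Return (gain, left, right) for the best contextual split, or None."""
    if len(state) < 2:
        return None  # a state with one context cannot be split contextually
    base = gaussian_loglik(pooled(state))
    # Simplification (assumption): order contexts by mean, try each cut point.
    contexts = sorted(state, key=lambda c: fmean(state[c]))
    best = None
    for cut in range(1, len(contexts)):
        left = {c: state[c] for c in contexts[:cut]}
        right = {c: state[c] for c in contexts[cut:]}
        gain = (gaussian_loglik(pooled(left))
                + gaussian_loglik(pooled(right)) - base)
        if best is None or gain > best[0]:
            best = (gain, left, right)
    return best


def grow_model(data_by_context, max_states):
    """Greedily split the state with the largest likelihood gain."""
    states = [dict(data_by_context)]  # initial one-state, single-Gaussian model
    while len(states) < max_states:   # stop at a predetermined model size
        candidates = [(i, best_split(s)) for i, s in enumerate(states)]
        # Assumption: a split must strictly increase the likelihood.
        candidates = [(i, c) for i, c in candidates if c and c[0] > 0]
        if not candidates:
            break                     # no state can be split any further
        i, (_, left, right) = max(candidates, key=lambda ic: ic[1][0])
        states[i:i + 1] = [left, right]  # replace the state by its two halves
    return states


if __name__ == "__main__":
    data = {"a": [0.0, 0.1, -0.1], "b": [0.2, 0.0, 0.1], "c": [5.0, 5.1, 4.9]}
    for state in grow_model(data, max_states=2):
        print(sorted(state))
```

In this toy, the per-state Gaussian is refitted implicitly each time `gaussian_loglik` is called, which stands in for the retraining pass performed after every split in the described procedure.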
10 Claims
1. A speaker-independent model generation apparatus comprising:
model generation means for generating a hidden Markov model of a single Gaussian distribution using a Baum-Welch training algorithm based on spoken speech data from a plurality of specific speakers, and thereafter for generating a speaker-independent hidden Markov model by iterations of splitting a state having a maximum increase in likelihood upon splitting one state in contextual or temporal domains on the hidden Markov model of the single Gaussian distribution.
Dependent claims: 2, 3, 4
5. A speech recognition apparatus comprising:
model generation means for generating a hidden Markov model of a single Gaussian distribution using a Baum-Welch training algorithm based on spoken speech data from a plurality of specific speakers, and thereafter for generating a speaker-independent hidden Markov model by iterations of splitting a state having a maximum increase in likelihood upon splitting one state in contextual or temporal domains on the hidden Markov model of the single Gaussian distribution; and
speech recognition means for, in response to an input speech signal of a spoken speech, recognizing the spoken speech with reference to the speaker-independent hidden Markov model generated by said model generation means.
6. A speech recognition apparatus comprising:
initial model generation means for generating a hidden Markov model of a single Gaussian distribution using the Baum-Welch training algorithm based on spoken speech data from a plurality of specific speakers;
search means for searching a state having a maximum increase in likelihood upon splitting one state in contextual or temporal domains on the hidden Markov model of the single Gaussian distribution generated by said initial model generation means;
generation means for splitting the state having the maximum increase in likelihood searched by said search means, in a contextual or temporal domain corresponding to the maximum increase in likelihood and thereafter for generating a hidden Markov model of a single Gaussian distribution using the Baum-Welch training algorithm;
control means for generating a speaker-independent hidden Markov model by iterating a process of said search means and a process of said generation means until at least one of the following conditions is satisfied:
(a) the states within the hidden Markov model of the single Gaussian distribution can no longer be split; and
(b) a number of states within the hidden Markov model of the single Gaussian distribution reaches a predetermined number of splits; and
speech recognition means for, in response to an input speech signal of a spoken speech, recognizing the spoken speech with reference to the speaker-independent hidden Markov model generated by said control means.
Dependent claims: 7, 8
9. A method for generating a speaker-independent model, including the following steps:
generating a hidden Markov model of a single Gaussian distribution using a Baum-Welch training algorithm based on spoken speech data from a plurality of specific speakers; and
thereafter, generating a speaker-independent hidden Markov model by iterations of splitting a state having a maximum increase in likelihood upon splitting one state in contextual or temporal domains on the hidden Markov model of the single Gaussian distribution.
10. A method for generating a speaker-independent model, including the following steps:
generating an initial hidden Markov model of a single Gaussian distribution using the Baum-Welch training algorithm based on spoken speech data from a plurality of specific speakers;
searching a state having a maximum increase in likelihood upon splitting one state in contextual or temporal domains on the generated initial hidden Markov model of the single Gaussian distribution;
splitting the searched state having the maximum increase in likelihood, in a contextual or temporal domain corresponding to the maximum increase in likelihood, and thereafter, generating a hidden Markov model of a single Gaussian distribution using the Baum-Welch training algorithm; and
generating a speaker-independent hidden Markov model by iterating said searching step and said splitting and generating step until at least one of the following conditions is satisfied:
(a) the states within the hidden Markov model of the single Gaussian distribution can no longer be split; and
(b) a number of states within the hidden Markov model of the single Gaussian distribution reaches a predetermined number of splits.
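As a worked illustration of the "increase in likelihood" criterion recited in the claims, consider a simplified case that the claims themselves do not spell out: 1-D observations, hard assignment of samples to states, and maximum-likelihood variance estimates. The symbols below ($N$, $N_1$, $N_2$, $\hat\sigma^2$, $\hat\sigma_1^2$, $\hat\sigma_2^2$, $\Delta L$) are assumptions introduced for this sketch. A state holding $N$ samples with fitted variance $\hat\sigma^2$ is split into two states holding $N_1$ and $N_2$ samples with fitted variances $\hat\sigma_1^2$ and $\hat\sigma_2^2$.

```latex
\begin{align*}
L &= -\frac{N}{2}\left(\log 2\pi\hat\sigma^{2} + 1\right),
\qquad N = N_1 + N_2, \\
\Delta L &= L_1 + L_2 - L
 = \frac{1}{2}\left[\, N\log\hat\sigma^{2}
   - N_1\log\hat\sigma_1^{2}
   - N_2\log\hat\sigma_2^{2} \,\right].
\end{align*}
```

The constant terms cancel because $N = N_1 + N_2$, so under these assumptions the gain depends only on how much the split reduces the fitted variances. A training procedure based on Baum-Welch would use soft occupancy counts in place of the hard counts used here.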
Specification