Single distribution and mixed distribution model conversion in speech recognition method, apparatus, and computer readable medium
Abstract
A process for removing, in real time, additive noise caused by ambient conditions, in order to improve the precision of real-time speech recognition, includes: converting a selected speech model distribution into a representative distribution; combining a noise model with the converted speech model to generate a noise superimposed speech model; performing a first likelihood calculation to recognize an input speech by using the noise superimposed speech model; converting the noise superimposed speech model to a noise adapted distribution that retains the relationship of the selected speech model; and performing a second likelihood calculation to recognize the input speech by using the noise adapted distribution.
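A minimal sketch of the first step named in the abstract, converting a mixed distribution into one representative distribution. The abstract does not say how the representative distribution is obtained; moment matching over a single feature dimension is assumed here, and `collapse_mixture` is a hypothetical name.

```python
def collapse_mixture(weights, means, variances):
    """Collapse a 1-D Gaussian mixture into one Gaussian by moment matching.

    Assumption: the representative distribution shares the mixture's overall
    mean and variance. The source does not specify this choice.
    """
    total = sum(weights)
    w = [x / total for x in weights]  # normalize in case weights do not sum to 1
    mean = sum(wi * mi for wi, mi in zip(w, means))
    # Law of total variance: within-component plus between-component spread.
    var = sum(wi * (vi + (mi - mean) ** 2)
              for wi, mi, vi in zip(w, means, variances))
    return mean, var


# Two equally weighted components at 0 and 2, each with unit variance.
mean, var = collapse_mixture([0.5, 0.5], [0.0, 2.0], [1.0, 1.0])
print(mean, var)  # 1.0 2.0
```

A real recognizer would apply this per HMM state and per feature dimension; the scalar case is shown only to keep the arithmetic visible.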
22 Claims
1. A speech recognition method using hidden Markov models, comprising the steps of:
converting low noise mixed distribution Markov models into single distribution models;
combining a noise model with the single distribution models to generate noise superimposed speech models;
converting at least some of the noise superimposed speech models to noise adapted mixed distribution models that retain the relationship of parameters in low noise mixed distribution Markov models;
calculating a first set of likelihoods that an input speech utterance corresponds to the noise adapted mixed distribution models;
selecting at least one noise adapted mixed distribution model at least partially based on the first set of likelihoods; and
associating the input speech utterance with the low noise mixed distribution Markov models corresponding to the noise adapted mixed distribution models selected based on the first likelihoods.
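The combining and second converting steps of claim 1 can be sketched as follows. The claim does not fix a formula, so two assumptions are made: noise is combined additively in a linear spectral domain where means and variances of independent sources add, and "retaining the relationship of parameters" is read as keeping each mixture component's offset from the single distribution, measured in standard deviations. The names `combine_with_noise` and `adapt_mixture` are hypothetical.

```python
import math


def combine_with_noise(speech_mean, speech_var, noise_mean, noise_var):
    """Noise superimposed single distribution.

    Assumption: independent additive noise in a linear spectral domain,
    so means and variances simply add.
    """
    return speech_mean + noise_mean, speech_var + noise_var


def adapt_mixture(means, variances, clean_single, noisy_single):
    """Map clean mixture components onto the noise superimposed distribution.

    Assumption: each component keeps its position relative to the single
    distribution, rescaled by the ratio of standard deviations.
    Mixture weights are left unchanged.
    """
    clean_mean, clean_var = clean_single
    noisy_mean, noisy_var = noisy_single
    scale = math.sqrt(noisy_var / clean_var)
    adapted_means = [noisy_mean + (m - clean_mean) * scale for m in means]
    adapted_vars = [v * scale ** 2 for v in variances]
    return adapted_means, adapted_vars


clean_single = (1.0, 2.0)  # single distribution of the clean mixture
noisy_single = combine_with_noise(*clean_single, noise_mean=3.0, noise_var=6.0)
adapted_means, adapted_vars = adapt_mixture(
    [0.0, 2.0], [1.0, 1.0], clean_single, noisy_single)
```

Under these assumptions, if `clean_single` is the moment-matched mean and variance of the clean mixture, the adapted mixture has the mean and variance of `noisy_single`, which is one way to make "retain the relationship" concrete.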
2. A method according to claim 1, further comprising the steps of:
holding a likelihood value as a result of said first likelihood calculation at a first time; and
deciding that a result in which the held likelihood value was added in a searching process at a second time is a likelihood value of a search result at the second time.
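One possible reading of the holding and deciding steps is a running score kept per hypothesis: the value held at the first time is added to during the search at the second time, and the sum is taken as the search result. The sketch below assumes log-domain likelihoods, so that adding is the natural combination; the class `HeldLikelihoods` and its method names are hypothetical.

```python
class HeldLikelihoods:
    """Holds, per hypothesis, the log-likelihood from the previous time."""

    def __init__(self):
        self._held = {}

    def hold(self, hypothesis, loglik):
        # First time: keep the result of the likelihood calculation.
        self._held[hypothesis] = loglik

    def search_result(self, hypothesis, step_loglik):
        # Second time: the held value plus the new contribution is taken
        # as the likelihood of the search result, and is held in turn.
        total = self._held[hypothesis] + step_loglik
        self._held[hypothesis] = total
        return total


store = HeldLikelihoods()
store.hold("yes", -12.5)
score_t2 = store.search_result("yes", -3.0)
print(score_t2)  # -15.5
```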
3. A method according to claim 1, further comprising the steps of:
calculating a second set of likelihoods that an input speech utterance corresponds to the noise superimposed speech models; and
selecting a subset of the noise superimposed speech models based on the second set of likelihoods, wherein said noise-adapted-mixed-distribution-models converting step comprises the step of converting the models in said subset to noise adapted mixed distribution models.
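The two-stage scheme of claims 1 and 3 can be sketched as a cheap pass over the noise superimposed single distributions (the "second set" of likelihoods), followed by conversion and rescoring of only the best few candidates (the "first set"). The sketch assumes scalar features, one Gaussian or mixture per word, and frame-independent scoring; all function names, the `keep` parameter, and the toy vocabulary are hypothetical.

```python
import math


def gaussian_loglik(x, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def mixture_loglik(x, weights, means, variances):
    terms = [math.log(w) + gaussian_loglik(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def select_subset(frames, noisy_singles, keep):
    """Score every noise superimposed single model and keep the best few."""
    scores = {name: sum(gaussian_loglik(x, mean, var) for x in frames)
              for name, (mean, var) in noisy_singles.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep], scores


def rescore(frames, subset, convert):
    """Convert only the selected models to mixtures and score them."""
    rescored = {}
    for name in subset:
        weights, means, variances = convert(name)
        rescored[name] = sum(mixture_loglik(x, weights, means, variances)
                             for x in frames)
    return rescored


frames = [3.9, 4.1, 4.0]
noisy_singles = {"yes": (4.0, 8.0), "no": (10.0, 8.0), "stop": (-5.0, 8.0)}
# Stand-in for the conversion step: a lookup of already adapted mixtures.
adapted = {"yes": ([0.5, 0.5], [2.0, 6.0], [4.0, 4.0]),
           "no": ([0.5, 0.5], [8.0, 12.0], [4.0, 4.0])}

subset, second_scores = select_subset(frames, noisy_singles, keep=2)
first_scores = rescore(frames, subset, adapted.__getitem__)
best = max(first_scores, key=first_scores.get)
```

The point of the design is cost: single distributions are cheap to combine with noise and to score, so the more expensive mixture conversion is spent only on candidates that survive the first pass.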
4. A method according to claim 3, further comprising the steps of:
sorting a result of said first likelihood calculation and said second likelihood calculation.
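The sorting step could look like the sketch below, under the assumption that candidates rescored with mixture models use that score while the remaining candidates fall back to their single-distribution score. The claim does not state how the two result sets are merged, so this merge rule and the name `sort_results` are hypothetical.

```python
def sort_results(first_scores, second_scores):
    """Return (candidate, log-likelihood) pairs, best first.

    Assumption: a mixture-model score, when present, replaces the
    single-distribution score for the same candidate.
    """
    merged = dict(second_scores)
    merged.update(first_scores)
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)


ranking = sort_results(
    first_scores={"yes": -5.1, "no": -9.4},
    second_scores={"yes": -6.0, "no": -8.7, "stop": -40.2},
)
print([name for name, _ in ranking])  # ['yes', 'no', 'stop']
```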
5. A method according to claim 1, further comprising the step of displaying recognition candidate characters on a display.
6. A method according to claim 1, further comprising the step of printing recognition candidate characters by printing means.
7. A method according to claim 1, further comprising the step of recognizing a speech inputted from a microphone.
8. A method according to claim 1, further comprising the step of inputting speech through a communication line.
9. A method according to claim 1, further comprising the step of performing an operation control of an application in accordance with a recognition result.
10. A speech recognition apparatus using hidden Markov models, comprising:
first converting means for converting low noise mixed distribution Markov models into single distribution models;
combining means for combining a noise model with the single distribution models to generate noise superimposed speech models;
second converting means for converting at least some of the noise superimposed speech models to noise adapted mixed distribution models that retain the relationship of parameters in low noise mixed distribution Markov models;
calculating means for calculating a first set of likelihoods that an input speech utterance corresponds to the noise adapted mixed distribution models;
means for selecting at least one noise adapted mixed distribution model at least partially based on the first set of likelihoods calculated by said calculating means; and
associating means for associating the input speech utterance with the low noise mixed distribution Markov models corresponding to the noise adapted mixed distribution models selected based on the first likelihoods by said selecting means.
11. An apparatus according to claim 10, further comprising:
holding means for holding a likelihood value as a result of said first likelihood calculation at a first time; and
likelihood value deciding means for deciding that a result in which the held likelihood value was added in a searching process at a second time is a likelihood value of a search result at the second time.
12. An apparatus according to claim 10, further comprising:
calculating means for calculating a second set of likelihoods that an input speech utterance corresponds to the noise superimposed speech models; and
selecting means for selecting a subset of the noise superimposed speech models based on the second set of likelihoods, wherein said noise-adapted-mixed-distribution-models converting means comprises means for converting the models in said subset to noise adapted mixed distribution models.
13. An apparatus according to claim 12, further comprising:
sorting means for sorting a result of said first likelihood calculation and said second likelihood calculation.
14. An apparatus according to claim 10, further comprising display means for displaying recognition candidate characters.
15. An apparatus according to claim 10, further comprising printing means for printing recognition candidate characters.
16. An apparatus according to claim 10, further comprising a microphone for inputting a speech.
17. An apparatus according to claim 10, further comprising a communication line interface for inputting speech.
18. An apparatus according to claim 10, further comprising control means for performing an operation control of an application in accordance with a recognition result.
19. A computer-readable medium encoded with a program using hidden Markov models, said program comprising the steps of:
converting low noise mixed distribution Markov models into single distribution models;
combining a noise model with the single distribution models to generate noise superimposed speech models;
converting at least some of the noise superimposed speech models to noise adapted mixed distribution models that retain the relationship of parameters in low noise mixed distribution Markov models;
calculating a first set of likelihoods that an input speech utterance corresponds to the noise adapted mixed distribution models;
selecting at least one noise adapted mixed distribution model at least partially based on the first set of likelihoods; and
associating the input speech utterance with the low noise mixed distribution Markov models corresponding to the noise adapted mixed distribution models selected based on the first likelihoods.
20. A medium according to claim 19, said program further comprising the steps of:
holding a likelihood value as a result of said first likelihood calculation at a first time; and
deciding that a result in which the held likelihood value was added in a searching process at a second time is a likelihood value of a search result at the second time.
21. A medium according to claim 19, said program further comprising the steps of:
calculating a second set of likelihoods that an input speech utterance corresponds to the noise superimposed speech models; and
selecting a subset of the noise superimposed speech models based on the second set of likelihoods, wherein said noise-adapted-mixed-distribution-models converting step comprises the step of converting the models in said subset to noise adapted mixed distribution models.
22. A computer-readable medium according to claim 21, further comprising the steps of:
sorting a result of said first likelihood calculation and said second likelihood calculation.
Specification