Speech recognition device and speech recognition method
Abstract
Each word to be recognized is represented by gender-specific hidden Markov models that are stored in a ROM 6 along with output probability functions and preset transition probabilities. A speech recognizer 4 determines an occurrence probability of a feature parameter sequence detected by a feature value detector 3 using the hidden Markov models. The speech recognizer 4 determines the occurrence probability by giving each word a state sequence of one hidden Markov model common to the gender-specific hidden Markov models, multiplying each preset pair of an output probability function value and a transition probability together among the output probability functions and transition probabilities stored in the ROM 6, selecting the largest product as the probability of each state of the common hidden Markov model, determining the occurrence probability based on the selected product, and recognizing the input speech based on the occurrence probability thus determined.
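The per-state selection described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`occurrence_probability`, `recognize`), the toy vocabulary `VOCAB`, the one-dimensional feature values, the Gaussian output probability function, and the choice to attach the output probability to the destination state are all assumptions made for illustration. What it shows is a Viterbi-style score over one state sequence shared by the gender-specific models, where each step keeps the largest product of a transition probability and an output probability across the model types.

```python
import math


def gaussian(x, mean, var):
    """1-D Gaussian density, used here as a stand-in output probability function."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def occurrence_probability(obs, types):
    """Viterbi-style score of `obs` on a state sequence shared by all model types.

    `types` is a list of parameter sets (e.g. one male, one female), each a dict
    with 'trans' (state-by-state transition matrix), 'mean' and 'var' (per state).
    All parameter sets are assumed to have the same number of states.
    """
    n = len(types[0]["mean"])

    def out(m, j, x):
        return gaussian(x, m["mean"][j], m["var"][j])

    # Assumption: every path starts in state 0.
    delta = [0.0] * n
    delta[0] = max(out(m, 0, obs[0]) for m in types)

    for x in obs[1:]:
        new = [0.0] * n
        for j in range(n):
            # For each move i -> j, keep the largest (transition x output)
            # product over the model types, then the best predecessor i.
            new[j] = max(
                delta[i] * max(m["trans"][i][j] * out(m, j, x) for m in types)
                for i in range(n)
            )
        delta = new
    return delta[-1]  # score of ending in the last state


def recognize(obs, vocabulary):
    """Return the word whose common model gives the largest occurrence probability."""
    return max(vocabulary, key=lambda w: occurrence_probability(obs, vocabulary[w]))


# Hypothetical two-word vocabulary, two states per word, two model types per word.
TRANS = [[0.5, 0.5], [0.0, 1.0]]
VOCAB = {
    "high": [
        {"trans": TRANS, "mean": [5.0, 6.0], "var": [1.0, 1.0]},  # "male" type
        {"trans": TRANS, "mean": [7.0, 8.0], "var": [1.0, 1.0]},  # "female" type
    ],
    "low": [
        {"trans": TRANS, "mean": [0.0, 1.0], "var": [1.0, 1.0]},
        {"trans": TRANS, "mean": [2.0, 3.0], "var": [1.0, 1.0]},
    ],
}

if __name__ == "__main__":
    print(recognize([7.1, 7.9, 8.0], VOCAB))
    print(recognize([0.1, 0.9, 1.0], VOCAB))
```

Because the maximum is taken per state rather than per model, the score on the common state sequence is never lower than the score of either gender-specific model evaluated alone, and only one state sequence per word has to be tracked during decoding.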
10 Claims
1. A speech recognition device for recognizing speech of unspecified speakers using hidden Markov models, the device comprising:
detection means for detecting feature parameters of input speech;
data storage means for storing transition probabilities and output probability functions, which use, as arguments, said feature parameters for multiple predetermined types of hidden Markov models, the models each representing a plurality of predetermined words; and
recognition means for determining the occurrence probability that a sequence of said feature parameters will occur using said hidden Markov models, said recognition means assigning each of said words a state sequence of one hidden Markov model common to said multiple types of hidden Markov models, the occurrence probability determined by selecting a largest product of a transition probability with an output probability function associated with each state of said common hidden Markov model, wherein the input speech is recognized based on the occurrence probability thus determined.
Dependent claims: 2, 3, 9
4. A speech recognition device for recognizing speech of unspecified speakers using hidden Markov models, said device comprising:
detection means for detecting feature parameters of input speech;
data storage means for storing transition probabilities and output probability functions, which use, as arguments, said feature parameters for hidden Markov models (HMMs), each of said HMMs representing a plurality of predetermined words and for a plurality of hidden Markov models which partially express differences in pronunciations of each of words which are allowed multiple pronunciations out of said predetermined words; and
recognition means for determining the occurrence probability that a sequence of said feature parameters will occur using said hidden Markov models, said recognition means sharing a state sequence of one hidden Markov model among said plurality of hidden Markov models for partial expression, and the occurrence probability determined by selecting a largest product of a transition probability with an output probability function associated with each state of said plurality of hidden Markov models for partial expression, wherein the input speech is recognized based on the occurrence probability thus determined.
5. A method for recognizing input speech:
detecting feature parameters of said input speech;
retrieving transition probabilities and output probability functions, said transition probabilities and said output probability functions associated with multiple predetermined types of hidden Markov models which represent each of a plurality of predetermined words;
determining the occurrence probability that a sequence of said feature parameters will occur using said hidden Markov models, wherein each of said words is represented by a hidden Markov model common to said multiple types of hidden Markov models;
multiplying each preset pair of an output probability function value and a transition probability together among the output probability functions and transition probabilities and selecting the largest product as the probability of each state of said common hidden Markov model; and
recognizing the input speech by selecting the hidden Markov model having the largest occurrence probability.
Dependent claims: 6, 7, 10
8. A speech recognition method comprising the steps of:
detecting feature parameters of said input speech;
retrieving transition probabilities and output probability functions, said transition probabilities and said output probability functions associated with a plurality of hidden Markov models which represent each of a plurality of predetermined words and with a plurality of hidden Markov models that partially express differences in pronunciations of each of said words which are allowed multiple pronunciations out of said predetermined words;
determining the occurrence probability that a sequence of said feature parameters will occur using said hidden Markov models, wherein for each of said words allowing multiple pronunciations a common hidden Markov model shares a state sequence among said plurality of hidden Markov models for partial expression;
multiplying each pair of an output probability function value and a transition probability together among the output probability functions and transition probabilities characterizing said plurality of hidden Markov models for partial expression and selecting the largest product as the probability of each state of said common hidden Markov model; and
recognizing the input speech by selecting the hidden Markov model having the largest occurrence probability.
Specification