Speech recognizer
First Claim
1. A speech recognizer implemented in a programmed processor and configured to recognize input speech by analyzing the input speech at predetermined time intervals, extracting feature vectors and calculating a likelihood value of a reference pattern model of each category in a plurality of categories to be recognized, comprising:
receiving means for receiving the input speech;
extracting means for extracting the feature vectors from the input speech;
first probability calculation means for calculating a probability that a first hidden Markov model having an internal state number as an output symbol for each category to be recognized outputs an internal state number and for outputting a series of the internal state numbers;
second probability calculation means for calculating a probability that a second hidden Markov model having transition probabilities of the internal state numbers and feature vector output probability distributions for each of the respective internal state numbers outputs a feature vector and for outputting a series of the feature vectors; and
likelihood probability calculation means for calculating a probability of a reference pattern model of each category to be recognized by using the outputs of the first and second probability calculation means, wherein the reference pattern model corresponding to a highest probability is output as a recognition result of the input speech.
Abstract
A speech recognizer recognizes input speech by analyzing the input speech at a predetermined time interval, extracting feature vectors, and calculating a likelihood value of a reference pattern model of each category to be recognized. A first probability calculation means calculates a probability that a first hidden Markov model, having an internal state number as an output symbol for each of the categories to be recognized, outputs an internal state number. A second probability calculation means calculates a probability that a second hidden Markov model, having transition probabilities of the internal state numbers and a feature vector output probability distribution for each of the respective internal state numbers, outputs a feature vector. A likelihood value calculation means calculates the likelihood value of a reference pattern model of the category to be recognized by using the outputs of the first and second probability calculation means.
13 Citations
6 Claims
3. A speech recognizer comprising:
receiving means for receiving input speech;
extracting means for extracting at least one feature vector from the input speech;
a first HMM parameter memory for storing, as first HMM parameters of individual words $w$, transition probability $a_{mn}^{(1)}$ ($m, n = 1, \ldots, N_w$) from state $m$ to state $n$, and probability $b_{nk}^{(1)}$ ($k = 1, \ldots, K$) of outputting output symbol $s_k$ in state $n$, wherein $N_w$ represents a total number of states in the word $w$ and $K$ represents a total number of internal state numbers;
a second HMM parameter memory for storing, as second HMM parameters common to all of the words, parameters of distribution functions representing transition probability $a_{jk}^{(2)}$ ($j, k = 1, \ldots, K$) from internal state $j$ to internal state $k$ and output probability $b_k^{(2)}(o_t)$ of outputting feature vector $o_t$ in internal state $k$;
a work memory for tentatively storing the output probability and array variables $A(w,t,n,k)$ representing a forward probability when calculating the likelihood value of each word to be recognized with a reference pattern model, where $t$ represents an instant in time; and
recognition processing means implemented in a programmed processor, the recognition processing means including:
first calculation means for calculating the output probability $b_k^{(2)}(o_t)$ of outputting the feature vector $o_t$ in the internal state $k$ on the basis of the output probability distribution parameters stored in the second HMM parameter memory and storing the output probability as variable $B$ in the work memory, the feature vector $o_t$ corresponding to the at least one feature vector extracted by the extracting means;
clearing means for clearing the array variables $A(w,t,n,k)$ in the work memory for calculating the forward probability;
second calculation means for calculating a contribution to the forward probability when the feature vector $o_t$ is output through transition from state $m$ and internal state $j$ to state $n$ and internal state $k$, from the parameters stored in the first and second HMM parameter memories and work memory and adding the forward probability to the array variables $A(w,t,n,k)$ representing the forward probability;
means for comparing the forward probability $A(w,T,n,k)$ for each word $w$ stored in the work memory successively to obtain one of the words $w$ having a maximum comparison value, where $T$ represents a total number of time intervals of the input speech; and
outputting means for outputting the one word having the maximum comparison value as a recognition result.
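The recursion recited in claim 3 can be illustrated with a minimal Python sketch. It is a reading of the claim under stated assumptions, not the patent's own implementation. Assumptions: the contribution added to A(w,t,n,k) is the product of the previous forward probability with the four stored probabilities; the output distributions are diagonal Gaussians; each word carries a hypothetical `start` table over (state, internal state) pairs for the first frame; and the word score sums the final-frame entries, since the claim does not say which entries of A(w,T,n,k) are compared. All function and key names are illustrative.

```python
import math

def output_probability(o, mean, var):
    """Diagonal-Gaussian density, an assumed form of b_k^(2)(o_t)."""
    p = 1.0
    for x, mu, v in zip(o, mean, var):
        p *= math.exp(-0.5 * (x - mu) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
    return p

def forward_score(a1, b1, a2, B, start):
    """Forward probability of one word.

    a1[m][n]: first-HMM transition probability (state m to state n)
    b1[n][k]: first-HMM probability of emitting internal state number k in state n
    a2[j][k]: second-HMM transition probability (internal state j to k)
    B[t][k]:  precomputed output probability of frame t in internal state k
    start[n][k]: assumed initial weight of the pair (n, k); not in the claim
    """
    T, N, K = len(B), len(a1), len(a2)
    # Clearing step: every array variable starts at zero.
    A = [[[0.0] * K for _ in range(N)] for _ in range(T)]
    for n in range(N):
        for k in range(K):
            A[0][n][k] = start[n][k] * B[0][k]
    for t in range(1, T):
        for m in range(N):
            for j in range(K):
                prev = A[t - 1][m][j]
                if prev == 0.0:
                    continue
                for n in range(N):
                    for k in range(K):
                        # Contribution of the move (m, j) -> (n, k) at frame t.
                        A[t][n][k] += prev * a1[m][n] * b1[n][k] * a2[j][k] * B[t][k]
    # Assumed scoring rule: sum every (n, k) entry at the final frame.
    return sum(A[T - 1][n][k] for n in range(N) for k in range(K))

def recognize(words, a2, B):
    """Return the word with the largest forward score, plus all scores."""
    scores = {w: forward_score(p["a1"], p["b1"], a2, B, p["start"])
              for w, p in words.items()}
    return max(scores, key=scores.get), scores
```

Because `a2` and `B` are shared by all words, `B` is computed once per utterance and reused for every word, which matches the claim's separation of word-specific first HMM parameters from common second HMM parameters.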
6. A word speech recognizer for recognizing words from a speech signal, comprising:
an input unit for inputting the speech signal;
a feature vector extraction unit connected to the input unit and configured to sample the speech signal, digitize the sampled speech signal, and convert the digitized sampled speech signal into at least one feature vector;
a first Hidden Markov Model (HMM) parameter memory configured to store first HMM parameters of a plurality of words, transition probabilities $a_{mn}$ for transitioning from state $m$ to state $n$, wherein $m$ and $n$ are integers and wherein there are $k$ possible states, $k$ being an integer greater than or equal to $m$ and $n$, the first HMM parameter memory being configured to store probabilities $b_{rs}$ of outputting a symbol $s$ in a state $r$, wherein $r$ and $s$ are integers;
a second HMM parameter memory configured to store second HMM parameters common to all of the plurality of words, which correspond to distribution functions representing transition probabilities $a_{jk}$ from internal state $j$ to internal state $k$, and which correspond to output probabilities $b_k(o_t)$ of outputting feature vector $o_t$ in the internal state $k$;
a work memory for temporarily storing the output probabilities, the work memory also temporarily storing a forward probability and array variables associated with the forward probability; and
a processor coupled to the feature vector extraction unit, the first HMM parameter memory, the second HMM parameter memory, and the work memory, the processor comprising:
a calculating unit configured to receive the feature vector $o_1$ from the feature vector extraction unit and to calculate the output probability $b_k(o_1)$ of outputting the feature vector $o_1$ in the internal state $b_k$ based on the second HMM parameters stored in the second HMM parameter memory, the output probability $b_k(o_1)$ being stored by the processor in the work memory;
a clearing unit configured to clear the forward probabilities stored in the work memory;
a forward probability calculating unit configured to calculate the forward probability for each of the plurality of words when the feature vector $o_1$ is output through transition from the state $m$ and the internal state $j$ to the state $n$ and the internal state $k$, the forward probability being calculated based on the first HMM parameters stored in the first HMM parameter memory and the second HMM parameters stored in the second HMM parameter memory; and
a determining unit for determining a maximum probability of the forward probabilities calculated for each of the plurality of words, wherein the corresponding word having the maximum probability is output as a recognized word of the speech signal.
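The feature vector extraction unit of claim 6 (sample, digitize, convert) can be pictured with the following Python sketch. The claim does not name a feature type, frame length, or bit depth, so everything concrete here is an assumption chosen for illustration: 16-bit quantization, 160-sample frames advanced by 80 samples, and a two-element feature vector of log energy and zero-crossing rate. The function names are hypothetical.

```python
import math

def digitize(samples, bits=16):
    """Quantize floats in [-1, 1] to signed integers (16-bit is an assumption)."""
    top = 2 ** (bits - 1) - 1
    return [max(-top, min(top, round(s * top))) for s in samples]

def feature_vectors(pcm, frame_len=160, hop=80):
    """Cut the digitized signal into fixed-interval frames, one feature vector each.

    The features (log energy, zero-crossing rate) are illustrative stand-ins;
    the claim only requires "at least one feature vector".
    """
    feats = []
    for start in range(0, len(pcm) - frame_len + 1, hop):
        frame = pcm[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        feats.append([math.log(energy + 1.0), crossings / (frame_len - 1)])
    return feats
```

Each returned vector plays the role of one observation passed to the calculating unit, one per analysis interval.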
Specification