Speech recognizer

US 5,737,488 A
Filed: 06/07/1995
Issued: 04/07/1998
Est. Priority Date: 06/13/1994
Status: Expired due to Fees

Abstract

Input speech is recognized by analyzing it at a predetermined time interval, extracting feature vectors, and calculating a likelihood value of a reference pattern model for each category to be recognized. A first probability calculation means calculates the probability that a first hidden Markov model, which has internal state numbers as output symbols for each category to be recognized, outputs an internal state number. A second probability calculation means calculates the probability that a second hidden Markov model, which has transition probabilities between internal state numbers and a feature vector output probability distribution for each internal state number, outputs a feature vector. A likelihood value calculation means calculates the likelihood value of the reference pattern model of each category to be recognized from the outputs of the first and second probability calculation means.
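
The way the two models combine can be written compactly. The following is a sketch of one plausible reading, not a formula quoted from the patent. It assumes that the likelihood of a category w is the sum, over all first-HMM state sequences n_1, ..., n_T and all internal state number sequences k_1, ..., k_T, of the product of the four quantities the two models supply. The symbols follow the notation of claim 3 in this record; the treatment of the initial terms (n_0, k_0) is an assumption.

```latex
% Assumed joint likelihood of the feature vector series O = (o_1, ..., o_T) for category w
P(O \mid w) = \sum_{n_1,\dots,n_T} \; \sum_{k_1,\dots,k_T} \; \prod_{t=1}^{T}
    a^{(1)}_{n_{t-1} n_t} \, b^{(1)}_{n_t k_t} \, a^{(2)}_{k_{t-1} k_t} \, b^{(2)}_{k_t}(o_t)

% Recognition result: the category with the highest likelihood
\hat{w} = \arg\max_{w} P(O \mid w)
```

Here the first two factors come from the first hidden Markov model (word-specific) and the last two from the second hidden Markov model (shared by all categories).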

6 Claims

1. A speech recognizer implemented in a programmed processor and configured to recognize input speech by analyzing the input speech at predetermined time intervals, extracting feature vectors and calculating a likelihood value of a reference pattern model of each category in a plurality of categories to be recognized, comprising:
- receiving means for receiving the input speech;
  
  extracting means for extracting the feature vectors from the input speech;
  
  first probability calculation means for calculating a probability that a first hidden Markov model having an internal state number as an output symbol for each category to be recognized outputs an internal state number and for outputting a series of the internal state numbers;
  
  second probability calculation means for calculating a probability that a second hidden Markov model having transition probabilities of the internal state numbers and feature vector output probability distributions for each of the respective internal state numbers outputs a feature vector and for outputting a series of the feature vectors; and
  
  likelihood probability calculation means for calculating a probability of a reference pattern model of each category to be recognized by using the outputs of the first and second probability calculation means, wherein the reference pattern model corresponding to a highest probability is output as a recognition result of the input speech.

2. The speech recognizer as set forth in claim 1, wherein the likelihood probability calculation means executes the probability calculation by using only the internal state numbers providing the maximum probability at the predetermined time intervals on an input speech feature vector time series time axis and in each state of the first hidden Markov model of each category to be recognized.
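
Claim 2 restricts the calculation to the internal state number with the maximum probability at each time interval and in each state of the first hidden Markov model. A minimal sketch of that pruning step follows. It assumes the forward probabilities for one time interval are held as a nested list indexed by word state n and internal state number k; the function name and data layout are hypothetical and not taken from the patent.

```python
def keep_best_internal_state(slice_nk):
    """Keep, for each word state n, only the most probable internal state k.

    slice_nk[n][k] is an assumed layout for the forward probabilities of one
    word at one time interval. All entries except the per-state maximum are
    set to zero, so later steps propagate only the best internal state.
    Ties are resolved in favour of the lowest k.
    """
    pruned = []
    for row in slice_nk:
        best_k = max(range(len(row)), key=row.__getitem__)
        pruned.append([p if k == best_k else 0.0 for k, p in enumerate(row)])
    return pruned


if __name__ == "__main__":
    # Two word states, two internal state numbers (illustrative values).
    print(keep_best_internal_state([[0.1, 0.3], [0.2, 0.05]]))
```

Applied after every time step, this reduces the work of the next step from all (state, internal state) pairs to one internal state per word state, at the cost of turning the sum over internal states into a maximum approximation.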

3. A speech recognizer comprising:
- receiving means for receiving input speech;
  
  extracting means for extracting at least one feature vector from the input speech;
  
  a first HMM parameter memory for storing, as first HMM parameters of individual words w, transition probability a_mn^(1) (m, n = 1, ..., N_w) from state m to state n, and probability b_nk^(1) (k = 1, ..., K) of outputting output symbol s_k in state n, wherein N_w represents a total number of states in the word w and K represents a total number of internal state numbers;
  
  a second HMM parameter memory for storing, as second HMM parameters common to all of the words, parameters of distribution functions representing transition probability a_jk^(2) (j, k = 1, ..., K) from internal state j to internal state k and output probability b_k^(2)(o_t) of outputting feature vector o_t in internal state k;
  
  a work memory for tentatively storing the output probability and array variables A (w,t,n,k) representing a forward probability when calculating the likelihood value of each word to be recognized with a reference pattern model, where t represents an instant in time; and
  
  recognition processing means implemented in a programmed processor, the recognition processing means including:
  
  first calculation means for calculating the output probability b_k^(2)(o_t) of outputting the feature vector o_t in the internal state k on the basis of the output probability distribution parameters stored in the second HMM parameter memory and storing the output probability as variable B in the work memory, the feature vector o_t corresponding to the at least one feature vector extracted by the extracting means;
  
  clearing means for clearing the array variables A (w,t,n,k) in the work memory for calculating the forward probability;
  
  second calculation means for calculating a contribution to the forward probability when the feature vector o_t is output through transition from state m and internal state j to state n and internal state k, from the parameters stored in the first and second HMM parameter memories and work memory and adding the forward probability to the array variables A (w,t,n,k) representing the forward probability;
  
  means for comparing the forward probability A (w,T,n,k) for each word w stored in the work memory successively to obtain one of the words w having a maximum comparison value, where T represents a total number of time intervals of the input speech; and
  
  outputting means for outputting the one word having the maximum comparison value as a recognition result.

4. The speech recognizer as set forth in claim 3, wherein the work memory stores the array variables A (w,t,n,k) only for times t and (t-1).

5. The speech recognizer as set forth in claim 3, wherein the first and second HMM parameter memories and work memory are defined as distinct memory areas in a main memory.
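
The forward calculation of claims 3 and 4 can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: it assumes each contribution is the product A(w,t-1,m,j) · a_mn^(1) · b_nk^(1) · a_jk^(2) · B[k], that the word starts in its first state, that the last state is the final state, and that the word score is the sum over k at that final state. The function names and the nested-list layout are hypothetical. Only two time slices are kept, as in claim 4.

```python
def forward_likelihood(a1, b1, a2, B):
    """Forward probability of one word under the assumed two-level recursion.

    a1[m][n]: first-HMM transition probability from state m to state n
    b1[n][k]: first-HMM probability of emitting internal state number k in state n
    a2[j][k]: second-HMM transition probability from internal state j to k
    B[t][k] : precomputed output probability b_k^(2)(o_t) for each time t
    """
    N, K = len(a1), len(a2)
    # Assumed initialization: the word starts in state 0 and there is no
    # internal-state transition before the first feature vector.
    prev = [[0.0] * K for _ in range(N)]
    for k in range(K):
        prev[0][k] = b1[0][k] * B[0][k]

    for t in range(1, len(B)):
        cur = [[0.0] * K for _ in range(N)]  # clear the slice for time t
        for m in range(N):
            for j in range(K):
                if prev[m][j] == 0.0:
                    continue  # nothing to propagate from (m, j)
                for n in range(N):
                    for k in range(K):
                        # Contribution of the move (m, j) -> (n, k) emitting o_t.
                        cur[n][k] += (prev[m][j] * a1[m][n] * b1[n][k]
                                      * a2[j][k] * B[t][k])
        prev = cur  # only the slices for t-1 and t are ever held

    # Assumed scoring: sum over internal states at the last word state.
    return sum(prev[N - 1])


def recognize(word_models, a2, B):
    """Return the best-scoring word and all scores.

    word_models maps a word label to its (a1, b1) pair; a2 and B are shared.
    """
    scores = {w: forward_likelihood(a1, b1, a2, B)
              for w, (a1, b1) in word_models.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    a1 = [[0.5, 0.5], [0.0, 1.0]]           # left-to-right, two states
    models = {
        "a": (a1, [[1.0, 0.0], [0.0, 1.0]]),  # emits internal state 0, then 1
        "b": (a1, [[0.0, 1.0], [1.0, 0.0]]),  # emits internal state 1, then 0
    }
    a2 = [[0.5, 0.5], [0.5, 0.5]]
    B = [[0.9, 0.1], [0.1, 0.9]]            # o_1 fits state 0, o_2 fits state 1
    print(recognize(models, a2, B)[0])      # prints "a"
```

Because the second-HMM parameters a2 and the table B are shared by every word, B needs to be computed only once per feature vector, which is the role the claim assigns to the first calculation means.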

6. A word speech recognizer for recognizing words from a speech signal, comprising:
- an input unit for inputting the speech signal;
  
  a feature vector extraction unit connected to the input unit and configured to sample the speech signal, digitize the sampled speech signal, and convert the digitized sampled speech signal into at least one feature vector;
  
  a first Hidden Markov Model (HMM) parameter memory configured to store first HMM parameters of a plurality of words, transition probabilities a_mn for transitioning from state m to state n, wherein m and n are integers and wherein there are k possible states, k being an integer greater than or equal to m and n, the first HMM parameter memory being configured to store probabilities b_rs of outputting a symbol s in a state r, wherein r and s are integers;
  
  a second HMM parameter memory configured to store second HMM parameters common to all of the plurality of words, which correspond to distribution functions representing transition probabilities a_jk from internal state j to internal state k, and which correspond to output probabilities b_k (o_t) of outputting feature vector o_t in the internal state k;
  
  a work memory for temporarily storing the output probabilities, the work memory also temporarily storing a forward probability and array variables associated with the forward probability; and
  
  a processor coupled to the feature vector extraction unit, the first HMM parameter memory, the second HMM parameter memory, and the work memory, the processor comprising:
  
  a calculating unit configured to receive the feature vector o_t from the feature vector extraction unit and to calculate the output probability b_k (o_t) of outputting the feature vector o_t in the internal state b_k based on the second HMM parameters stored in the second HMM parameter memory, the output probability b_k (o_t) being stored by the processor in the work memory;
  
  a clearing unit configured to clear the forward probabilities stored in the work memory;
  
  a forward probability calculating unit configured to calculate the forward probability for each of the plurality of words when the feature vector o_t is output through transition from the state m and the internal state j to the state n and the internal state k, the forward probability being calculated based on the first HMM parameters stored in the first HMM parameter memory and the second HMM parameters stored in the second HMM parameter memory; and
  
  a determining unit for determining a maximum probability of the forward probabilities calculated for each of the plurality of words, wherein the corresponding word having the maximum probability is output as a recognized word of the speech signal.
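
The calculating unit of claim 6 evaluates an output probability for each internal state from stored distribution parameters. The claims do not name the distribution family, so the sketch below assumes diagonal-covariance Gaussian densities purely for illustration; the function name and parameter layout are hypothetical.

```python
import math


def gaussian_output_probs(o_t, means, variances):
    """Evaluate an assumed diagonal Gaussian density for every internal state.

    o_t       : feature vector for one time interval (list of floats)
    means     : means[k][d], mean of dimension d for internal state k
    variances : variances[k][d], variance of dimension d for internal state k
    Returns a list B with B[k] = density of o_t under internal state k.
    """
    probs = []
    for mu, var in zip(means, variances):
        log_p = 0.0
        for x, m, v in zip(o_t, mu, var):
            # Log density of a 1-D Gaussian, summed over dimensions.
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        probs.append(math.exp(log_p))
    return probs


if __name__ == "__main__":
    # One-dimensional features, two internal states (illustrative values).
    B = gaussian_output_probs([0.0], means=[[0.0], [3.0]], variances=[[1.0], [1.0]])
    print(B[0] > B[1])  # the vector lies at the mean of internal state 0
```

The resulting list plays the role of the stored output probabilities that the forward probability calculating unit reads from the work memory. For long utterances a practical system would usually keep these values in the log domain to avoid underflow.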

Current Assignee: NEC Corporation
Original Assignee: NEC Corporation
Inventors: Iso, Ken-Ichi
Primary Examiner(s): Zele, Krista
Assistant Examiner(s): Weaver, Scott Louis
Application Number: US08/483,321
Time in Patent Office: 1,035 Days
Field of Search: 395/2, 395/2.1, 395/2.6, 395/2.64, 395/2.65
US Class Current: 704/256
CPC Class Codes: G10L 15/142 Hidden Markov Models [HMMs]
