Training of homoscedastic hidden Markov models for automatic speech recognition

US 5,473,728 A
Filed: 02/24/1993
Issued: 12/05/1995
Est. Priority Date: 02/24/1993
Status: Expired due to Fees

Abstract

A method for training a speech recognizer in a speech recognition system is described. The method of the present invention comprises the steps of providing a data base containing acoustic speech units, generating a homoscedastic hidden Markov model from the acoustic speech units in the data base, and loading the homoscedastic hidden Markov model into the speech recognizer. The hidden Markov model loaded into the speech recognizer has a single covariance matrix which represents the tied covariance matrix of every Gaussian probability density function (PDF) for every state of every hidden Markov model structure in the homoscedastic hidden Markov model.
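
As an illustration of what a single tied covariance matrix means in practice, the following is a minimal sketch of a pooled within-class covariance estimate from labeled observations. It is not taken from the patent: the function name `pooled_covariance`, the toy data, and the choice to divide by the total number of observations are all assumptions made for this example, and the patent's own estimator is iterative rather than a one-pass computation.

```python
from collections import defaultdict


def pooled_covariance(observations, labels):
    """Estimate per-label means and ONE covariance matrix shared by all labels.

    Hypothetical helper for illustration only; not the patent's procedure.
    observations: list of equal-length feature vectors (lists of floats)
    labels: speech-unit label for each observation
    """
    dim = len(observations[0])

    # Group the pooled observations by their speech-unit label.
    groups = defaultdict(list)
    for x, lab in zip(observations, labels):
        groups[lab].append(x)

    # One mean vector per label.
    means = {
        lab: [sum(x[d] for x in xs) / len(xs) for d in range(dim)]
        for lab, xs in groups.items()
    }

    # Accumulate outer products of deviations from each observation's own
    # label mean, so every label contributes to the same matrix.
    cov = [[0.0] * dim for _ in range(dim)]
    for x, lab in zip(observations, labels):
        dev = [x[d] - means[lab][d] for d in range(dim)]
        for r in range(dim):
            for c in range(dim):
                cov[r][c] += dev[r] * dev[c]

    n = len(observations)  # assumption: normalize by the total sample count
    return means, [[v / n for v in row] for row in cov]


# Toy data: two speech units, two observations each (made-up numbers).
obs = [[0.0, 0.0], [2.0, 0.0], [10.0, 4.0], [12.0, 8.0]]
labs = ["a", "a", "b", "b"]
unit_means, sigma = pooled_covariance(obs, labs)
```

Because the deviations are taken from each unit's own mean, the large separation between the two units does not inflate the shared matrix; only the within-unit spread does.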

179 Citations

10 Claims

1. A method for training a speech recognizer in a speech recognition system, said method comprising the steps of:
- providing a data base containing a plurality of acoustic speech units;
  
  generating a homoscedastic hidden Markov model (HMM) from said plurality of acoustic speech units in said data base;
  
  said generating step comprises forming a set of pooled training data from said plurality of acoustic speech units and estimating a single global covariance matrix using said pooled training data set, said single global covariance matrix representing a tied covariance matrix for every Gaussian probability density function (PDF) for every state of every hidden Markov model structure in said homoscedastic hidden Markov model; and
  
  loading said homoscedastic hidden Markov model into the speech recognizer.
- Dependent claims:
  - 2. The method of claim 1 wherein the training data set forming step comprises:
    - forming the training data set from all possible ones of said plurality of speech units in said data base, said training data set containing a number of observations for each of said speech units; and
      
      collecting the training data for said set so that each said observation retains a true speech unit label.
  - 3. The method of claim 2 further comprising:
    - forming a HMM for each speech unit.
  - 4. The method of claim 3 further comprising:
    - conducting a plurality of training iterations wherein a forward state likelihood for a Markov chain state, a backward state likelihood for said Markov chain state and a component state likelihood for said Markov chain state and a mixture Gaussian PDF component are computed recursively for a given measurement vector, training sequence and speech unit.
  - 5. The method of claim 4 wherein said estimating step comprises:
    - estimating said single covariance matrix from said likelihoods; and
      
      storing said estimated single covariance matrix in a computer used to perform said training iterations.
  - 6. The method of claim 5 wherein said conducting step includes:
    - updating parameter estimates including said estimated single covariance matrix using said likelihoods in the following equations:
      
      a. Initial state probability: ##EQU30##
      
      b. State transition probability: ##EQU31##
      
      c. Within class mixing proportions: ##EQU32##
      
      d. Component means: ##EQU33##
      
      e. Covariance matrix: ##EQU34##
      
      where
      
      - θ_m(i) = probability of speech unit m starting in state i,
      - F_opm = initial forward component state likelihood,
      - B_opm = initial backward component state likelihood,
      - a_m(i,j) = probability of speech unit m moving from state i to state j,
      - F_kpm = forward component state likelihood,
      - g_jm = Gaussian mixture PDF,
      - X_kpm = a measurement vector,
      - λ_jm = a parameter set data for HMM structure m,
      - B_kpm(i) = backward component state likelihood,
      - π_cim = mixture component probability,
      - C_kpm(i,c) = component state likelihood,
      - μ_cim = mean vector of the component Gaussian PDF,
      - Σ = covariance matrix,
      - K_pm = measurement vector,
      - G_im = Gaussian component in mixture PDF associated with state i of HMM structure m,
      - T = entire set of training data,
      - T_m = training set for speech unit m,
      - S_m = states for HMM structure,
      - M = number of acoustic/phonetic speech units m,
      - k = measurement vector,
      - p = training sequence,
      - i = Markov state,
      - c = mixture Gaussian PDF component, and
      
      determining an updated estimated single covariance matrix.
  - 7. The method of claim 6 further comprising:
    - continuing said training iterations until a stable solution for said covariance matrix is found; and
      
      said updating step including updating and storing said covariance matrix after each training iteration.
  - 8. The method of claim 1 wherein said data base providing step comprises:
    - providing a speech preprocessor;
      
      transforming a raw input speech signal inputted into said speech preprocessor into said plurality of acoustic speech units; and
      
      storing said plurality of acoustic speech units in a storage device.
  - 9. The method of claim 8 wherein said homoscedastic hidden Markov model generating step further comprises:
    - transferring information concerning said plurality of acoustic speech units stored in said storage device to a training computer programmed to form said homoscedastic hidden Markov model; and
      
      mathematically converting said information into a series of hidden Markov structures from which said homoscedastic hidden Markov model having a single covariance matrix is formed.
  - 10. The method of claim 9 wherein said loading step comprises:
    - storing said homoscedastic hidden Markov model with said single covariance matrix in a computer forming said speech recognizer.
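
The covariance update recited in claims 5 to 7 can be illustrated with the textbook tied-covariance M-step for Gaussian components. This is only a sketch under stated assumptions: the patent's own update (##EQU34##) is not reproduced in this record, so the formula below is the standard form and may differ from it in detail. The function name `tied_m_step` and the `resp` weights are hypothetical; in an HMM setting those weights would come from the component state likelihoods computed by the forward and backward recursions of claim 4, which this sketch takes as given.

```python
def tied_m_step(observations, resp):
    """One M-step for Gaussian components that share a single covariance.

    Hypothetical helper for illustration; standard textbook update, not
    necessarily the equation used in the patent.
    observations: list of d-dimensional vectors (lists of floats)
    resp[k][c]: posterior weight of component c for observation k
    """
    n = len(observations)
    dim = len(observations[0])
    n_comp = len(resp[0])

    # Total posterior weight received by each component.
    totals = [sum(resp[k][c] for k in range(n)) for c in range(n_comp)]

    # Weighted mean of each component.
    means = [
        [sum(resp[k][c] * observations[k][d] for k in range(n)) / totals[c]
         for d in range(dim)]
        for c in range(n_comp)
    ]

    # Single covariance: weighted outer products summed over ALL components,
    # then divided by the total weight (not per component).
    cov = [[0.0] * dim for _ in range(dim)]
    for k, x in enumerate(observations):
        for c in range(n_comp):
            dev = [x[d] - means[c][d] for d in range(dim)]
            for r in range(dim):
                for s in range(dim):
                    cov[r][s] += resp[k][c] * dev[r] * dev[s]

    total_weight = sum(totals)
    return means, [[v / total_weight for v in row] for row in cov]


# Toy check with hard (0/1) weights and made-up data.
toy_obs = [[0.0, 0.0], [2.0, 0.0], [10.0, 4.0], [12.0, 8.0]]
toy_resp = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
comp_means, tied_sigma = tied_m_step(toy_obs, toy_resp)
```

In an iterative trainer this step would be repeated, with the weights recomputed from the updated parameters each time, until the shared matrix stops changing by more than a chosen tolerance, which corresponds to the "stable solution" condition of claim 7.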

Current Assignee: The United States of America As Represented By The Secretary of Agriculture
Original Assignee: The United States of America as Represented by the Secretary of the Navy
Inventors: Rosseau, Michael L.; Streit, Roy L.; Luginbuhl, Tod E.
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): Doerrler, Michelle
Application Number: US08/022,218
Time in Patent Office: 1,014 Days
Field of Search: 395/2.4, 395/2.45, 395/2.52-2.54, 395/2.64-2.66, 381/41-43
US Class Current: 704/243
CPC Class Codes: G10L 15/144 Training of HMMs
