Signal bias removal for robust telephone speech recognition

US 5,590,242 A
Filed: 03/24/1994
Issued: 12/31/1996
Est. Priority Date: 03/24/1994
Status: Expired due to Term

Abstract

A signal bias removal (SBR) method based on the maximum likelihood estimation of the bias for minimizing undesirable effects in speech recognition systems is described. The technique is readily applicable in various architectures including discrete (vector-quantization based), semicontinuous and continuous-density Hidden Markov Model (HMM) systems. For example, the SBR method can be integrated into a discrete density HMM and applied to telephone speech recognition where the contamination due to extraneous signal components is unknown. To enable real-time implementation, a sequential method for the estimation of the bias (SSBR) is disclosed.
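
The abstract does not give the estimator itself. As a hedged illustration only, suppose each observed frame is $y_t = x_t + b$ with a constant unknown bias $b$, and each clean frame $x_t$ is modeled as Gaussian with identity covariance around its nearest centroid $\mu_i$. These modeling choices and all symbols below are assumptions, not text from the patent. Under them, and with the frame-to-centroid assignments $i^*(t)$ held fixed, the maximum likelihood bias reduces to the mean residual between the frames and their matched centroids:

```latex
\begin{aligned}
i^*(t) &= \arg\min_i \lVert y_t - b - \mu_i \rVert^2, \\
\hat{b} &= \arg\max_b \sum_{t=1}^{T} \log \mathcal{N}\!\left(y_t - b;\ \mu_{i^*(t)},\ I\right)
        = \frac{1}{T} \sum_{t=1}^{T} \left(y_t - \mu_{i^*(t)}\right).
\end{aligned}
```

Because the assignments depend on $b$, the estimate and the subtraction have to be alternated, which is consistent with the repeated estimate-and-subtract steps recited in the claims.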

13 Claims

1. A method for reducing the effect of an unknown signal bias in an input speech signal for use by a speech recognition system, comprising:
- (1) training the speech recognition system by using the following steps;
  
  (a) generating a set of centroids based on a training speech signal;
  
  (b) computing an estimate of the bias for the training speech signal based on maximizing a likelihood function;
  
  (c) subtracting the estimate of the bias from the training speech signal to obtain a tentative training speech value;
  
  (d) repeating steps (b) and (c), wherein each subsequent computed estimate of the bias is based on the previous tentative training speech value to arrive at a reduced bias training speech signal value;
  
  (e) recomputing the centroids based on the reduced bias training speech signal to generate a new set of centroids;
  
  (f) repeating steps (b) to (e) to compute a processed reduced bias speech signal and to form an enhanced set of centroids;
  
  (g) utilizing the enhanced set of centroids and the processed reduced bias speech signal as training input for a speech recognizer;
  
  (2) testing an input speech signal to minimize the unknown bias by using the following steps;
  
  (h) utilizing the enhanced set of centroids to compute an estimate of the bias for each utterance of the speech signal based on maximizing a likelihood function;
  
  (i) subtracting the estimate of the bias from the speech signal to obtain a tentative speech value;
  
  (j) repeating steps (h) and (i), wherein each subsequent computed estimate of the bias is based on the previous tentative speech value, resulting in a reduced bias speech signal value; and
  
  (3) utilizing the reduced bias speech signal as input to a speech recognizer.

2. The method of claim 1, wherein the speech recognition system utilizes a Hidden Markov Model speech recognizer.

3. A method for minimizing the effect of an unknown signal bias on an input speech signal during the testing phase of a speech recognition system, comprising:
- (a) computing an estimate of the bias for each utterance of the speech signal based on maximizing a likelihood function by initially utilizing a set of centroids generated by a training model;
  
  (b) subtracting the estimate of the bias from the input speech signal to obtain a tentative speech value;
  
  (c) repeating steps (a) and (b) a predetermined number of times, wherein each subsequent computed estimate of the bias is based on the previous tentative speech value, resulting in a reduced bias speech signal value; and
  
  (d) utilizing the reduced bias speech signal value as input to a speech recognizer.

4. The method of claim 3, wherein a vector quantization method is utilized to generate the centroids of step (a).
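
The testing-phase steps (a) to (d) of claim 3 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes Euclidean nearest-centroid matching and a mean-residual bias estimate (one simple form a maximum likelihood estimate can take under an identity-covariance Gaussian assumption). The names `nearest_centroids`, `remove_bias`, and `n_iter` are hypothetical.

```python
import numpy as np

def nearest_centroids(frames, centroids):
    """Return, for each frame, its closest centroid (Euclidean distance)."""
    # dists[t, i] = squared distance between frame t and centroid i
    dists = ((frames[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids[dists.argmin(axis=1)]

def remove_bias(frames, centroids, n_iter=3):
    """Iteratively estimate and subtract an additive bias from one utterance.

    Assumption: the bias estimate is the mean residual between the frames
    and their matched centroids.
    """
    cleaned = np.asarray(frames, dtype=float).copy()
    total_bias = np.zeros(cleaned.shape[1])
    for _ in range(n_iter):                      # step (c): predetermined count
        matched = nearest_centroids(cleaned, centroids)
        bias = (cleaned - matched).mean(axis=0)  # step (a): bias estimate
        cleaned -= bias                          # step (b): tentative speech value
        total_bias += bias
    return cleaned, total_bias                   # step (d): feed `cleaned` onward

# Toy usage: a two-entry codebook and an utterance shifted by a constant offset.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
utterance = np.repeat(codebook, 5, axis=0) + np.array([1.5, -2.0])
cleaned, estimated_bias = remove_bias(utterance, codebook)
print(estimated_bias)
```

The loop count is fixed in advance, mirroring the "predetermined number of times" wording of step (c); a convergence test on the size of `bias` would be an alternative design, but the claim does not recite one.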

5. A method for sequentially reducing the effect of an unknown signal bias of an input speech signal for a speech recognition system, comprising:
- (1) training the speech recognition system by using the following steps;
  
  (a) generating a set of centroids based on a training speech signal;
  
  (b) analyzing the speech signal on a frame by frame basis or in a batch mode;
  
  (c) computing an estimate of the bias for the training speech signal based on maximizing a likelihood function;
  
  (d) subtracting the estimate of the bias from the training speech signal to obtain a tentative training speech value;
  
  (e) repeating steps (c) and (d), wherein each subsequent computed estimate of the bias is based on the previous tentative training speech value to arrive at a reduced bias training speech signal value;
  
  (f) recomputing the centroids based on the reduced bias training speech signal value to generate a new set of centroids;
  
  (g) repeating steps (c) to (f) to compute a processed reduced bias speech signal and to generate an enhanced set of centroids;
  
  (h) utilizing the enhanced set of centroids and the processed reduced bias speech signal as training input for a speech recognizer;
  
  (2) testing an input speech signal to minimize the unknown bias by using the following steps;
  
  (i) analyzing an utterance on a frame-by-frame basis;
  
  (j) computing a sequential bias estimate for each frame of the speech signal based on maximizing a likelihood function;
  
  (k) subtracting the sequential bias estimate from the input speech signal at every frame to obtain a tentative speech value;
  
  (l) repeating steps (j) and (k), wherein each subsequent computed estimate of the bias is based on the previous tentative speech value, resulting in a reduced bias speech signal value; and
  
  (3) utilizing the reduced bias speech signal as input to a speech recognizer.

6. The method of claim 5, wherein the speech recognition system utilizes a Hidden Markov Model speech recognizer.

9. The method of claim 5, wherein step (k) includes setting a weighting coefficient.

7. A method for sequentially reducing the effect of an unknown signal bias on an input speech signal during the testing phase of a speech recognition system, comprising:
- (a) analyzing an utterance on a frame-by-frame basis;
  
  (b) computing a sequential bias estimate for each frame of the speech signal based on maximizing a likelihood function by utilizing a set of centroids generated by a training model;
  
  (c) subtracting the sequential bias estimate from the input speech signal at every frame to obtain a tentative speech value;
  
  (d) repeating steps (b) and (c), wherein each subsequent computed estimate of the bias is based on the previous tentative speech value, resulting in a reduced bias speech signal value; and
  
  (e) utilizing the reduced bias speech signal as input to a speech recognizer.

8. The method of claim 7, wherein a vector quantization method is utilized to compute the centroids of step (b).

10. The method of claim 7, wherein step (c) includes setting a weighting coefficient.
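
A frame-by-frame variant in the spirit of claims 7 and 10 might look like the sketch below. The claims say only that a weighting coefficient is set; the exponential-smoothing update used here, the Euclidean centroid matching, and the names `sequential_remove_bias` and `alpha` are assumptions made for illustration.

```python
import numpy as np

def sequential_remove_bias(frames, centroids, alpha=0.1):
    """Update a running bias estimate at every frame and subtract it.

    Assumption: the running estimate is an exponentially weighted average
    of per-frame residuals, with `alpha` as the weighting coefficient.
    """
    frames = np.asarray(frames, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    bias = np.zeros(frames.shape[1])
    cleaned = np.empty_like(frames)
    for t, frame in enumerate(frames):            # step (a): one frame at a time
        tentative = frame - bias                  # uses the previous estimate
        dists = ((centroids - tentative) ** 2).sum(axis=1)
        nearest = centroids[dists.argmin()]
        # step (b): blend the old estimate with this frame's residual
        bias = (1.0 - alpha) * bias + alpha * (frame - nearest)
        cleaned[t] = frame - bias                 # step (c): subtract at every frame
    return cleaned, bias

# Toy usage: frames alternate between two codebook entries, plus a fixed offset.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
stream = np.tile(codebook, (100, 1)) + np.array([1.5, -2.0])
cleaned, running_bias = sequential_remove_bias(stream, codebook, alpha=0.1)
print(running_bias)
```

A larger `alpha` tracks a changing bias faster but makes the estimate noisier; a smaller one is smoother but leaves more residual bias in the first frames of an utterance.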

11. A method for generating an enhanced set of centroids representative of an input speech signal for use by a speech recognition system, utilizing an initial set of centroids based on a training speech signal, comprising:
- (a) computing an estimate of the bias for the training speech signal based on maximizing a likelihood function;
  
  (b) subtracting the estimate of the bias from the training speech signal to obtain a tentative training speech value;
  
  (c) repeating steps (a) and (b), wherein each subsequent computed estimate of the bias is based on the previous tentative training speech value to arrive at a reduced bias training speech signal value;
  
  (d) recomputing the centroids based on the reduced bias training speech signal to generate a new set of centroids; and
  
  (e) repeating steps (a) to (d) to compute a processed reduced bias speech signal to form an enhanced set of centroids.

12. The method of claim 11, further comprising utilizing the enhanced set of centroids and the processed reduced bias speech signal to train a speech recognizer.

13. The method of claim 11, further comprising:
- (a) utilizing the enhanced centroids to compute an estimate of the bias for each utterance of an input speech signal based on maximizing a likelihood function;
  
  (b) subtracting the estimate of the bias from the speech signal to obtain a tentative speech value;
  
  (c) repeating steps (a) and (b), wherein each subsequent computed estimate of the bias is based on the previous tentative speech value, to result in a reduced bias speech signal value; and
  
  (d) utilizing the reduced bias speech signal value as input to a speech recognizer.
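
The training loop of claim 11, steps (a) to (e), can be sketched as an alternation between bias removal and codebook re-estimation. This is an illustration under stated assumptions rather than the patent's procedure: the bias is estimated per training utterance as a mean residual, the centroids are recomputed as cluster means (a k-means style update), and `enhance_centroids`, `n_outer`, and `n_inner` are hypothetical names.

```python
import numpy as np

def assign(frames, centroids):
    """Index of the nearest centroid (Euclidean) for every frame."""
    dists = ((frames[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def enhance_centroids(utterances, centroids, n_outer=5, n_inner=3):
    """Alternate per-utterance bias removal with centroid re-estimation.

    utterances: list of (T_k, D) arrays, each assumed to carry its own
    constant additive bias.
    """
    centroids = np.asarray(centroids, dtype=float).copy()
    cleaned = [np.asarray(u, dtype=float).copy() for u in utterances]
    for _outer in range(n_outer):                     # step (e)
        for k in range(len(cleaned)):
            utt = cleaned[k]
            for _inner in range(n_inner):             # step (c)
                matched = centroids[assign(utt, centroids)]
                bias = (utt - matched).mean(axis=0)   # step (a)
                utt = utt - bias                      # step (b)
            cleaned[k] = utt
        pooled = np.vstack(cleaned)
        labels = assign(pooled, centroids)
        for i in range(len(centroids)):               # step (d)
            members = pooled[labels == i]
            if len(members):                          # keep empty clusters as-is
                centroids[i] = members.mean(axis=0)
    return centroids, cleaned

# Toy usage: the same clean frames recorded through two different offsets.
clean = np.repeat(np.array([[0.0, 0.0], [10.0, 10.0]]), 3, axis=0)
training = [clean + np.array([1.0, -1.0]), clean + np.array([-2.0, 0.5])]
initial = np.array([[1.0, 1.0], [9.0, 12.0]])
enhanced, reduced_bias = enhance_centroids(training, initial)
print(enhanced)
```

In this sketch a shift common to all utterances cannot be told apart from a shift of the codebook, so the enhanced centroids are only determined up to such a shared offset; what the loop removes is the disagreement between utterances.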

Current Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Original Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Inventors: Rahim, Mazin G.; Juang, Biing-Hwang
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): CHOWDHURY, INDRINAL
Application Number: US08/217,035
Time in Patent Office: 1,013 Days
Field of Search: 395/2.31, 395/2.42, 395/2.45, 395/2.52, 395/2.54, 395/2.6
US Class Current: 704/245
CPC Class Codes:
G10L 15/02   Feature extraction for spee...
G10L 15/142   Hidden Markov Models [HMMs]
G10L 15/20   Speech recognition techniqu...
G10L 21/0232   Processing in the frequency...
