Adaptive decision directed speech recognition bias equalization method and apparatus

US 5,812,972 A
Filed: 12/30/1994
Issued: 09/22/1998
Est. Priority Date: 12/30/1994
Status: Expired due to Term

Abstract

The present invention provides a speech recognizer that creates and updates the equalization vector as input speech is provided to the recognizer. The present invention includes a speech analyzer which transforms an input speech signal into a series of feature vectors, or an observation sequence. Each feature vector is then provided to a speech recognizer which modifies the feature vector by subtracting a previously determined equalization vector therefrom. The recognizer then performs segmentation and matches the modified feature vector to a stored model vector which is defined as the segmentation vector. The recognizer then, from time to time, determines a new equalization vector, the new equalization vector being defined based on the difference between one or more input feature vectors and their respective segmentation vectors. The new equalization vector may then be used either for performing another segmentation iteration on the same observation sequence or for performing segmentation on subsequent feature vectors.
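The loop described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a nearest-neighbor match (squared Euclidean distance) in place of the recognizer's segmentation, and the names `nearest_model_vector` and `equalize_and_update` are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]


def nearest_model_vector(vec: Vector, models: List[Vector]) -> Vector:
    """Return the stored model vector closest to vec (squared Euclidean).

    Assumption: a nearest-neighbor match stands in for the recognizer's
    segmentation step.
    """
    return min(models, key=lambda m: sum((v - c) ** 2 for v, c in zip(vec, m)))


def equalize_and_update(
    features: List[Vector], models: List[Vector], eq: Vector
) -> Tuple[List[Vector], Vector]:
    """Equalize each feature vector, segment it, then re-estimate eq."""
    segmentation = []
    for x in features:
        # Subtract the previously determined equalization vector.
        modified = [xi - bi for xi, bi in zip(x, eq)]
        segmentation.append(nearest_model_vector(modified, models))
    # New equalization vector: mean difference between the input feature
    # vectors and their respective segmentation vectors.
    n = len(features)
    new_eq = [
        sum(x[d] - s[d] for x, s in zip(features, segmentation)) / n
        for d in range(len(eq))
    ]
    return segmentation, new_eq


if __name__ == "__main__":
    models = [[0.0, 0.0], [4.0, 4.0]]
    # Inputs are the model vectors shifted by a constant bias of [1, 1].
    features = [[1.0, 1.0], [5.0, 5.0], [1.0, 1.0]]
    segmentation, eq = equalize_and_update(features, models, [0.0, 0.0])
    print(segmentation, eq)
```

The returned vector can be fed back in, either for another pass over the same observation sequence or for subsequent feature vectors, mirroring the two uses named in the abstract.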

18 Claims

1. An apparatus for recognizing in real time speech signals produced under changing ambient conditions by a plurality of speakers, the apparatus comprising:
- a speech analyzer operable to generate a plurality of feature vectors from an input speech signal;
  
  a memory device containing speech model vectors; and
  
  a speech recognizer operably connected to receive speech model vectors from the memory device, said speech recognizer operable to:
  
  a) receive an observation sequence comprising a plurality of feature vectors from the speech analyzer;
  
  b) modify at least one feature vector using an equalization vector determined in an ongoing manner;
  
  c) generate a segmentation vector corresponding to the modified feature vector using the speech model vectors;
  
  d) generate a subsequent equalization vector based upon the difference between the segmentation vector and the corresponding feature vectors; and
  
  e) remove signal bias including bias caused by ambient noise.
  - 2. The apparatus of claim 1 wherein the recognizer is further operable to:
    - perform the operations of b) and c) for the plurality of feature vectors before performing the operation of d), and wherein the recognizer is further operable to generate a subsequent equalization vector based upon the weighted average difference between the plurality of feature vectors and the corresponding plurality of segmentation vectors.
  - 3. The apparatus of claim 2 wherein the recognizer is further operable to:
    - e) modify at least one feature vector using the subsequent equalization vector; and
      
      f) generate a subsequent segmentation vector corresponding to the modified feature vector using the speech model vectors.
  - 4. The apparatus of claim 3 wherein the recognizer comprises a hidden Markov model speech recognizer.
  - 5. The apparatus of claim 1 wherein the recognizer comprises a hidden Markov model speech recognizer.
  - 6. The apparatus of claim 1 wherein the recognizer is further operable to generate a subsequent equalization vector based upon the vector sum of the equalization vector and the difference between the feature vector and the corresponding segmentation vector, said difference being adjusted by a scaling factor.
  - 7. The apparatus of claim 1 wherein the recognizer is further operable to generate a most likely state sequence corresponding to the observation sequence.
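
The two update rules in claims 2 and 6 can be sketched as follows. This is an illustrative reading, not the patent's implementation. For claim 6 it assumes the "feature vector" in the difference is the already equalized one, so the update behaves like exponential smoothing; the function names, the `weights` argument, and the `scale` parameter are hypothetical.

```python
from typing import List

Vector = List[float]


def weighted_average_difference(
    features: List[Vector], segmentation: List[Vector], weights: List[float]
) -> Vector:
    """Claim 2 style: weighted average of (feature - segmentation vector)."""
    total = sum(weights)
    dim = len(features[0])
    return [
        sum(w * (x[d] - s[d]) for w, x, s in zip(weights, features, segmentation))
        / total
        for d in range(dim)
    ]


def scaled_update(
    eq: Vector, feature: Vector, segmentation_vector: Vector, scale: float
) -> Vector:
    """Claim 6 style: eq plus a scaled difference.

    Assumption: the difference uses the equalized feature (feature - eq),
    giving new_eq = eq + scale * ((feature - eq) - segmentation_vector).
    """
    return [
        b + scale * ((x - b) - s)
        for b, x, s in zip(eq, feature, segmentation_vector)
    ]


if __name__ == "__main__":
    eq = [0.0, 0.0]
    # Repeatedly observing a feature that sits [1, 1] away from its model
    # vector moves eq toward [1, 1].
    for _ in range(2):
        eq = scaled_update(eq, [1.0, 1.0], [0.0, 0.0], scale=0.5)
    print(eq)
```

Under this reading, a small `scale` makes the equalization vector track slow changes in ambient conditions while damping the effect of any single mis-segmented frame.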

8. A method of processing input speech signals produced under changing ambient conditions by a plurality of speakers comprising:
- a) generating a plurality of feature vectors from an input speech signal;
  
  b) providing at least one feature vector to a speech recognizer;
  
  c) employing the speech recognizer to modify at least one feature vector using an equalization vector determined in an ongoing manner;
  
  d) employing dynamic programming to determine at least one state of a most likely state sequence based on at least one modified feature vector;
  
  e) employing the speech recognizer to generate at least one segmentation vector from at least one modified feature vector using a plurality of speech model vectors; and
  
  f) generating a subsequent equalization vector based upon the difference between at least one segmentation vector and at least one corresponding feature vector.
  - 9. The method of claim 8 wherein step d) further comprises determining at least one state based on a spectral similarity between at least one modified feature vector and at least one speech model vector.
  - 10. The method of claim 8 further comprising the step of repeating steps b), c) and e) for a plurality of feature vectors before executing step f), and wherein step f) further comprises generating a subsequent equalization vector based upon the average difference between the plurality of feature vectors and the corresponding plurality of segmentation vectors.
  - 11. The method of claim 10 further comprising the steps of:
    - g) employing the speech recognizer to modify the plurality of feature vectors using the subsequent equalization vector; and
      
      h) employing dynamic programming to determine at least one state of a subsequent most likely state sequence based on at least one modified feature vector.
  - 12. The method of claim 8 wherein the speech recognizer comprises a hidden Markov model speech recognizer.
  - 13. The method of claim 8 wherein step d) further comprises generating a subsequent equalization vector based upon the vector sum of the equalization vector and the difference between the feature vector and the segmentation vector, said difference being adjusted by a scaling factor.
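
The method of claims 8, 10 and 11 can be sketched with a small Viterbi decoder. This is a sketch under stated assumptions, not the patent's recognizer: each state is represented by a single mean vector, the emission score is a negative half squared distance (standing in for a unit-variance Gaussian log-likelihood), and `viterbi` and `segment_and_reestimate` are hypothetical names.

```python
from typing import List, Tuple

Vector = List[float]


def viterbi(
    features: List[Vector],
    means: List[Vector],
    log_trans: List[List[float]],
    log_init: List[float],
) -> List[int]:
    """Most likely state sequence found by dynamic programming."""

    def score(x: Vector, m: Vector) -> float:
        # Assumed emission score: unit-variance Gaussian up to a constant.
        return -0.5 * sum((a - b) ** 2 for a, b in zip(x, m))

    n_states = len(means)
    delta = [log_init[j] + score(features[0], means[j]) for j in range(n_states)]
    backpointers = []
    for x in features[1:]:
        prev, delta, pointers = delta, [], []
        for j in range(n_states):
            best = max(range(n_states), key=lambda i: prev[i] + log_trans[i][j])
            pointers.append(best)
            delta.append(prev[best] + log_trans[best][j] + score(x, means[j]))
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda j: delta[j])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def segment_and_reestimate(
    features: List[Vector],
    means: List[Vector],
    log_trans: List[List[float]],
    log_init: List[float],
    eq: Vector,
    passes: int = 2,
) -> Tuple[List[int], Vector]:
    """Equalize, decode, then average the differences; repeat `passes` times."""
    path: List[int] = []
    for _ in range(passes):
        modified = [[x - b for x, b in zip(f, eq)] for f in features]
        path = viterbi(modified, means, log_trans, log_init)
        # Segmentation vector for each frame: the mean of its decoded state.
        diffs = [[x - m for x, m in zip(f, means[s])] for f, s in zip(features, path)]
        eq = [sum(col) / len(diffs) for col in zip(*diffs)]
    return path, eq
```

The second pass corresponds to steps g) and h) of claim 11: the same observation sequence is equalized with the subsequent equalization vector and decoded again.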

14. An apparatus for providing voice control of a system, the apparatus comprising:
- a speech input device operable to receive input speech from a plurality of users where said input speech is produced under changing ambient conditions and generate speech signals;
  
  a speech analyzer connected to receive speech signals from the speech input device and generate feature vectors representative of the speech signals;
  
  a speech recognizer connected to receive feature vectors from the speech analyzer, said speech recognizer operable to modify each feature vector using an equalization vector determined in an ongoing manner;
  
  generate a most likely state sequence corresponding to the modified feature vectors;
  
  generate a segmentation vector for at least one modified feature vector;
  
  generate a subsequent equalization vector based upon the difference between one or more segmentation vectors and their respective feature vectors; and
  
  a data extraction device operable to receive segmentation vectors from the speech recognizer and produce control data therefrom, said control data being usable by a controller in the system.
  - 15. The apparatus of claim 14 further comprising a controller operable to receive the control data from the data extraction device and further operable to control the system based upon the input speech.
  - 16. The apparatus of claim 15 wherein the controller is connected to a plurality of telephone extensions and the controller is operable to connect the speech input device to a voice-selected telephone extension.
  - 17. The apparatus of claim 14 wherein the speech input device includes a telephone.
  - 18. The apparatus of claim 14 further comprising a plurality of speech input devices, each speech input device operably connected to provide input speech signals to the speech analyzer.
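
As a rough illustration of the data extraction step in claims 14 to 16, the sketch below turns recognizer output into control data for a telephone controller. It assumes the recognizer output has already been decoded into word labels, which the claims do not specify; `ControlData`, `extract_control_data`, the `directory` mapping, and the extension number are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ControlData:
    """Hypothetical control record consumed by the system controller."""

    action: str
    extension: str


def extract_control_data(
    recognized_words: List[str], directory: Dict[str, str]
) -> Optional[ControlData]:
    """Return a connect request for the first word found in the directory.

    Assumption: the recognizer output is available as word labels and the
    directory maps lowercase names to telephone extensions.
    """
    for word in recognized_words:
        extension = directory.get(word.lower())
        if extension is not None:
            return ControlData(action="connect", extension=extension)
    return None


if __name__ == "__main__":
    directory = {"smith": "4021"}  # made-up extension for illustration
    print(extract_control_data(["call", "Smith"], directory))
```

A controller in the sense of claim 16 would then use the returned extension to connect the speech input device to the voice-selected telephone extension.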

Current Assignee: Alcatel-Lucent USA, Inc. (Nokia Corporation)
Original Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Inventors: Wilpon, Jay Gordon; Mansour, David; Juang, Biing-Hwang
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Smits, Talivaldis Ivars
Application Number: US08/366,657
Time in Patent Office: 1,362 Days
Field of Search: 395/2.42, 395/2.43, 395/2.6, 395/2.84, 381/43, 704/233, 704/234, 704/251, 704/275
US Class Current: 704/234
CPC Class Codes:
G10L 15/02   Feature extraction for spee...
G10L 15/04   Segmentation; Word boundary...
G10L 15/144   Training of HMMs
G10L 2015/0635   updating or merging of old ...
