Adaptive decision directed speech recognition bias equalization method and apparatus
Abstract
The present invention provides a speech recognizer that creates and updates an equalization vector as input speech is provided to the recognizer. The present invention includes a speech analyzer which transforms an input speech signal into a series of feature vectors, or observation sequence. Each feature vector is then provided to a speech recognizer, which modifies the feature vector by subtracting a previously determined equalization vector therefrom. The recognizer then performs segmentation and matches the modified feature vector to a stored model vector, which is defined as the segmentation vector. The recognizer then, from time to time, determines a new equalization vector, the new equalization vector being defined based on the difference between one or more input feature vectors and their respective segmentation vectors. The new equalization vector may then be used either for performing another segmentation iteration on the same observation sequence or for performing segmentation on subsequent feature vectors.
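The loop described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the nearest-neighbour match under squared Euclidean distance stands in for the recognizer's segmentation step, the running mean of the differences is only one possible way to form a new equalization vector "based on the difference", and the names `nearest_model`, `equalize_and_segment`, and `update_every` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Vector = List[float]


def nearest_model(vec: Sequence[float], models: Sequence[Sequence[float]]) -> Vector:
    """Return the stored model vector closest to vec (squared Euclidean distance).

    Assumption: a plain nearest-neighbour match stands in for segmentation.
    """
    return list(min(models, key=lambda m: sum((a - b) ** 2 for a, b in zip(vec, m))))


def equalize_and_segment(
    observations: Sequence[Sequence[float]],
    models: Sequence[Sequence[float]],
    bias: Optional[Sequence[float]] = None,
    update_every: int = 1,
) -> Tuple[List[Vector], Vector]:
    """Subtract the current equalization vector, match, and refresh the vector.

    Returns the segmentation vectors and the final equalization vector.
    """
    if not observations:
        return [], list(bias) if bias is not None else []
    dim = len(observations[0])
    bias = list(bias) if bias is not None else [0.0] * dim
    diff_sum = [0.0] * dim
    segmentation: List[Vector] = []
    for t, x in enumerate(observations, start=1):
        # Modify the feature vector with the previously determined equalization vector.
        y = [xi - bi for xi, bi in zip(x, bias)]
        # Match the modified vector to a stored model vector (the segmentation vector).
        s = nearest_model(y, models)
        segmentation.append(s)
        # Accumulate the difference between the raw input and its segmentation vector.
        diff_sum = [d + (xi - si) for d, xi, si in zip(diff_sum, x, s)]
        # "From time to time": refresh the equalization vector every update_every frames.
        # Assumption: the new vector is the running mean of all differences so far.
        if t % update_every == 0:
            bias = [d / t for d in diff_sum]
    return segmentation, bias


models = [[0.0, 0.0], [4.0, 4.0]]
observations = [[1.0, 1.0], [5.0, 5.0], [1.0, 1.0], [5.0, 5.0]]  # models shifted by (1, 1)
segmentation, bias = equalize_and_segment(observations, models)
```

In this toy input every observation is a model vector shifted by a constant offset of (1, 1), so the estimated equalization vector settles on that offset after the first frame. The returned vector can be passed back in as `bias` for another pass over the same observations or for later frames, mirroring the two uses named in the abstract.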
18 Claims
1. An apparatus for recognizing in real time speech signals produced under changing ambient conditions by a plurality of speakers, the apparatus comprising:
a speech analyzer operable to generate a plurality of feature vectors from an input speech signal;
a memory device containing speech model vectors; and
a speech recognizer operably connected to receive speech model vectors from the memory device, said speech recognizer operable to:
a) receive an observation sequence comprising a plurality of feature vectors from the speech analyzer;
b) modify at least one feature vector using an equalization vector determined in an ongoing manner;
c) generate a segmentation vector corresponding to the modified feature vector using the speech model vectors;
d) generate a subsequent equalization vector based upon the difference between the segmentation vector and the corresponding feature vectors; and
e) remove signal bias including bias caused by ambient noise.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A method of processing input speech signals produced under changing ambient conditions by a plurality of speakers comprising:
a) generating a plurality of feature vectors from an input speech signal;
b) providing at least one feature vector to a speech recognizer;
c) employing the speech recognizer to modify at least one feature vector using an equalization vector determined in an ongoing manner;
d) employing dynamic programming to determine at least one state of a most likely state sequence based on at least one modified feature vector;
e) employing the speech recognizer to generate at least one segmentation vector from at least one modified feature vector using a plurality of speech model vectors; and
f) generating a subsequent equalization vector based upon the difference between at least one segmentation vector and at least one corresponding feature vector.
Dependent claims: 9, 10, 11, 12, 13
14. An apparatus for providing voice control of a system, the apparatus comprising:
a speech input device operable to receive input speech from a plurality of users where said input speech is produced under changing ambient conditions and generate speech signals;
a speech analyzer connected to receive speech signals from the speech input device and generate feature vectors representative of the speech signals;
a speech recognizer connected to receive feature vectors from the speech analyzer, said speech recognizer operable to
modify each feature vector using an equalization vector determined in an ongoing manner;
generate a most likely state sequence corresponding to the modified feature vectors;
generate a segmentation vector for at least one modified feature vector;
generate a subsequent equalization vector based upon the difference between one or more segmentation vectors and their respective feature vectors; and
a data extraction device operable to receive segmentation vectors from the speech recognizer and produce control data therefrom, said control data being usable by a controller in the system.
Dependent claims: 15, 16, 17, 18
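Steps c) through f) of claim 8 can be sketched as follows, under stated assumptions. The claim says only "dynamic programming"; the Viterbi algorithm is used here as one common form of it. Each state is assumed to carry a single mean vector with a unit-variance Gaussian emission, so the segmentation vector for a frame is the mean of the state it is aligned to. The function names, the two-state left-to-right model, and the averaging rule for the new equalization vector are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def viterbi_segment(frames, state_means, log_trans, log_init) -> List[int]:
    """Most likely state sequence for frames (Viterbi dynamic programming).

    Assumption: unit-variance Gaussian emissions, so the log-likelihood is
    minus half the squared distance to the state mean, up to a constant.
    """
    n_states = len(state_means)

    def log_emit(x, mean):
        return -0.5 * sum((a - b) ** 2 for a, b in zip(x, mean))

    score = [log_init[s] + log_emit(frames[0], state_means[s]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for x in frames[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_emit(x, state_means[s]))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def update_equalization(raw_frames: Sequence[Sequence[float]], bias: Sequence[float],
                        state_means, log_trans, log_init) -> Tuple[List[int], List[float]]:
    """One pass: equalize, align, then average (raw frame - segmentation vector)."""
    modified = [[x - b for x, b in zip(f, bias)] for f in raw_frames]
    path = viterbi_segment(modified, state_means, log_trans, log_init)
    seg_vectors = [state_means[s] for s in path]
    new_bias = [
        sum(f[d] - v[d] for f, v in zip(raw_frames, seg_vectors)) / len(raw_frames)
        for d in range(len(bias))
    ]
    return path, new_bias


# Hypothetical two-state left-to-right model with one-dimensional features.
state_means = [[0.0], [4.0]]
log_trans = [[math.log(0.5), math.log(0.5)], [-math.inf, 0.0]]
log_init = [0.0, -math.inf]
raw_frames = [[1.0], [1.0], [5.0], [5.0]]  # state means shifted by a bias of 1.0

path, new_bias = update_equalization(raw_frames, [0.0], state_means, log_trans, log_init)
```

Calling `update_equalization` again with `new_bias` corresponds to a further segmentation iteration on the same observation sequence; applying it to later frames corresponds to ongoing use of the equalization vector.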
Specification