Method of speech recognition resistant to convolutive distortion and additive distortion
First Claim
1. A method of speech recognition comprising the steps of:
- receiving speech utterances and sensing noise during speech pauses;
- providing models for recognizing speech;
- estimating additive noise during pauses in speech to provide additive noise estimate;
- adapting the models to additive noise and convolutional channel bias to provide adapted models with adapted model states;
- comparing input speech utterances with the adapted models for recognizing the speech and providing an alignment between an input speech utterance and recognized models; and
- estimating the convolutional channel bias by an iterative statistical maximum-likelihood method using said alignment and said additive noise estimate that maximizes the channel bias based on the probability likelihood of each speech data input frame feature vector of said input speech utterance mapping to the existing adapted model states to generate the maximum-likelihood channel bias estimate.
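The noise-estimation and model-adaptation steps of the claim can be illustrated with a minimal sketch. It is not the patent's implementation. It assumes that features are log-spectral vectors, that the noise estimate is a plain average of the pause frames, and that a clean mean mu, a channel bias h, and a noise estimate n combine as log(exp(mu + h) + exp(n)), a common mismatch function that the claim itself does not specify. The names estimate_noise, combine, and adapt_means are hypothetical.

```python
import math

def estimate_noise(pause_frames):
    """Average the log-spectral frames captured during a speech pause (assumed estimator)."""
    count = len(pause_frames)
    dims = len(pause_frames[0])
    return [sum(frame[d] for frame in pause_frames) / count for d in range(dims)]

def combine(mu, h, n):
    """Assumed mismatch function log(exp(mu + h) + exp(n)), computed stably."""
    a, b = mu + h, n
    top = max(a, b)
    return top + math.log(math.exp(a - top) + math.exp(b - top))

def adapt_means(clean_means, channel_bias, noise):
    """Shift each clean state mean by the channel bias, then add noise in the linear domain."""
    return [
        [combine(mu, h, n) for mu, h, n in zip(mean, channel_bias, noise)]
        for mean in clean_means
    ]

# Toy data (made up): two pause frames, two model states, two feature dimensions.
noise = estimate_noise([[0.0, 1.0], [2.0, 3.0]])
adapted = adapt_means([[5.0, 6.0], [7.0, 8.0]], [0.5, -0.5], noise)
```

In this sketch the channel bias enters as a simple shift of the clean mean, while the noise raises the adapted mean only where it is comparable to or larger than the shifted speech level.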
Abstract
A speech recognizer that operates under both ambient noise (additive distortion) and microphone changes (convolutive distortion) is provided. For each utterance to be recognized, the recognizer system adapts the HMM mean vectors using a noise estimate calculated from the pre-utterance pause and a channel estimate calculated from previous utterances using an Expectation-Maximization (EM) algorithm.
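The two distortion types named in the abstract are commonly related to the clean speech through the signal model below. This is a standard formulation assumed here for illustration, not a formula quoted from the patent; the symbols x, h, n, and y (clean speech, channel, noise, observed speech) are introduced only for this sketch.

```latex
% Time domain: clean speech x is filtered by the channel h, then noise n is added.
\[ y(t) = x(t) * h(t) + n(t) \]
% Power-spectral domain (cross terms neglected): convolution becomes multiplication.
\[ |Y(\omega)|^2 \approx |X(\omega)|^2 \, |H(\omega)|^2 + |N(\omega)|^2 \]
% Log-spectral domain: the channel becomes an additive bias on the clean speech.
\[ y^{l} = \log\!\left( e^{x^{l} + h^{l}} + e^{n^{l}} \right) \]
```

Under this model a convolutive distortion appears as a constant offset in the log-spectral domain, which is why it can be treated as a channel bias, whereas additive noise combines with the speech nonlinearly.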
136 Citations
20 Claims
1. A method of speech recognition comprising the steps of:
- receiving speech utterances and sensing noise during speech pauses;
- providing models for recognizing speech;
- estimating additive noise during pauses in speech to provide additive noise estimate;
- adapting the models to additive noise and convolutional channel bias to provide adapted models with adapted model states;
- comparing input speech utterances with the adapted models for recognizing the speech and providing an alignment between an input speech utterance and recognized models; and
- estimating the convolutional channel bias by an iterative statistical maximum-likelihood method using said alignment and said additive noise estimate that maximizes the channel bias based on the probability likelihood of each speech data input frame feature vector of said input speech utterance mapping to the existing adapted model states to generate the maximum-likelihood channel bias estimate.
11. A speech recognizer comprising:
- a microphone and/or sensor for receiving speech utterances and sensing noise during pauses;
- models for recognizing speech;
- an additive noise estimator for estimating additive noise during pauses in speech to provide an additive noise estimate;
- a model adapter subsystem for adapting the models to additive noise and convolutional channel bias to provide adapted models with adapted model states;
- a recognizer subsystem which compares input speech utterances with the adapted models for recognizing the speech and outputs an alignment between an input speech utterance and recognized models; and
- a convolutional channel bias estimator that estimates the convolutional channel bias by an iterative statistical maximum-likelihood method using said alignment and said additive noise estimate that maximizes the channel bias based on the probability likelihood of each speech data input frame feature vector of said input speech utterance mapping to the existing adapted model states to generate a maximum-likelihood channel bias estimate.
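The channel bias estimator can be sketched in a reduced form. The claim calls for an iterative statistical maximum-likelihood method; the sketch below swaps in a simpler stand-in and should not be read as the patented procedure. It assumes a hard alignment (one state per frame), unit variances, and the log-spectral mismatch function log(exp(mu + h) + exp(n)). Under those assumptions maximizing the likelihood reduces to least squares, solved here per feature dimension with Gauss-Newton iterations. All function names are hypothetical.

```python
import math

def combine(mu, h, n):
    """Assumed mismatch function log(exp(mu + h) + exp(n)), computed stably."""
    a, b = mu + h, n
    top = max(a, b)
    return top + math.log(math.exp(a - top) + math.exp(b - top))

def estimate_channel_bias(frames, aligned_means, noise, initial_bias, iterations=20):
    """Refine the channel bias so adapted means best match the aligned frames.

    frames[t] is the observed feature vector at frame t, and aligned_means[t]
    is the clean mean of the state that the recognizer aligned to frame t.
    """
    bias = list(initial_bias)
    for _ in range(iterations):
        for d in range(len(bias)):
            numerator = 0.0
            denominator = 0.0
            for frame, mean in zip(frames, aligned_means):
                predicted = combine(mean[d], bias[d], noise[d])
                # Derivative of the prediction with respect to the bias, in (0, 1].
                slope = math.exp(mean[d] + bias[d] - predicted)
                numerator += slope * (frame[d] - predicted)
                denominator += slope * slope
            if denominator > 0.0:
                bias[d] += numerator / denominator  # Gauss-Newton step
    return bias

# Toy check (made-up values): synthesize frames with a known bias and recover it.
means = [[2.0, 1.0], [3.0, 0.5], [1.5, 2.5]]  # clean means of the aligned states
noise = [0.0, 0.5]
true_bias = [0.7, -0.4]
frames = [[combine(mu[d], true_bias[d], noise[d]) for d in range(2)] for mu in means]
estimate = estimate_channel_bias(frames, means, noise, [0.0, 0.0])
```

The slope term shrinks toward zero in dimensions where noise dominates the speech, so frames masked by noise contribute little to the bias update. A soft alignment would replace the implicit weight of 1 per frame with state posteriors.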
Specification