Speech recognition system having a quantizer using a single robust codebook designed at multiple signal to noise ratios
Abstract
In one embodiment, a speech recognition system is organized with a fuzzy matrix quantizer with a single codebook representing u codewords. The single codebook is designed with entries from u codebooks, which are designed with respective words at multiple signal to noise ratio levels. Such entries are, in one embodiment, centroids of clustered training data. The training data is, in one embodiment, derived from line spectral frequency pairs representing respective speech input signals at various signal to noise ratios. The single codebook trained in this manner provides a codebook for a robust front end speech processor, such as the fuzzy matrix quantizer, for training a speech classifier such as u hidden Markov models and a speech post classifier such as a neural network. In one embodiment, a fuzzy Viterbi algorithm is used with the hidden Markov models to describe the speech input signal probabilistically.
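The codebook design summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented procedure: it assumes plain feature vectors rather than matrices of line spectral pair coefficients, adds white Gaussian noise directly to those vectors instead of to the speech waveform, and pools all signal to noise ratio levels for a word into one k-means run. The names `add_noise_at_snr`, `kmeans`, and `design_single_codebook` are hypothetical.

```python
import math
import random


def add_noise_at_snr(vector, snr_db, rng):
    """Add white Gaussian noise so the result has roughly the requested SNR in dB.

    Assumption: noise is applied to the feature vector itself, a simplification.
    """
    power = sum(x * x for x in vector) / len(vector)
    noise_power = power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in vector]


def kmeans(vectors, k, iters, rng):
    """Plain k-means clustering; returns k centroids."""
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(v, centroids[i])),
            )
            clusters[nearest].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def design_single_codebook(training, snr_levels_db, codewords_per_word, rng):
    """Build one codebook holding `codewords_per_word` centroids for each word.

    `training` maps each vocabulary word to a list of clean feature vectors.
    Each word's data is corrupted at every SNR level before clustering.
    """
    codebook = []
    for word, vectors in training.items():
        noisy = [add_noise_at_snr(v, snr, rng) for snr in snr_levels_db for v in vectors]
        for centroid in kmeans(noisy, codewords_per_word, 10, rng):
            codebook.append((word, centroid))
    return codebook


# Toy usage: u = 2 words, s = 2 SNR levels, C = 2 codewords per word.
toy_training = {
    "yes": [[0.10, 0.20], [0.12, 0.22], [0.50, 0.60], [0.52, 0.61]],
    "no": [[0.80, 0.10], [0.82, 0.12], [0.30, 0.90], [0.31, 0.88]],
}
toy_codebook = design_single_codebook(toy_training, [20, 30], 2, random.Random(0))
```

In this sketch the single codebook ends up with u × C entries, each tagged with the word it was trained on.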
29 Claims
1. A speech recognition system comprising:
- a first quantizer to receive a representation of a speech input signal and generate quantization output data representing the closeness of the represented speech input signal to codewords in a single codebook, the single codebook of the quantizer representing a vocabulary of u words, where u is a non-negative integer greater than one and the single codebook further having a mixture of s signal to noise ratio codeword entries; and
- a speech signal processor to receive the quantization output data from the quantizer, the speech signal processor having processes trained with respective quantization output data for the u vocabulary words and having speech input signal classifying output data as recognized speech.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A speech recognition system comprising:
- a means for computing P order linear predictive code prediction coefficients to represent each a word input signal segmented into N frames, where P and N are non-negative integers;
- means for deriving line spectral pair coefficients from the prediction coefficients;
- a fuzzy matrix quantizer for processing a P×N matrix of the line spectral pair coefficients and providing fuzzy matrix quantization data from a single codebook having a group of C codewords for each of u vocabulary words, wherein each of the u groups of C codewords is derived from s groups of C codewords designed at s signal to noise ratios, wherein s and C are non-negative integers;
- a plurality of u hidden Markov models, respectively trained using the single codebook, for modeling respective speech processes and for producing respective output data in response to receipt of the quantization data;
- a means for receiving the quantization data and the respective output data of the u hidden Markov models for determining a probability for each of the u hidden Markov models that the respective hidden Markov model produced the quantization data; and
- means for receiving the probabilities for training to classify the word input signal and for classifying the word input signal as one of the u vocabulary words.
Dependent claims: 15
16. A method comprising the steps of:
- designing a single codebook having a vocabulary of u words, each word being represented by codewords designed with test speech input signals corrupted by s signal to noise ratios, where u and s are non-negative integers;
- generating a respective response by the designed single codebook to each of the test speech input signals corrupted by the s signal to noise ratios; and
- training a hidden Markov model for each of the u words with each of the responses generated by the single codebook.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27
28. A method comprising the steps of:
- designing a single codebook having a vocabulary of u words, each word being represented by codewords designed with test speech input signals corrupted by s signal to noise ratios, where u and s are non-negative integers;
- generating a respective response by the designed single codebook to each of the test speech input signals corrupted by the s signal to noise ratios;
- training a hidden Markov model for each of the u words with each of the responses generated by the single codebook;
- for each test speech input signal, determining a respective probability for each of the hidden Markov models to classify the test speech input signal; and
- training a neural network to recognize the speech input signal using each respective probability for each of the hidden Markov models.
Dependent claims: 29
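The claims describe quantization output data that represents how close an input is to each codeword, rather than a single hard index. A minimal Python sketch of one way to compute such values follows. It assumes the standard fuzzy c-means membership formula with squared Euclidean distance and a fuzziness exponent `m`, none of which is spelled out in the text above, and it works on a single feature vector rather than a P×N matrix. The function name `fuzzy_memberships` is hypothetical.

```python
def fuzzy_memberships(x, codewords, m=2.0):
    """Return one membership value per codeword; the values sum to 1.

    Assumption: fuzzy c-means style membership,
    u_i = (1 / D_i)^(1 / (m - 1)) / sum_j (1 / D_j)^(1 / (m - 1)),
    where D_i is the squared Euclidean distance from x to codeword i and m > 1.
    """
    d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in codewords]
    exact = [i for i, d in enumerate(d2) if d == 0.0]
    if exact:
        # The input coincides with one or more codewords: share full membership.
        return [1.0 / len(exact) if i in exact else 0.0 for i in range(len(d2))]
    exponent = 1.0 / (m - 1.0)
    inverse = [(1.0 / d) ** exponent for d in d2]
    total = sum(inverse)
    return [v / total for v in inverse]


# Toy usage: the input is closer to the first codeword, so it gets more weight.
memberships = fuzzy_memberships([0.0, 0.0], [[1.0, 0.0], [2.0, 0.0]])
```

A sequence of such membership vectors, one per observation, is the kind of soft quantizer output that a downstream classifier such as a hidden Markov model could be trained on.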
Specification