Adaptive speech recognition with selective input data to a speech classifier
Abstract
One embodiment of a speech recognition system is organized with speech input signal preprocessing and feature extraction followed by a fuzzy matrix quantizer (FMQ) designed with respective codebook sets at multiple signal-to-noise ratios. The FMQ quantizes various training words from a set of vocabulary words, produces observation sequence O output data to train hidden Markov model (HMM) processes λj, and produces fuzzy distance measure output data for each vocabulary word codebook. A processor uses a fuzzy Viterbi algorithm to compute maximum likelihood probabilities PR(O|λj) for each vocabulary word. The fuzzy distance measures and maximum likelihood probabilities are mixed in a variety of ways, preferably to optimize speech recognition accuracy and speech recognition speed.
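As a rough illustration of the mixing step described in the abstract, the following Python sketch assembles a classifier input from per-word fuzzy distance measures and per-word log-likelihoods. The `Mix` enum, the `mix_outputs` function, and the three mix modes are hypothetical names chosen for this example. The abstract says only that the two kinds of output are "mixed in a variety of ways", so this is one possible arrangement under those assumptions, not the patented method.

```python
from enum import Enum


class Mix(Enum):
    """Hypothetical mix modes; the names are assumptions for this sketch."""
    DISTANCE_ONLY = "distance"
    PROBABILITY_ONLY = "probability"
    COMBINED = "combined"


def mix_outputs(distances, log_probs, mix):
    """Build the classifier input vector for the selected mix.

    distances: one fuzzy distance measure per vocabulary word codebook.
    log_probs: one log-likelihood log PR(O|lambda_j) per vocabulary word.
    """
    if len(distances) != len(log_probs):
        raise ValueError("expected one distance and one probability per word")
    if mix is Mix.DISTANCE_ONLY:
        return list(distances)
    if mix is Mix.PROBABILITY_ONLY:
        return list(log_probs)
    # Combined mode: concatenate both sets so the classifier sees all of it.
    return list(distances) + list(log_probs)


# Two-word vocabulary with made-up numbers.
features = mix_outputs([0.2, 0.9], [-1.0, -3.0], Mix.COMBINED)
```

In this sketch the single-input modes halve the size of the classifier input, which is one plausible way recognition speed could be traded against accuracy.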
35 Claims
1. A speech recognition system comprising:
a first speech signal preprocessor to receive first input data representing a speech input signal and having first speech input signal preclassifying output data;
a second speech signal preprocessor to receive second input data representing the speech input signal and having second speech input signal preclassifying output data;
a mixer to receive the first and second speech input signal preclassifying output data and having output data represented by a selected mix of the first and second speech input signal preclassifying output data;
a selection control circuit coupled to the mixer to determine the selected mix of the first and second speech input signal preclassifying output data by determining an appropriate balance between speech recognition accuracy of the speech recognition system and a speech recognition processing speed of the speech recognition system; and
a speech classifier to receive the selected mix and having output data to classify the speech input signal as recognized speech.
Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)

14. A speech recognition system comprising:
a speech input signal feature extractor to provide parameters representing features of T groups of N speech input signal frames;
a vocabulary of u words;
a matrix quantizer to receive the parameters and to provide (i) a series of observation sequences for each of the T groups of the N speech input signal frames and (ii) distance measure output data between the parameters and u respective matrix codebooks;
a plurality of u hidden Markov models coupled to the matrix quantizer to receive the observation sequences;
a Viterbi algorithm module to receive the observation sequences and provide respective probabilities that the respective hidden Markov models produced a respective observation sequence;
a selection control circuit to determine when the distance measure output, the probabilities, and a combination of the distance measure output and the probabilities are included in a plurality of selected mixes by determining an appropriate balance between speech recognition accuracy of the speech recognition system and a speech recognition processing speed of the speech recognition system;
a mixer coupled to the matrix quantizer and the Viterbi algorithm module for mixing the distance measure output and the probabilities into one set of mixed output data based on the selected mixes; and
a neural network coupled to the mixer to receive the mixed output data set and determine which of the u vocabulary words most probably represents the speech input signal.
Dependent Claims (15, 16, 17, 18, 19)
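The Viterbi algorithm module recited in claim 14 can be pictured with a short Python sketch. It uses the standard log-domain Viterbi recursion for a discrete hidden Markov model, not the fuzzy variant mentioned in the abstract, and scores a single model. The function names and the toy two-state parameters are assumptions made for illustration only.

```python
import math


def viterbi_log_prob(obs, log_pi, log_a, log_b):
    """Log-probability of the single best state path for one HMM.

    obs:    non-empty sequence of codeword indices (the observation sequence).
    log_pi: log initial-state probabilities, one per state.
    log_a:  log_a[r][s] is the log transition probability from state r to s.
    log_b:  log_b[s][k] is the log probability of emitting codeword k in state s.
    """
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n = len(log_pi)
    # Initialisation: start in each state and emit the first codeword.
    delta = [log_pi[s] + log_b[s][obs[0]] for s in range(n)]
    # Recursion: keep only the best predecessor for every state.
    for k in obs[1:]:
        delta = [
            max(delta[r] + log_a[r][s] for r in range(n)) + log_b[s][k]
            for s in range(n)
        ]
    return max(delta)


def to_log(rows):
    """Convert a table of strictly positive probabilities to logs."""
    return [[math.log(p) for p in row] for row in rows]


# Toy two-state model with made-up parameters.
log_pi = [math.log(0.6), math.log(0.4)]
log_a = to_log([[0.7, 0.3], [0.4, 0.6]])
log_b = to_log([[0.5, 0.5], [0.1, 0.9]])
score = viterbi_log_prob([0, 1], log_pi, log_a, log_b)
```

Scoring the same observation sequence against each of the u word models in this way would yield one value per vocabulary word, which is the kind of per-word probability output the claim passes on to the mixer.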

20. A speech recognition system comprising:
means for processing first speech input signal data to preclassify the speech input signal and produce first preclassification output data, wherein the first speech input signal data represents a speech input signal;
means for processing second speech input signal data to preclassify the speech input signal and produce second preclassification output data;
means, coupled to both means for processing, for determining when to include the first speech input signal, the second speech input signal, and a combination of the first and second speech input signals in a preferred mix of the preclassification output data by determining an appropriate balance between speech recognition accuracy of the speech recognition system and a speech recognition processing speed of the speech recognition system;
means, coupled to the means for determining, for mixing the first and second preclassification output data in accordance with the determined preferred mix;
means, coupled to the means for mixing, for classifying the speech input signal based on the preferred mix of preclassification output data.
Dependent Claims (21)

22. A speech recognition method comprising the steps of:
processing first speech input signal data to preclassify the speech input signal and produce first preclassification output data, wherein the first speech input signal data represents a speech input signal;
processing second speech input signal data to preclassify the speech input signal and produce second preclassification output data;
determining when to include the first speech input signal, the second speech input signal, and a combination of the first and second speech input signals in a preferred mix of the preclassification output data by determining at least an appropriate balance between speech recognition accuracy and a speech recognition processing speed;
mixing the first and second preclassification output data in accordance with the preferred mix; and
classifying the speech input signal based on the preferred mix of preclassification output data.
Dependent Claims (23, 24, 25, 26, 27, 28, 29, 30, 31)

32. A speech recognition system comprising:
a first speech signal preprocessor to receive first input data representing a speech input signal and having first speech input signal preclassifying output data;
a second speech signal preprocessor to receive second input data representing the speech input signal and having second speech input signal preclassifying output data;
a mixer to receive the first and second speech input signal preclassifying output data and having output data represented by a selected mix of the first and second speech input signal preclassifying output data;
a non-neural network selection control circuit coupled to the mixer to determine when to include the first speech input signal, the second speech input signal, and a combination of the first and second speech input signals in the selected mix; and
a speech classifier to receive the selected mix and having output data to classify the speech input signal as recognized speech.

33. A speech recognition system comprising:
a first speech signal preprocessor to receive first input data representing a speech input signal and having first speech input signal preclassifying output data;
a second speech signal preprocessor to receive second input data representing the speech input signal and having second speech input signal preclassifying output data;
a mixer to receive the first and second speech input signal preclassifying output data and having output data represented by a selected mix of the first and second speech input signal preclassifying output data;
a selection control circuit coupled to the mixer to determine when to include the first speech input signal, the second speech input signal, and a combination of the first and second speech input signals in the selected mix;
a speech classifier to receive the selected mix and having output data to classify the speech input signal as recognized speech; and
a noise level detector to provide a noise level parameter output signal to the selection control circuit.
Dependent Claims (34, 35)
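Claim 33 feeds a noise level parameter to the selection control circuit but does not say how the circuit uses it. The Python sketch below shows one hypothetical policy: with a clean signal a single preclassifier output is passed on to keep processing fast, and with a noisier signal both outputs are combined for accuracy. The function name `choose_mix`, the use of a signal-to-noise ratio in decibels, the 30 dB threshold, and the returned labels are all assumptions for illustration and are not taken from the patent.

```python
def choose_mix(snr_db, clean_threshold_db=30.0):
    """Pick which preclassifier outputs reach the classifier.

    snr_db:             estimated signal-to-noise ratio (assumed noise parameter).
    clean_threshold_db: hypothetical cut-off above which the signal counts as clean.
    """
    if snr_db >= clean_threshold_db:
        # Cleaner signal: a single input keeps the classifier input small.
        return "first"
    # Noisier signal: pass both inputs so the classifier has more evidence.
    return "combined"


quiet_room = choose_mix(35.0)
car_cabin = choose_mix(12.0)
```

A real selection circuit could equally choose the second output alone or weigh other factors. The two-way rule here only makes the accuracy versus speed trade-off concrete.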
Specification