Speech recognition apparatus using neural network and fuzzy logic
Abstract
A speech recognition apparatus has a speech input unit for inputting speech; a speech analysis unit for analyzing the input speech and outputting a time series of feature vectors; a candidate selection unit that receives the time series of feature vectors from the speech analysis unit and selects a plurality of recognition-result candidates from the speech categories; and a discrimination processing unit for discriminating among the selected candidates to obtain a final recognition result. The discrimination processing unit includes three components: a pair generation unit for generating all two-candidate combinations of the n candidates selected by the candidate selection unit; a pair discrimination unit for deciding, for each of the nC2 = n(n-1)/2 combinations (or pairs), which candidate of the pair is more certain, on the basis of the extracted acoustic features intrinsic to each candidate speech; and a final decision unit for collecting the pair discrimination results obtained from the pair discrimination unit for all nC2 pairs to decide the final result. The pair discrimination unit handles the extracted acoustic features intrinsic to each candidate speech as fuzzy information and performs the discrimination on the basis of fuzzy logic algorithms, and the final decision unit likewise combines the collected results on the basis of fuzzy logic algorithms.
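The pair generation and final decision steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the `discriminate` callable, the use of `1 - mu` as the score for the other candidate of a pair, and the choice of `min` as the fuzzy aggregation operator are all assumptions made for illustration, since the abstract does not specify which fuzzy operators are used.

```python
from itertools import combinations


def generate_pairs(candidates):
    """Return every unordered two-candidate combination (nC2 pairs)."""
    return list(combinations(candidates, 2))


def rank_candidates(candidates, discriminate):
    """Rank candidates from pairwise fuzzy discrimination results.

    `discriminate(a, b)` is a hypothetical callable returning a membership
    degree in [0, 1]: how certain it is that the input is `a` rather than `b`.
    """
    support = {c: [] for c in candidates}
    for a, b in generate_pairs(candidates):
        mu = discriminate(a, b)
        support[a].append(mu)
        support[b].append(1.0 - mu)  # fuzzy complement (assumption)
    # Fuzzy AND (min) over all pairs a candidate takes part in (assumption).
    scores = {c: min(v, default=1.0) for c, v in support.items()}
    return sorted(candidates, key=lambda c: scores[c], reverse=True)


# Toy membership degrees for three phoneme candidates (made-up values).
toy = {("p", "t"): 0.8, ("p", "k"): 0.6, ("t", "k"): 0.3}
print(rank_candidates(["p", "t", "k"], lambda a, b: toy[(a, b)]))
# ['p', 'k', 't']
```

With n = 3 candidates the sketch evaluates 3 pairs; with n = 5 it would evaluate 10, which is why the number of selected candidates is kept small.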
12 Claims
1. A speech recognition apparatus comprising:
input means for inputting speech;
feature extraction means for extracting feature vectors from the input speech in each of a series of predetermined times and for obtaining a feature vector series;
candidate selection means for selecting high-ranking candidates of recognition result by matching the feature vector series with various categories;
pair generation means for generating a plurality of pairs of candidates from the candidates selected by said candidate selection means;
pair discrimination means for discriminating between each candidate of each pair of selected candidates, wherein said pair discrimination means comprises neural network means for extracting several acoustic cues specific to a respective pair from the feature vector series, said neural network means having respectively suitable structures for extracting the several acoustic cues by setting up connection coefficients based on information stored in a first memory, and logic means for selecting the most certain one of the several acoustic cues based on extracted results of said neural network means; and
decision means for ranking the selected candidates based on a pair discrimination result of said pair discrimination means, thereby representing which candidate of the selected candidates corresponds to the input speech.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
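As a rough illustration of the "logic means for selecting the most certain one of the several acoustic cues", the sketch below assumes each cue-extracting network emits a value in [0, 1], where values near 1 favor the first candidate of the pair, values near 0 favor the second, and 0.5 means undecided. Under that assumption "most certain" is taken to mean farthest from 0.5. The cue names and the function name are hypothetical; the claim does not define the selection rule.

```python
def most_certain_cue(cue_outputs):
    """Return (cue_name, value) for the cue whose output is farthest from 0.5.

    `cue_outputs` maps a hypothetical cue name to a network output in [0, 1].
    Distance from 0.5 is used as the certainty measure (assumption).
    """
    if not cue_outputs:
        raise ValueError("at least one cue output is required")
    name = max(cue_outputs, key=lambda k: abs(cue_outputs[k] - 0.5))
    return name, cue_outputs[name]


# Made-up outputs of three cue networks for one candidate pair.
outputs = {"burst": 0.55, "voice_onset": 0.1, "formant": 0.7}
print(most_certain_cue(outputs))
# ('voice_onset', 0.1)
```

Here the selected value 0.1 lies below 0.5, so under the stated assumption it would count as evidence for the second candidate of the pair.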
9. A speech recognition apparatus comprising:
input unit inputting speech and converting the input speech into a digital signal;
a spectral analysis unit for extracting feature vectors from the digital signal of the input speech in each of a series of predetermined times and for obtaining a feature vector series;
a candidate selection unit for selecting high-ranking candidates of various phonemes by matching the feature vector series with the various phonemes;
a pair generator for generating a plurality of pairs of candidates from the candidates selected by said candidate selector;
a pair discrimination unit for discriminating between each candidate of each pair of selected candidates, wherein said pair discrimination unit comprises neural networks for extracting several acoustic cues specific to the respective pair from the feature vector series, said neural networks having respectively suitable structures for extracting the several acoustic cues by setting up connection coefficients based on information stored in a first memory, and a fuzzy logic unit for selecting the most certain one of the several acoustic cues based on extracted results of said neural networks; and
a decision unit ranking the selected candidates based on pair discrimination results of said pair discrimination unit, thereby representing which candidate of the selected candidates corresponds to the input speech.
Dependent claims: 10, 11, 12
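The candidate selection step, which matches the feature vector series against the phonemes and keeps the high-ranking ones, might be sketched as follows. The claim does not say how matching is performed, so this example assumes one stored template series per phoneme, equal-length series, and mean Euclidean distance as the matching score. All names and data are hypothetical.

```python
import math


def series_distance(series, template):
    """Mean Euclidean distance between two equal-length feature vector series."""
    if not series or len(series) != len(template):
        raise ValueError("series must be non-empty and of equal length")
    return sum(math.dist(x, y) for x, y in zip(series, template)) / len(series)


def select_candidates(series, templates, n):
    """Return the n phoneme labels whose templates are closest to `series`."""
    ranked = sorted(templates, key=lambda label: series_distance(series, templates[label]))
    return ranked[:n]


# Made-up two-frame, two-dimensional templates for three phonemes.
templates = {
    "a": [[0.0, 0.0], [0.0, 0.0]],
    "i": [[1.0, 1.0], [1.0, 1.0]],
    "u": [[5.0, 5.0], [5.0, 5.0]],
}
print(select_candidates([[0.9, 1.0], [1.1, 1.0]], templates, 2))
# ['i', 'a']
```

The returned list is what a pair generator would then expand into candidate pairs for discrimination.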
Specification