Apparatus for speech recognition
First Claim
1. Apparatus for speech recognition, comprising:
(a) spectrum analyzer means for obtaining parameters indicative of the spectrum of an input speech signal, the spectrum analyzer means performing a linear prediction analysis of the input speech signal for obtaining a set of LPC cepstrum coefficients for the input speech signal;
(b) a standard pattern storing means for storing phoneme standard patterns of phonemes or phoneme groups;
(c) a similarity calculating means for calculating the degree of similarity between the LPC cepstrum coefficients derived from said spectrum analyzer means and standard patterns stored in said standard pattern storing means, said calculating means determining a measure of the statistical distance between the LPC cepstrum coefficients and the standard patterns;
(d) a segmentation means for segmenting the input speech signal in response to the statistical distance measure derived by said similarity calculating portion and time-dependent power variations in low- and high-frequency ranges of the input speech signal; and
(e) a phoneme discriminating means of recognizing phonemes in response to a signal derived by said similarity calculating means.
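Element (a) recites LPC cepstrum coefficients obtained by linear prediction analysis. The following is a minimal sketch of one common way to compute them, assuming the autocorrelation method with the Levinson-Durbin recursion and the predictor convention x[n] ≈ Σ a[k]·x[n-k]. The claim does not fix the analysis order, the framing, or the windowing, so the function names and parameters here are illustrative assumptions only.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation values r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] from autocorrelation values r."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame (e.g. silence): remaining coefficients stay 0
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of the all-pole model 1 / (1 - sum a[k] z^-k)."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def lpc_cepstrum(frame, order, n_ceps):
    """LPC cepstrum coefficients of one frame (hypothetical helper)."""
    r = autocorrelation(frame, order)
    return lpc_to_cepstrum(levinson_durbin(r, order), n_ceps)
```

As a sanity check, a frame that decays as 0.5^i behaves like a first-order all-pole model with a[1] = 0.5, whose cepstrum is 0.5^n / n.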
Abstract
Apparatus for speech recognition, having each phoneme as a fundamental recognition unit, recognizes input speech by discriminating phonemes in the input speech. The apparatus comprises a memory for storing phoneme standard patterns of phonemes or phoneme groups; a spectrum analyzer for obtaining parameters indicative of the input speech signal spectrum; a similarity calculator for calculating, by a statistical distance measure, the degree of similarity between the output of the spectrum analyzer and the standard patterns stored in the memory; a segmentation portion for segmenting the input speech by using time-dependent low- and high-frequency power variations of the input speech signal and results from the similarity calculator; and a phoneme discriminator for recognizing phonemes by using the results from the similarity calculator.
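The abstract does not say which statistical distance measure the similarity calculator uses. As a hedged illustration, the sketch below assumes a Mahalanobis-style distance with a diagonal covariance, where each stored standard pattern is a per-phoneme mean vector and variance vector. The pattern layout and the names `mahalanobis_sq` and `rank_phonemes` are assumptions made for this example.

```python
def mahalanobis_sq(x, mean, var):
    """Squared Mahalanobis distance, assuming a diagonal covariance."""
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))


def rank_phonemes(x, patterns):
    """Sort phoneme labels by increasing distance to the feature vector x.

    patterns maps a phoneme label to a (mean, variance) pair (assumed layout).
    """
    scored = [(mahalanobis_sq(x, mean, var), label)
              for label, (mean, var) in patterns.items()]
    return [label for _, label in sorted(scored)]


# Toy standard patterns over two cepstrum coefficients (made-up values).
patterns = {
    "a": ([1.0, 0.0], [1.0, 1.0]),
    "i": ([0.0, 1.0], [1.0, 1.0]),
}
ranking = rank_phonemes([0.9, 0.1], patterns)
```

A smaller distance corresponds to a higher degree of similarity, so the first label in `ranking` is the best-matching phoneme for that frame under these assumptions.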
26 Claims
1. Apparatus for speech recognition, comprising:
(a) spectrum analyzer means for obtaining parameters indicative of the spectrum of an input speech signal, the spectrum analyzer means performing a linear prediction analysis of the input speech signal for obtaining a set of LPC cepstrum coefficients for the input speech signal;
(b) a standard pattern storing means for storing phoneme standard patterns of phonemes or phoneme groups;
(c) a similarity calculating means for calculating the degree of similarity between the LPC cepstrum coefficients derived from said spectrum analyzer means and standard patterns stored in said standard pattern storing means, said calculating means determining a measure of the statistical distance between the LPC cepstrum coefficients and the standard patterns;
(d) a segmentation means for segmenting the input speech signal in response to the statistical distance measure derived by said similarity calculating portion and time-dependent power variations in low- and high-frequency ranges of the input speech signal; and
(e) a phoneme discriminating means of recognizing phonemes in response to a signal derived by said similarity calculating means.
Dependent claims: 2, 17
3. Apparatus for speech recognition, comprising:
(a) a spectrum analyzer means for deriving spectrum information of an input speech signal, said spectrum information being a set of LPC cepstrum coefficients obtained by way of linear predictive analysis;
(b) a first similarity calculating means for obtaining, by using a statistical distance measure, the degree of similarity of said input speech to phonemes of vowel features, voiced sounds and unvoiced sounds, said calculating means calculating the degree of similarity between the LPC spectrum coefficients derived from said spectrum analyzing means and standard patterns stored in a standard pattern storing means;
(c) a first recognition means for segmenting and recognizing the input speech signal in response to a continuity of the statistical distance derived by said first similarity calculation means;
(d) a segmentation parameter extracting means for deriving power information of low- and high-frequency ranges of said input speech signal;
(e) a consonant segmentation means for segmenting consonant phonemes in response to signals representing the results of time-dependent variations of low- and high-frequency ranges of said power information in the input speech signal;
(f) a second similarity calculation means for calculating, by using a statistical distance measure, the degree of similarity between coefficients derived from said spectrum analyzing means and standard phoneme patterns from said standard pattern storing portion of respective periods determined by said consonant segmentation portion;
(g) a second recognition means for recognizing consonant phonemes in response to the degree of similarity determined by said second similarity means;
(h) a phoneme string producing means for deriving phoneme strings in response to the degree of similarity determined by said first recognition means and the results from said second recognition portion; and
(i) a matching means for comparison/matching the phoneme strings derived from said phoneme string producing means and dictionary items included in a word dictionary so as to derive a dictionary item having the highest similarity to said phoneme strings.
Dependent claims: 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
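Element (i) compares a recognized phoneme string with the items of a word dictionary and selects the most similar item. The claim does not state how that similarity is scored. The sketch below assumes a plain edit (Levenshtein) distance over phoneme symbols as a stand-in, and the dictionary contents and function names are invented for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        cur = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            cur.append(min(prev[j] + 1,          # delete pa
                           cur[j - 1] + 1,       # insert pb
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]


def best_dictionary_item(phonemes, dictionary):
    """Return the dictionary key whose phoneme string is closest to the input."""
    return min(dictionary, key=lambda word: edit_distance(phonemes, dictionary[word]))


# Hypothetical two-word dictionary, one phoneme symbol per character.
dictionary = {"sakura": list("sakura"), "sakana": list("sakana")}
recognized = list("sakra")  # recognizer dropped the "u"
best = best_dictionary_item(recognized, dictionary)
```

Here the lowest distance plays the role of the highest similarity. A practical matcher would likely weight substitutions by phoneme confusability, which this sketch does not attempt.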
18. A method of recognizing speech, comprising the steps of:
analyzing the spectral content of the speech by determining a set of linear predictive cepstrum coefficients obtained by performing a linear predictive analysis on the speech;
performing a statistical distance measure between the speech with phonemes of vowel features, voiced sounds and unvoiced sounds by calculating the degree of similarity between the LPC cepstrum coefficients and stored standard patterns;
segmenting and recognizing the speech in response to a continuity of the statistical distance determined during the immediately preceding step;
extracting parameter segments of the input speech by deriving power information of low- and high-frequency ranges of the speech;
segmenting consonant phonemes of the speech in response to the statistical distance similarity calculation and time-dependent variations of low- and high-frequency ranges of the power information in the speech;
calculating the degree of similarity between coefficients derived from the spectral analysis and stored standard consonant phoneme patterns;
recognizing consonant phonemes in response to the degree of similarity determined during the immediately preceding step;
deriving phoneme strings in response to the degree of similarity determined by both of the similarity calculations; and
comparing/matching the phoneme strings derived from the phoneme strings and dictionary items to derive a dictionary item having the greatest similarity with the phoneme strings.
Dependent claims: 19, 20, 21, 22, 23, 24, 25, 26
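The consonant segmentation step relies in part on time-dependent variations of low- and high-frequency power. One plausible reading, offered only as an assumption, is that consonant candidates appear as dips in either band's power contour. The sketch below takes per-frame band powers in dB as given and flags local minima that are sufficiently deep. The depth threshold, window length, and function names are illustrative, and the statistical distance input that the claim also recites is left out.

```python
def find_dips(power_db, min_depth_db=6.0, window=3):
    """Indices of local minima lying at least min_depth_db below the
    highest value within `window` frames on each side."""
    dips = []
    for t in range(1, len(power_db) - 1):
        is_local_min = power_db[t] <= power_db[t - 1] and power_db[t] < power_db[t + 1]
        if not is_local_min:
            continue
        left_peak = max(power_db[max(0, t - window):t])
        right_peak = max(power_db[t + 1:t + 1 + window])
        if min(left_peak, right_peak) - power_db[t] >= min_depth_db:
            dips.append(t)
    return dips


def consonant_candidates(low_db, high_db, **kwargs):
    """Frames where either the low-band or the high-band power shows a dip."""
    return sorted(set(find_dips(low_db, **kwargs)) | set(find_dips(high_db, **kwargs)))


# Made-up band power contours in dB, one value per frame.
low_db = [60, 60, 59, 50, 49, 58, 60, 60]
high_db = [40, 41, 30, 42, 40, 40, 40, 40]
candidates = consonant_candidates(low_db, high_db)
```

Using both bands lets the sketch catch dips that show up in only one of them, which is consistent with the claim naming the low- and high-frequency ranges separately.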
Specification