Speech recognition system which avoids ambiguity when matching frequency spectra by employing an additional verbal feature
Abstract
In a speech recognition system using normalized spectrum matching, a second feature pattern is calculated and compared with reference patterns. The similarity obtained as a result of the comparison is used to determine overall similarity, from which the recognition is made. The second feature pattern can be a spectrum variation pattern, a level decrease pattern, or a spectrum relative value pattern.
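As an illustration of the final decision step described above, the following is a minimal sketch of how an identifier might combine the two similarities. The function name `identify`, the dictionary layout, and the weighted sum with its `weight` parameter are assumptions made for this example; the text only states that both similarities are used to determine the overall similarity.

```python
def identify(spectrum_sim, second_sim, weight=0.5):
    """Pick the category with the largest overall similarity.

    spectrum_sim, second_sim: dicts mapping category name -> similarity.
    weight: share given to the second feature (hypothetical parameter;
            the source does not say how the two scores are combined).
    """
    overall = {
        cat: (1.0 - weight) * spectrum_sim[cat] + weight * second_sim[cat]
        for cat in spectrum_sim
    }
    # The category giving the largest overall similarity is the result.
    best = max(overall, key=overall.get)
    return best, overall


if __name__ == "__main__":
    # Category "B" wins narrowly on spectrum alone, but the second
    # feature separates the two candidates.
    best, overall = identify({"A": 0.80, "B": 0.82}, {"A": 0.90, "B": 0.40})
    print(best, overall)
```

In this example the spectrum similarities of the two categories are nearly tied, which is the ambiguous case the second feature is meant to resolve.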
9 Claims
1. A speech recognition system comprising
a frequency analyzer which performs frequency analysis of the input speech into a number of channels, performs logarithmic conversion and extracts a frequency spectrum,
a voiced interval detector which detects voiced intervals on the basis of the said frequency spectrum,
a spectrum normalizer which determines a least square fit line for the frequency spectrum and normalizes the frequency spectrum with reference to the least square fit line to provide a normalized frequency spectrum pattern,
a spectrum reference pattern memory in which spectrum reference patterns are stored in advance,
a spectrum similarity calculator which calculates the similarity between the said normalized spectrum pattern and a spectrum reference pattern for each of a plurality of recognition categories, and
an identifier which, as the result of the spectrum similarity calculation, outputs the name of the recognition category which has the highest similarity,
the speech recognition system further comprising:
(a) a second feature pattern calculator which calculates a second feature pattern, said second feature pattern calculator (a) including a spectrum variation pattern calculator which, for each frame in a voiced interval, calculates for each channel, as the second feature pattern, a spectrum variation pattern which quantifies the degree and the direction of the transition between channels of the normalized spectrum with the advance of time around the said frame, and
(b) a second feature reference pattern memory in which second feature reference patterns have previously been stored, said second feature reference pattern memory (b) including a spectrum variation reference pattern memory in which spectrum variation reference patterns, as said second feature reference patterns, have previously been stored, and
(c) a second feature similarity calculator which calculates the similarity between the said second feature pattern and the second feature reference patterns with respect to each recognition category, said second feature similarity calculator (c) including a spectrum variation similarity calculator which calculates the similarity between the said spectrum variation pattern and the spectrum variation reference patterns with respect to each of the recognition categories, and in that
(d) in the identifier, the overall similarity is calculated for each of the recognition categories by reference to both the similarity of the spectrum and the similarity of the second feature, and the category giving the largest overall similarity is output as the recognition result.
Dependent claims: 2, 3
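The spectrum normalizer recited in this claim can be illustrated with a short sketch. It fits an ordinary least-squares line to one frame of log-spectrum values, indexed by channel number, and subtracts that line. Subtraction is an assumption here: the claim says only that the spectrum is normalized "with reference to" the line. The function names `fit_line` and `normalize_spectrum` are hypothetical.

```python
def fit_line(spectrum):
    """Least-squares line through (channel index, log level) points.

    spectrum: list of log-magnitude values, one per channel (length >= 2).
    Returns (slope, intercept).
    """
    n = len(spectrum)
    mean_x = (n - 1) / 2.0
    mean_y = sum(spectrum) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(spectrum))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def normalize_spectrum(spectrum):
    """Subtract the fitted line from each channel (assumed normalization)."""
    slope, intercept = fit_line(spectrum)
    return [y - (slope * x + intercept) for x, y in enumerate(spectrum)]


if __name__ == "__main__":
    # A spectrum with a pure downward tilt normalizes to (near) zero.
    print(normalize_spectrum([10.0, 8.0, 6.0, 4.0]))
    # A peak in the middle channel survives the tilt removal.
    print(normalize_spectrum([1.0, 3.0, 1.0]))
```

Removing the fitted line discards the overall level and tilt of the frame, so what remains is the shape of the spectrum relative to that trend.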
4. A speech recognition system comprising
a frequency analyzer which performs frequency analysis of the input speech into a number of channels, performs logarithmic conversion and extracts a frequency spectrum,
a voiced interval detector which detects voiced intervals on the basis of the said frequency spectrum,
a spectrum normalizer which determines a least square fit line for the frequency spectrum and normalizes the frequency spectrum with reference to the least square fit line to provide a normalized frequency spectrum pattern,
a spectrum reference pattern memory in which spectrum reference patterns are stored in advance,
a spectrum similarity calculator which calculates the similarity between the said normalized spectrum pattern and a spectrum reference pattern for each of a plurality of recognition categories, and
an identifier which, as the result of the spectrum similarity calculation, outputs the name of the recognition category which has the highest similarity,
the speech recognition system further comprising:
(a) a second feature pattern calculator which calculates a second feature pattern, said second feature pattern calculator (a) including a level decrease pattern calculator which, for each frame in a voiced interval, determines whether the particular frame is a voiceless frame in accordance with the level of input speech for the frame with respect to the maximum value of the input speech level, and calculates, as said second feature pattern, a level decrease pattern for the voiceless frame which quantifies the level decrease relative to the maximum value of the input speech level in the particular voiceless frame,
(b) a second feature reference pattern memory in which second feature reference patterns have previously been stored, said second feature reference pattern memory (b) including a level decrease reference pattern memory in which level decrease reference patterns have previously been stored, and
(c) a second feature similarity calculator which calculates the similarity between the said second feature pattern and the second feature reference patterns with respect to each recognition category, said second feature similarity calculator (c) including a level decrease similarity calculator which calculates the similarity between the said level decrease pattern and the level decrease reference patterns with respect to each of the recognition categories, and in that
(d) in the identifier, the overall similarity is calculated for each of the recognition categories by reference to both the similarity of the spectrum and the similarity of the second feature, and the category giving the largest overall similarity is output as the recognition result.
Dependent claims: 5, 6
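The level decrease pattern in element (a) of this claim might be computed along the following lines. This is a sketch under stated assumptions: frame levels are taken to be logarithmic (for example in dB), a frame counts as voiceless when it lies at least `threshold_db` below the peak level, and other frames contribute zero. The function name, the threshold value, and the zero fill are all hypothetical; the claim does not give them.

```python
def level_decrease_pattern(frame_levels, threshold_db=20.0):
    """Per-frame level decrease relative to the maximum input level.

    frame_levels: log-domain speech level for each frame of the interval.
    threshold_db: drop below the peak at which a frame is treated as
                  voiceless (assumed value, not from the claim).
    Returns the decrease for voiceless frames and 0.0 for all others.
    """
    peak = max(frame_levels)
    pattern = []
    for level in frame_levels:
        decrease = peak - level
        # Only frames judged voiceless carry a non-zero entry.
        pattern.append(decrease if decrease >= threshold_db else 0.0)
    return pattern


if __name__ == "__main__":
    # The third frame dips 30 dB below the peak and is marked voiceless.
    print(level_decrease_pattern([60.0, 58.0, 30.0, 59.0]))
```

A dip of this kind inside a word, such as the closure before a stop consonant, is information that the normalized spectrum shape alone does not carry.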
7. A speech recognition system comprising
a frequency analyzer which performs frequency analysis of the input speech into a number of channels, performs logarithmic conversion and extracts a frequency spectrum,
a voiced interval detector which detects voiced intervals on the basis of the said frequency spectrum,
a spectrum normalizer which determines a least square fit line for the frequency spectrum and normalizes the frequency spectrum with reference to the least square fit line to provide a normalized frequency spectrum pattern,
a spectrum reference pattern memory in which spectrum reference patterns are stored in advance,
a spectrum similarity calculator which calculates the similarity between the said normalized spectrum pattern and a spectrum reference pattern for each of a plurality of recognition categories, and
an identifier which, as the result of the spectrum similarity calculation, outputs the name of the recognition category which has the highest similarity,
the speech recognition system further comprising:
(a) a second feature pattern calculator which calculates a second feature pattern, said second feature pattern calculator (a) including a spectrum relative value calculator which, for each channel, calculates as the spectrum relative value pattern, the relative value of the normalized spectrum in a voiced interval with respect to a normalized spectrum average value in each frame,
(b) a second feature reference pattern memory in which second feature reference patterns have previously been stored, said second feature reference pattern memory (b) including a spectrum relative value reference pattern memory storing spectrum relative value reference patterns, and
(c) a second feature similarity calculator which calculates the similarity between the said second feature pattern and the second feature reference patterns with respect to each recognition category, said second feature similarity calculator (c) including a spectrum relative value similarity calculator which calculates the similarity between the spectrum relative value pattern and the spectrum relative value reference patterns for each of the recognition categories, and in that
(d) in the identifier, the overall similarity is calculated for each of the recognition categories by reference to both the similarity of the spectrum and the similarity of the second feature, and the category giving the largest overall similarity is output as the recognition result.
Dependent claims: 8, 9
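One possible reading of the spectrum relative value in element (a) is sketched below. The claim does not define how the "normalized spectrum average value" is taken. If normalization subtracts a least-squares line with an intercept, the signed mean of a frame is always zero, so this sketch assumes the average of absolute values and expresses each channel as a ratio to it. Both choices, and the function name, are assumptions for illustration only.

```python
def spectrum_relative_values(normalized_frame):
    """Per-channel value relative to the frame's average magnitude.

    normalized_frame: normalized spectrum values, one per channel.
    The average used is the mean absolute value (an assumption); a frame
    that is identically zero yields an all-zero pattern.
    """
    scale = sum(abs(v) for v in normalized_frame) / len(normalized_frame)
    if scale == 0.0:
        return [0.0 for _ in normalized_frame]
    return [v / scale for v in normalized_frame]


if __name__ == "__main__":
    # Two frames with the same shape but different strength give the
    # same relative pattern.
    print(spectrum_relative_values([-1.0, 2.0, -1.0]))
    print(spectrum_relative_values([-2.0, 4.0, -2.0]))
```

Under this reading, the pattern depends on the shape of the normalized spectrum within the frame and not on how pronounced its peaks and valleys are overall.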
Specification