Method and apparatus for word speech recognition by pattern matching
Abstract
In a word speech recognition method which performs pattern matching between an unknown speech pattern and multiple reference templates and detects that one of the reference templates which provides the smallest one of the distance measures detected between the unknown speech pattern and the reference templates, when the difference d between the speech period length of the unknown speech pattern and the speech period length of a selected reference template exceeds a fixed threshold value ε1, partial patterns are extracted from the unknown speech pattern, each starting at a different position, and the minimum one of the distances obtained by pattern matching between these extracted partial patterns and the selected reference template is determined to be the distance between the selected reference template and the unknown speech pattern. When the difference d is in the range -ε2 ≦ d ≦ ε1, pattern matching is performed between the speech periods of the unknown speech pattern and the reference templates with the variation periods at both of their ends eliminated.
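The two cases in the abstract amount to a branch on the length difference d. The following is a minimal sketch of that branch only, under stated assumptions: the function name `choose_matching_mode` and the returned mode strings are hypothetical, d is taken as the unknown pattern length minus the template length (the sign convention is implied by the abstract but not spelled out), and the case d < -ε2 is marked as unspecified because the abstract does not describe it.

```python
def choose_matching_mode(unknown_len, template_len, eps1, eps2):
    """Pick a matching strategy from the period-length difference d.

    Lengths are in frames. eps1 and eps2 are the thresholds named in the
    abstract; the mode names returned here are hypothetical labels.
    """
    # Assumed sign convention: positive d means the unknown pattern is longer.
    d = unknown_len - template_len
    if d > eps1:
        # Unknown pattern is much longer: match partial patterns cut from it.
        return "partial_patterns"
    if -eps2 <= d <= eps1:
        # Comparable lengths: match with the variation periods at both ends removed.
        return "trimmed_ends"
    # d < -eps2 is not described in the abstract.
    return "unspecified"


print(choose_matching_mode(80, 50, eps1=10, eps2=10))
print(choose_matching_mode(52, 50, eps1=10, eps2=10))
```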
16 Claims
1. A word speech recognition method which performs pattern matching between an unknown speech pattern and multiple reference templates and detects that one of said multiple reference templates which corresponds to the smallest one of distance measures between said unknown speech pattern and said multiple reference templates, said method comprising the steps of:
(a) analyzing an unknown input digital speech signal for each frame and extracting therefrom a sequence of spectral parameters;
(b) detecting start and end points of the speech period of said input digital speech signal and obtaining said sequence of spectral parameters of said input digital speech signal for said speech period as said unknown speech pattern;
(c) selecting one of said multiple reference templates;
(d) calculating a difference d between the period length of said unknown speech pattern and the period length of said selected reference template;
(e) comparing said difference d with a predetermined threshold length ε1, where said ε1 is a positive value;
(e-1) when said difference d exceeds the threshold length ε1, extracting from said unknown speech pattern its partial patterns of about the same length as the period length of said selected reference template, each starting at a different position in said unknown speech pattern; and
(e-2) performing pattern matching between said partial patterns and said selected reference template to detect the distances between them;
(f) determining the smallest one of said detected distances to be the distance between said unknown speech pattern and said selected reference template; and
(g) repeating said steps (c) to (f) for each of said multiple reference templates and outputting, as the result of recognition of said input digital speech signal, the label name of said reference template which provides the smallest one of the distances between said unknown speech pattern and all of said reference templates.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
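Steps (c) to (g) of claim 1 can be illustrated with a short sketch. It rests on several assumptions that are not in the claim: the matcher `linear_match` (mean Euclidean frame distance with linear time alignment) merely stands in for whatever pattern matching is actually used, partial patterns are cut at exactly the template length with a start step of one frame, and when d does not exceed ε1 the whole pattern is matched (claim 1 does not recite that branch). All function names are hypothetical, and patterns are assumed to be non-empty lists of feature vectors.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two spectral parameter vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def linear_match(pattern, template):
    """Mean frame distance under linear time alignment (assumed matcher)."""
    n, m = len(template), len(pattern)
    total = 0.0
    for i in range(n):
        j = min(m - 1, i * m // n)  # map template frame i onto the pattern
        total += frame_distance(pattern[j], template[i])
    return total / n


def template_distance(unknown, template, eps1):
    """Steps (d) to (f): distance between the unknown pattern and one template."""
    d = len(unknown) - len(template)  # step (d)
    if d > eps1:  # steps (e) and (e-1)
        length = len(template)
        # One partial pattern per start position (one-frame step is an assumption).
        partials = [unknown[s:s + length] for s in range(d + 1)]
    else:
        # Assumption: claim 1 does not recite this branch; match the whole pattern.
        partials = [unknown]
    # Steps (e-2) and (f): match every partial pattern, keep the smallest distance.
    return min(linear_match(p, template) for p in partials)


def recognize(unknown, templates, eps1):
    """Step (g): label of the template with the smallest distance."""
    return min(templates, key=lambda label: template_distance(unknown, templates[label], eps1))


# Toy data: one-dimensional "spectral" frames, with extra frames around the word.
templates = {"yes": [[1.0], [2.0], [3.0]], "no": [[5.0], [5.0], [5.0]]}
unknown = [[0.0]] * 4 + [[1.0], [2.0], [3.0]] + [[0.0]] * 2
print(recognize(unknown, templates, eps1=2))
```

In the toy data the unknown pattern is six frames longer than each template, so the partial-pattern branch runs and the window starting at frame 4 matches the "yes" template exactly.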

12. A word speech recognizer which performs pattern matching between an unknown speech pattern and multiple reference templates and detects that one of said multiple reference templates which corresponds to the smallest one of distance measures between said unknown speech pattern and said multiple reference templates, said recognizer comprising:
input means for inputting a digital speech signal;
speech spectral parameter extracting means for analyzing said digital speech signal for each frame and for extracting therefrom a sequence of speech spectral parameters;
speech endpoint detecting means for detecting speech endpoints of the speech period of said digital speech signal on the basis of said sequence of speech spectral parameters outputted from said speech spectral parameter extracting means;
unknown speech pattern register means for determining start and end points of the speech period of said unknown speech pattern on the basis of said detected speech endpoints and for storing a sequence of spectral parameters of said speech period as said unknown speech pattern;
reference template storage means for prestoring multiple reference templates for speech recognition;
period length comparing means for comparing the speech period length of each of said stored multiple reference templates and the speech period length of said unknown speech pattern stored in said unknown speech pattern register means;
input pattern extracting means for extracting partial patterns from said unknown speech pattern stored in said unknown speech pattern register means, each starting at a different position, on the basis of the comparison result from said period length comparing means and the output result from said unknown speech pattern register means;
pattern matching means for performing pattern matching between each of said multiple partial patterns and said each reference template and for outputting multiple distance measures calculated between them;
distance comparing means for comparing said multiple distance measures from said pattern matching and for outputting the smallest distance measure as the distance measure between said unknown speech pattern and said each reference template; and
result output means for outputting the label name of said reference template which provides the distance measure decided to be the smallest among those between all of said multiple reference templates and said unknown speech pattern.

13. A word speech recognition method which performs pattern matching between an unknown speech pattern and multiple reference templates and detects that one of said multiple reference templates which corresponds to the smallest one of distance measures between said unknown speech pattern and said multiple reference templates, said method comprising the steps of:
(a) analyzing an unknown input digital speech signal for each frame and for extracting therefrom a sequence of spectral parameters;
(b) detecting start and end points of the speech period of said input digital speech signal and obtaining said sequence of spectral parameters of said input digital speech signal for said speech period as said unknown speech pattern;
(c) selecting one of said multiple reference templates;
(d) performing pattern matching between said unknown speech pattern and said selected reference template over their entire lengths to obtain a first distance between them;
(e) extracting a reference template partial period from said selected template, except its start and end segments;
(f) extracting a speech pattern partial period from said unknown speech pattern, except its start and end segments;
(g) performing pattern matching between said reference template partial period and said speech pattern partial period to obtain a second distance between said unknown speech pattern and said selected reference template;
(h) comparing said first and second distances and deciding the smaller one of them to be the distance between said unknown speech pattern and said selected reference template; and
(i) repeating said steps (c) to (h) for each of said multiple reference templates and outputting, as the result of recognition of said input digital speech signal, the label name of said reference template which provides the smallest one of the distances between said unknown speech pattern and all of said multiple reference templates.
Dependent claims: 14, 15
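Steps (d) to (h) of claim 13 compare a whole-pattern match with a match over the interior of both patterns and keep the smaller distance. A minimal sketch follows, with these assumptions: the start and end segments are a fixed number of frames `trim` (the claim does not say how their length is chosen), the matcher `mean_abs_match` is a stand-in rather than the matching method of the patent, and all names are hypothetical.

```python
def mean_abs_match(pattern, template):
    """Mean absolute frame difference under linear time alignment (assumed matcher)."""
    n, m = len(template), len(pattern)
    total = 0.0
    for i in range(n):
        j = min(m - 1, i * m // n)  # map template frame i onto the pattern
        total += sum(abs(x - y) for x, y in zip(pattern[j], template[i]))
    return total / n


def trimmed(pattern, trim):
    """Steps (e)/(f): drop `trim` frames at the start and at the end."""
    if 2 * trim >= len(pattern):
        raise ValueError("trim removes the whole pattern")
    return pattern[trim:len(pattern) - trim]


def two_pass_distance(unknown, template, trim):
    """Steps (d), (g), (h): smaller of the full and the trimmed distance."""
    first = mean_abs_match(unknown, template)  # step (d): entire lengths
    second = mean_abs_match(trimmed(unknown, trim), trimmed(template, trim))  # step (g)
    return min(first, second)  # step (h)


# Toy data: the interior agrees, the first and last frames differ.
template = [[1.0], [4.0], [5.0], [6.0], [1.0]]
unknown = [[9.0], [4.0], [5.0], [6.0], [9.0]]
print(mean_abs_match(unknown, template))
print(two_pass_distance(unknown, template, trim=1))
```

In the toy data the whole-pattern distance is dominated by the two end frames, while the trimmed match is exact, so the second distance is the one selected.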

16. A word speech recognizer which performs pattern matching between an unknown speech pattern and multiple reference templates and detects that one of said multiple reference templates which corresponds to the smallest one of distance measures between said unknown speech pattern and said multiple reference templates, said recognizer comprising:
input means for inputting a digital speech signal;
speech spectral parameter extracting means for analyzing said digital speech signal for each frame and for extracting therefrom sequence of speech spectral parameters;
speech period detecting means for detecting the speech period of said unknown speech pattern as a first speech period on the basis of said sequence of speech spectral parameters outputted from said speech spectral parameter extracting means and for determining both ends of said first speech period as first speech endpoints;
unknown speech pattern register means for storing a sequence of spectral parameters of said first speech period as said unknown speech pattern;
unknown pattern partial period determining means for determining second speech endpoints that define a second speech period, by eliminating start and end segments from said first speech period detected by said speech period detecting means;
reference template storage means for prestoring multiple reference templates for speech recognition, together with information about first speech endpoints defining their speech periods as first speech periods;
reference template partial period determining means for determining second endpoints that define a second speech period, by eliminating start and end segments from said first speech period of each of said multiple reference templates selected from said reference template storage means;
switching means for selecting said first and second endpoints of said unknown speech pattern and said each selected reference pattern from said speech period detecting means and said reference template pattern storage means, thereby selecting said first and second speech periods of said unknown speech pattern from said unknown pattern register means and said each selected reference template from said reference template storage means;
pattern matching means for performing pattern matching between said first speech periods of said unknown speech pattern and said each selected reference template selected by said switching means to obtain a first distance and for performing pattern matching between said second speech periods of said unknown speech pattern and said each selected reference template selected by said switching means to obtain a second distance;
distance comparing means for comparing said first and second distances to determine the smaller one of them to be the distance measure between said unknown speech pattern and said each selected reference template; and
result output means for comparing all the distance measures outputted from said distance comparing means as the results of matching of said unknown speech pattern with said multiple reference templates, for determining that one of said multiple reference templates which is decided to provide the smallest distance measure, and for outputting the label name of said determined reference template.
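
The claims leave the algorithm inside the pattern matching means open. Dynamic time warping (DTW) is one common choice in template-based word recognition, and the sketch below uses it purely as an assumption; nothing in the claims above confirms that DTW is the matching method of this patent. The function name `dtw_distance` and the normalization by the summed lengths are likewise illustrative choices.

```python
import math


def dtw_distance(a, b):
    """Length-normalized DTW distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: cheapest accumulated distance aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m] / (n + m)


# Toy data: `a` is `b` with one frame repeated, so warping aligns them exactly.
a = [[0.0], [1.0], [1.0], [2.0]]
b = [[0.0], [1.0], [2.0]]
c = [[5.0], [5.0], [5.0]]
print(dtw_distance(a, b))
print(dtw_distance(a, c))
```

A function of this shape could serve as both the first-distance and second-distance matcher, with the distance comparing means then taking the smaller of the two results.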
Specification