Speech recognition method and apparatus thereof
First Claim
1. A method of speech recognition comprising the steps of:
segmenting an input speech signal;
deriving a plurality of time-sequential acoustic parameters from the segmented speech signal having characteristics time-normalized at every segment;
forming a first trajectory using a plurality of dots, each dot corresponding to one of said plurality of time-sequential, time-normalized acoustic parameters, to form a time-normalized trajectory of said segmented speech signal;
providing a plurality of registered trajectories representing known speech segments, each of said registered trajectories being represented by a plurality of data;
matching a formed trajectory with one of said registered trajectories; and
producing an indication of the results of the matching; and
characterized by the further steps of:
sampling said formed first trajectory at a predetermined length therealong, thereby producing new data representing said formed trajectory to be matched with said registered trajectories;
determining a length of said formed trajectory;
determining lengths of said registered trajectories;
comparing the length of said formed trajectory with the registered trajectory lengths; and
using the comparison results in said step of matching.
Abstract
A speech recognition method and apparatus in which a plurality of time-sequential acoustic parameters are derived from a segmented input speech signal and used to form a first, time-normalized trajectory. This trajectory is sampled at predetermined lengths therealong, and the sampling results are used to form a new time-normalized trajectory with equally spaced points ("dots" on the graph). The new trajectory is compared with a plurality of previously registered trajectories until a match is found with one of them, at which time an indication of the results of the matching is produced. A silence acoustic parameter can also be added to the time-sequential acoustic parameters so that the time-normalized trajectory starts and ends at a point of silence. In addition, the length of the first trajectory formed from the acoustic parameters is determined and used in the matching operation, so that the lengths of the trajectories, as well as the distances between the formed trajectory and the registered trajectories, are taken into account to increase the recognition ratio.
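The equal-length sampling described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`trajectory_length`, `resample_equal_length`, `add_silence`), the use of straight-line interpolation between dots, and the representation of each dot as a tuple of acoustic parameters are all assumptions made for illustration.

```python
import math


def trajectory_length(points):
    """Length of the polyline through `points` (sum of dot-to-dot distances)."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def add_silence(points, silence):
    """Prepend and append a hypothetical silence point so the path starts and ends there."""
    return [tuple(silence)] + [tuple(p) for p in points] + [tuple(silence)]


def resample_equal_length(points, n_samples):
    """Return `n_samples` points equally spaced along the polyline through `points`.

    Assumption: dots are joined by straight segments and new points are
    obtained by linear interpolation along those segments.
    """
    if len(points) < 2 or n_samples < 2:
        raise ValueError("need at least two points and two samples")

    # Cumulative arc length at each original dot.
    cum = [0.0]
    for a, b in zip(points, points[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    total = cum[-1]

    out = []
    seg = 0  # index of the segment that contains the current target
    for k in range(n_samples):
        target = total * k / (n_samples - 1)
        # Advance to the segment whose end is at or beyond the target length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (target - cum[seg]) / span
        a, b = points[seg], points[seg + 1]
        out.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return out
```

For example, resampling the path through (0, 0), (1, 0), (1, 3) with five samples yields points one unit apart along the path, regardless of how unevenly the original dots were spaced in time. That independence from speaking rate is the sense in which the resampled trajectory is time-normalized.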
22 Claims
1. A method of speech recognition comprising the steps of:
segmenting an input speech signal;
deriving a plurality of time-sequential acoustic parameters from the segmented speech signal having characteristics time-normalized at every segment;
forming a first trajectory using a plurality of dots, each dot corresponding to one of said plurality of time-sequential, time-normalized acoustic parameters, to form a time-normalized trajectory of said segmented speech signal;
providing a plurality of registered trajectories representing known speech segments, each of said registered trajectories being represented by a plurality of data;
matching a formed trajectory with one of said registered trajectories; and
producing an indication of the results of the matching; and
characterized by the further steps of:
sampling said formed first trajectory at a predetermined length therealong, thereby producing new data representing said formed trajectory to be matched with said registered trajectories;
determining a length of said formed trajectory;
determining lengths of said registered trajectories;
comparing the length of said formed trajectory with the registered trajectory lengths; and
using the comparison results in said step of matching.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. Apparatus for speech recognition, comprising:
means for segmenting an input speech signal;
means for producing a plurality of time-sequential acoustic parameters from the segmented speech signal, said parameters lying along a trajectory;
means for producing a plurality of speech recognition parameters from said plurality of time-sequential acoustic parameters;
means for registering a plurality of speech recognition parameters having respective known speech segment correlations;
means for matching said calculated time-sequential speech recognition parameters with previously registered speech recognition parameters;
output means for producing an output representing the results of the matching processing in said means for matching,
wherein said means for producing a plurality of speech recognition parameters comprises:
first calculating means receiving said time-sequential acoustic parameter produced at a first sample time and respective acoustic parameters produced at adjacent sampling times;
second calculating means receiving said calculated intervals for calculating said speech recognition parameters by sampling at a series of said calculated intervals for a predetermined length over the trajectory; and
further comprising:
means for correcting a matching distance between said time-sequential speech recognition parameter produced from said segmented speech signal and said previously registered speech recognition parameter during operation of said means for matching based on the difference between a series of intervals of said plurality of time-sequential acoustic parameters produced from said segmented speech signal and a series of intervals of said previously registered speech recognition parameters during operation of said means for matching.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22
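The claims recite using trajectory lengths in the matching step (claim 1) and correcting the matching distance by a difference in trajectory intervals (claim 13), but they do not fix a formula. The following Python sketch shows one possible reading only. The additive penalty, the `weight` parameter, and the names `corrected_distance` and `best_match` are assumptions introduced for illustration and are not taken from the patent.

```python
import math


def corrected_distance(formed, registered, formed_len, registered_len, weight=1.0):
    """Point-to-point distance plus a penalty for differing trajectory lengths.

    `formed` and `registered` are resampled trajectories with the same number
    of points. `formed_len` and `registered_len` are the trajectory lengths
    measured before resampling. The additive form and `weight` are assumed.
    """
    if len(formed) != len(registered):
        raise ValueError("trajectories must have the same number of points")
    point_term = sum(math.dist(p, q) for p, q in zip(formed, registered))
    length_term = abs(formed_len - registered_len)
    return point_term + weight * length_term


def best_match(formed, formed_len, templates, weight=1.0):
    """Return the label of the registered trajectory with the smallest corrected distance.

    `templates` maps a label to a pair (resampled points, trajectory length).
    """
    return min(
        templates,
        key=lambda label: corrected_distance(
            formed, templates[label][0], formed_len, templates[label][1], weight
        ),
    )
```

With `weight=0` the sketch reduces to plain point-to-point matching, so two registered trajectories that look alike after resampling but differ greatly in original length cannot be told apart. A positive weight lets the length comparison break that tie, which is the effect the abstract attributes to taking lengths into account.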
Specification