Speech recognizer
First Claim
1. In a speech recognizer which operates to match an unknown speech segment comprising a fine sequence of frames with model segments represented by respective fine sequences of states, the respective fine sequences of the unknown segment and a model segment together defining a fine matrix;
- a method of determining a good alignment of the unknown segment with a model segment, said method comprising;
preparing respective coarse sequences representing said unknown speech segment and said model segment thereby to obtain a respective coarse matrix;
determining a best alignment of said coarse sequences thereby determining a coarse path through said coarse matrix;
overlaying said coarse path on said fine matrix and determining which fine matrix locations lie within a preselected metric of said coarse path thereby defining a corridor of possible paths through said fine matrix; and
calculating only transitions within said corridor, determining an alignment of said unknown segment with said model segment.
Abstract
In the speech recognizer disclosed herein, alignment of an unknown speech segment, represented by a finely graduated sequence of frames, with a model segment represented by a sequence of states is performed by first preparing respective coarse sequences representing the unknown and model segments thereby to define a coarse matrix representing possible alignments. The fine sequences correspondingly define a fine matrix. A best alignment of the coarse sequences is determined thereby to define a coarse path through the coarse matrix. The coarse path is overlaid on the fine matrix and a corridor is defined which includes fine matrix locations which lie within a preselected metric of the coarse path. Only transitions within the corridor are calculated in determining the fine alignment of the unknown speech segment with the model segment, thereby significantly reducing the number of computations required.
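The procedure summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes one-dimensional frames and states, an absolute-difference local distance, coarse sequences formed by keeping every `step`-th element, three allowed transitions (advance one frame, one state, or both), and a corridor made of the fine cells within `margin` locations of each overlaid coarse cell. The names `align`, `corridor`, `coarse_to_fine`, `step` and `margin` are hypothetical.

```python
def align(a, b, allowed=None):
    """Dynamic-programming alignment of sequences a and b.

    If `allowed` is given, only matrix cells in that set are evaluated.
    Returns (total cost, path, number of cells evaluated).
    """
    n, m = len(a), len(b)
    cost, back = {}, {}
    for i in range(n):
        for j in range(m):
            if allowed is not None and (i, j) not in allowed:
                continue  # outside the corridor: never calculated
            local = abs(a[i] - b[j])  # assumed local distance
            if (i, j) == (0, 0):
                cost[(i, j)] = local
                continue
            # Assumed transitions: advance in i, in j, or in both.
            preds = [p for p in ((i - 1, j), (i, j - 1), (i - 1, j - 1))
                     if p in cost]
            if not preds:
                continue  # not reachable from the start cell
            best = min(preds, key=cost.get)
            cost[(i, j)] = cost[best] + local
            back[(i, j)] = best
    # Trace the best path back from the final corner to the start.
    path = [(n - 1, m - 1)]
    while path[-1] != (0, 0):
        path.append(back[path[-1]])
    path.reverse()
    return cost[(n - 1, m - 1)], path, len(cost)


def corridor(coarse_path, step, n, m, margin):
    """Fine cells within `margin` locations of the overlaid coarse path."""
    cells = set()
    for ci, cj in coarse_path:
        # Each coarse cell overlays a step-by-step block of fine cells.
        for i in range(max(0, ci * step - margin),
                       min(n, (ci + 1) * step + margin)):
            for j in range(max(0, cj * step - margin),
                           min(m, (cj + 1) * step + margin)):
                cells.add((i, j))
    return cells


def coarse_to_fine(unknown, model, step=4, margin=2):
    """Coarse alignment first, then fine alignment inside the corridor."""
    _, coarse_path, _ = align(unknown[::step], model[::step])
    band = corridor(coarse_path, step, len(unknown), len(model), margin)
    return align(unknown, model, allowed=band)
```

Calling `coarse_to_fine(unknown, model)` returns the corridor-restricted cost and path together with the number of fine cells actually evaluated, which can be compared with the count returned by an unrestricted `align(unknown, model)`. Because the corridor only removes candidate paths, the restricted cost can never be lower than the unrestricted one.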
10 Claims
1. In a speech recognizer which operates to match an unknown speech segment comprising a fine sequence of frames with model segments represented by respective fine sequences of states, the respective fine sequences of the unknown segment and a model segment together defining a fine matrix;
- a method of determining a good alignment of the unknown segment with a model segment, said method comprising;
preparing respective coarse sequences representing said unknown speech segment and said model segment thereby to obtain a respective coarse matrix;
determining a best alignment of said coarse sequences thereby determining a coarse path through said coarse matrix;
overlaying said coarse path on said fine matrix and determining which fine matrix locations lie within a preselected metric of said coarse path thereby defining a corridor of possible paths through said fine matrix; and
calculating only transitions within said corridor, determining an alignment of said unknown segment with said model segment.
Dependent claims: 2, 3, 4, 5, 6
7. In a speech recognizer which operates to match an unknown speech segment comprising a fine sequence of multi-dimensional spectral frames with model segments also represented by respective fine sequences of states, the respective fine sequences of the unknown segment and a model segment together defining a fine matrix which encompasses all possible alignments of the unknown segment fine sequence with the respective model segment fine sequence;
- a method of determining a good alignment of the unknown segment fine sequence with a model segment fine sequence, said method comprising;
preparing respective coarse sequences representing said unknown speech segment and said model segment thereby to obtain a coarse matrix representing possible alignments of said unknown and model segments;
determining a best alignment of said coarse sequences thereby determining a coarse path through said coarse matrix;
overlaying said coarse path on said fine matrix and interpolating between the locations defining said coarse path thereby to determine a base path across said fine matrix;
determining which fine matrix locations lie within a preselected metric of said coarse path thereby defining a corridor comprising a limited number of possible paths through said fine matrix; and
calculating only transitions within said corridor, determining a best alignment of said unknown segment fine sequence with said model segment fine sequence.
Dependent claims: 8, 9
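The interpolation step recited in claim 7 can be illustrated as follows. This is only a sketch under stated assumptions: the claim does not say how to interpolate, so linear interpolation rounded to the nearest fine location is assumed, coarse locations are assumed to map to fine coordinates by multiplying by a subsampling factor `step`, and the function name `base_path` is hypothetical.

```python
def base_path(coarse_path, step, n, m):
    """Interpolate a coarse path onto an n-by-m fine matrix.

    coarse_path: monotone list of (i, j) coarse-matrix locations.
    step: assumed ratio between fine and coarse resolution.
    Returns a connected list of fine-matrix locations.
    """
    # Overlay: scale each coarse location up to fine coordinates.
    anchors = [(min(i * step, n - 1), min(j * step, m - 1))
               for i, j in coarse_path]
    if anchors[-1] != (n - 1, m - 1):
        anchors.append((n - 1, m - 1))  # make the path reach the last corner
    path = [anchors[0]]
    for (i0, j0), (i1, j1) in zip(anchors, anchors[1:]):
        span = max(i1 - i0, j1 - j0)
        for t in range(1, span + 1):
            # Integer linear interpolation, rounded to the nearest location.
            i = i0 + (2 * t * (i1 - i0) + span) // (2 * span)
            j = j0 + (2 * t * (j1 - j0) + span) // (2 * span)
            path.append((i, j))
    return path
```

For example, `base_path([(0, 0), (1, 1), (2, 1)], 4, 12, 8)` runs diagonally from (0, 0) to (4, 4), then along the first axis to (8, 4), then diagonally to the final corner (11, 7). Each step advances by at most one location along each axis, so a corridor built around this base path stays connected.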
10. In a speech recognizer which operates to match an unknown speech segment comprising a fine sequence of frames with model segments represented by respective fine sequences of states, the respective fine sequences of the unknown segment and a model segment together defining a fine matrix which encompasses all possible alignments of the unknown segment with the respective model segment;
- a method of determining a good alignment of the unknown segment with a model segment, said method comprising;
subsampling the respective fine sequences thereby to obtain respective coarse sequences representing said unknown speech segment and said model segment and thereby to define a coarse matrix representing possible alignments of said unknown and model segments;
determining a best alignment of said coarse sequences thereby determining a coarse path through said coarse matrix;
overlaying said coarse path on said fine matrix and interpolating between the locations defining said coarse path thereby to determine a base path across said fine matrix;
determining which fine matrix locations lie within a preselected number of locations of said base path thereby defining a corridor of possible paths through said fine matrix; and
calculating only transitions within said corridor, determining a best fine alignment of said unknown segment with said model segment.
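A rough operation count suggests why the corridor of claim 10 reduces computation. The symbols and figures below are illustrative assumptions, not values from the patent: N frames, M states, a subsampling factor k, and a corridor extending w locations to either side of the base path, measured along one axis of the fine matrix. The base path has length L, which lies between max(N, M) and N + M - 1 when each step advances by at most one location per axis.

```latex
\begin{align*}
C_{\text{full}} &= N M \\
C_{\text{coarse}} &= \left\lceil \tfrac{N}{k} \right\rceil \left\lceil \tfrac{M}{k} \right\rceil \\
C_{\text{corridor}} &\le (2w + 1)\, L, \qquad \max(N, M) \le L \le N + M - 1 \\[4pt]
\text{Example: } & N = M = 100,\ k = 4,\ w = 3 \\
C_{\text{full}} &= 10000 \\
C_{\text{coarse}} + C_{\text{corridor}} &\le 25 \cdot 25 + 7 \cdot 199 = 625 + 1393 = 2018
\end{align*}
```

Under these assumed figures, the coarse pass and the corridor pass together evaluate at most about one fifth of the fine matrix locations.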
Specification