Template generation method in a speech recognition system
Abstract
Disclosed is a method for generating word templates for a speech recognition system. It is used where speech is represented by data in frames of equal time intervals. The method includes generating an interim template, generating a time alignment path between the interim template and a token, mapping frames from the interim template and the token along the time alignment path onto an averaged time axis, and combining data associated with the mapped frames to produce composite frames representative of the final word template. The method realizes advantages of reduced memory usage and a realistic data average from each contributing averaged word.
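The steps named in the abstract can be illustrated with a short sketch. This is not the patented implementation: it assumes a plain dynamic time warping search for the time alignment path, a squared Euclidean distance between frames, equal weight for the interim template and the token, and an averaged time axis on which path point (i, j) lands at position (i + j) / 2. The names `alignment_path` and `average_on_common_axis` are hypothetical.

```python
from typing import List, Tuple

Frame = List[float]


def frame_distance(a: Frame, b: Frame) -> float:
    """Squared Euclidean distance between two frames (an assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alignment_path(template: List[Frame], token: List[Frame]) -> List[Tuple[int, int]]:
    """Return a DTW path of (template_index, token_index) pairs."""
    n, m = len(template), len(token)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = frame_distance(template[i], token[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            best = min(
                cost[i - 1][j] if i > 0 else inf,
                cost[i][j - 1] if j > 0 else inf,
                cost[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            cost[i][j] = d + best
    # Backtrack from the last frame pair to the first.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return path


def average_on_common_axis(
    template: List[Frame], token: List[Frame], path: List[Tuple[int, int]]
) -> List[Frame]:
    """Map each aligned pair onto the averaged axis and average per output frame."""
    n_out = (len(template) + len(token) - 2) // 2 + 1
    dim = len(template[0])
    sums = [[0.0] * dim for _ in range(n_out)]
    counts = [0] * n_out
    for i, j in path:
        k = (i + j) // 2  # output frame holding position (i + j) / 2
        for d in range(dim):
            sums[k][d] += 0.5 * (template[i][d] + token[j][d])
        counts[k] += 1
    return [[s / c for s in row] for row, c in zip(sums, counts)]


if __name__ == "__main__":
    interim = [[0.0], [2.0]]
    new_token = [[0.0], [0.0], [2.0], [2.0]]
    path = alignment_path(interim, new_token)
    print(average_on_common_axis(interim, new_token, path))
```

Under these assumptions, the composite template has a length midway between the two inputs, so neither the interim template nor the new token dictates the time scale of the result.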
29 Claims
1. In a speech recognition system, wherein speech is represented by data in frames of equal time intervals, a method for generating a final word template from a plurality of tokens, comprising the steps of:
(a) forming an interim template representative of at least one token;
(b) generating a time alignment path between said interim template and an additional token;
(c) mapping frames from said interim template and said additional token along said time alignment path onto an averaged time axis; and
(d) combining data associated with said mapped frames to produce composite frames representative of the final word template.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9

10. In a speech recognition system, wherein speech is represented by data in frames of equal time intervals, a method for generating a final word template from a plurality of tokens, comprising the steps of:
(a) forming an interim template representative of at least one token;
(b) generating a time alignment path between said interim template and an additional token;
(c) weighting the data representing said interim template proportional to the number of tokens said interim template represents; and
(d) combining frames from said interim template with frames from said additional token, dependent upon the number of tokens which the interim template represents, to produce output frames representative of the final word template.
Dependent claims: 11, 12, 13, 14, 15, 16, 17

18. In a speech recognition system, wherein speech is represented by data in frames of equal time intervals, an arrangement for generating a final word template from a plurality of tokens, including:
(a) means for forming an interim template representative of at least one token;
(b) means for generating a time alignment path between said interim template and an additional token;
(c) means for mapping frames from said interim template and said additional token along said time alignment path onto an averaged time axis; and
(d) means for combining data associated with said mapped frames to produce composite frames representative of a final word template.
Dependent claims: 19, 20, 21, 22, 23

24. In a speech recognition system, wherein speech is represented by data in frames of equal time intervals, an arrangement for generating a final word template from a plurality of tokens, including:
(a) means for forming an interim template representative of at least one token;
(b) means for generating a time alignment path between said interim template and an additional token;
(c) means for weighting the data representing said interim template proportional to the number of tokens said interim template represents; and
(d) means for combining frames from said interim template with frames from said additional token, dependent upon the number of tokens which the interim template represents, to produce output frames representative of the final word template.
Dependent claims: 25, 26, 27, 28, 29

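Claims 10 and 24 weight the interim template in proportion to the number of tokens it already represents. One way to read that, offered here only as an illustrative assumption, is a running average: an interim frame standing for N tokens is combined with a new token frame as (N * interim + token) / (N + 1). The sketch below takes the alignment path as given, and the function names are hypothetical.

```python
from typing import List, Tuple

Frame = List[float]


def combine_weighted(interim_frame: Frame, token_frame: Frame, tokens_in_interim: int) -> Frame:
    """Weight the interim frame by the number of tokens it represents (assumed rule)."""
    n = tokens_in_interim
    return [(n * a + b) / (n + 1) for a, b in zip(interim_frame, token_frame)]


def combine_along_path(
    interim: List[Frame],
    token: List[Frame],
    path: List[Tuple[int, int]],
    tokens_in_interim: int,
) -> List[Frame]:
    """Produce one output frame per aligned (interim_index, token_index) pair."""
    return [combine_weighted(interim[i], token[j], tokens_in_interim) for i, j in path]


if __name__ == "__main__":
    # Fold three single-frame tokens into a template one at a time.
    frame = [3.0]                               # interim template built from 1 token
    frame = combine_weighted(frame, [6.0], 1)   # now represents 2 tokens
    frame = combine_weighted(frame, [9.0], 2)   # now represents 3 tokens
    print(frame)
```

With this rule, each token ends up contributing equally to the final template no matter the order in which tokens are folded in, and only the interim template and its token count need to be stored between updates.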
Specification