Method and system for continuous speech recognition using voting techniques
Abstract
In a speech-recognition system having a plurality of classifiers, a voting window includes a sequence of outputs from each of the classifiers. For each classifier, a voting sum is generated corresponding to the voting window. A spoken sound is identified by determining which classifier corresponds to the greatest voting sum.
80 Citations
21 Claims
1. In a speech-recognition system having a plurality of classifiers, a method of identifying a spoken sound, comprising the following steps:
(a) receiving a plurality of classifier output signals from the classifiers corresponding to an interval, each of the classifier output signals having been generated according to a polynomial discriminant function;
(b) ranking by magnitude the classifier output signals to produce a rank-order of classifier output signals corresponding to the interval;
(c) weighting each position in the rank-order to transform the classifier output signals into a plurality of weighted values;
(d) repeating steps (a)-(c) for a plurality of intervals, whereby generating a plurality of weighted value sequences, each of the weighted value sequences corresponding to a respective one of the plurality of classifiers;
(e) summing each of the weighted value sequences to generate a voting sum for each classifier; and
(f) identifying the spoken sound by selecting the voting sum having a largest magnitude.
Dependent claims: 2, 3, 4, 5, 6, 7
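The ranking, weighting, summing, and selection steps of claim 1 can be illustrated with a minimal sketch. The claim does not specify the weight assigned to each rank position, so the `position_weights` values below are an assumption, as are the function names `rank_weights` and `vote` and the reading of "magnitude" as the raw output value (larger means a better match).

```python
from typing import List, Sequence, Tuple


def rank_weights(outputs: Sequence[float],
                 position_weights: Sequence[float]) -> List[float]:
    """Rank one interval's classifier outputs and return, for each
    classifier, the weight attached to its position in the rank-order."""
    # Classifier indices sorted from largest output to smallest.
    order = sorted(range(len(outputs)), key=lambda i: outputs[i], reverse=True)
    weighted = [0.0] * len(outputs)
    for position, classifier in enumerate(order):
        # Positions beyond the weight table contribute nothing (assumption).
        if position < len(position_weights):
            weighted[classifier] = position_weights[position]
    return weighted


def vote(window: Sequence[Sequence[float]],
         labels: Sequence[str],
         position_weights: Sequence[float] = (3.0, 2.0, 1.0)
         ) -> Tuple[str, List[float]]:
    """Sum the per-interval weighted values for each classifier over the
    window and return the label of the classifier with the largest sum."""
    sums = [0.0] * len(labels)
    for outputs in window:  # one row of classifier outputs per interval
        for classifier, w in enumerate(rank_weights(outputs, position_weights)):
            sums[classifier] += w
    best = max(range(len(labels)), key=lambda c: sums[c])
    return labels[best], sums


# Three intervals, three classifiers (hypothetical output values).
window = [[0.9, 0.2, 0.4],
          [0.1, 0.8, 0.7],
          [0.6, 0.3, 0.5]]
winner, voting_sums = vote(window, ["a", "b", "c"])
```

In this example the first classifier ranks first in two of the three intervals, so it collects the largest voting sum even though it ranks last in the second interval.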
8. A method for recognizing a spoken sound from continuous speech, comprising the following steps:
(a) receiving the continuous speech;
(b) sampling the continuous speech, over time, to form a sequence of sample datum which represents the continuous speech;
(c) partitioning the sequence of sample datum into a sequence of data frames, each of the sequence of data frames includes at least two of the sequence of sample datum;
(d) extracting a plurality of features from the sequence of data frames;
(f) forming a sequence of feature frames from the plurality of features;
(g) distributing one of the sequence of feature frames to a plurality of classifiers, each of the classifiers generating a classifier output signal in response thereto according to a polynomial discriminant function, whereby producing a plurality of classifier output signals;
(h) ranking by magnitude the classifier output signals to produce a rank-order of classifier output signals corresponding to the distributed feature frame;
(i) weighting each position in the rank-order to transform the classifier output signals into a plurality of weighted values;
(j) repeating steps (g)-(i) for each feature frame included in the sequence of feature frames, whereby generating a plurality of weighted value sequences, each of the weighted value sequences corresponding to a respective one of the classifiers;
(k) summing each of the weighted value sequences to generate a voting sum for each classifier; and
(l) identifying the spoken sound by selecting the voting sum having a largest magnitude.
Dependent claims: 9, 10, 11, 12, 13, 14
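The front end of claim 8 (partitioning samples into data frames, extracting features, and evaluating a polynomial discriminant function per classifier) might look roughly like the sketch below. The claim names neither the features nor the polynomial's terms, so the mean and energy features, the non-overlapping frames, and the coefficient and exponent representation are all assumptions chosen for illustration; the function names are hypothetical.

```python
from math import prod
from typing import List, Sequence, Tuple

# One polynomial term: (coefficient, exponent for each feature).
Term = Tuple[float, Tuple[int, ...]]


def partition(samples: Sequence[float], frame_len: int) -> List[List[float]]:
    """Split the sample sequence into consecutive, non-overlapping data
    frames of frame_len samples each; a short tail is dropped (assumption).
    The claim requires at least two samples per frame."""
    if frame_len < 2:
        raise ValueError("each data frame must hold at least two samples")
    last_start = len(samples) - frame_len
    return [list(samples[i:i + frame_len])
            for i in range(0, last_start + 1, frame_len)]


def extract_features(frame: Sequence[float]) -> List[float]:
    """Hypothetical feature frame: the mean and mean energy of a data frame."""
    mean = sum(frame) / len(frame)
    energy = sum(s * s for s in frame) / len(frame)
    return [mean, energy]


def polynomial_discriminant(features: Sequence[float],
                            terms: Sequence[Term]) -> float:
    """Evaluate sum_i w_i * prod_j x_j ** g_ij, a generic polynomial form."""
    return sum(coeff * prod(x ** e for x, e in zip(features, exponents))
               for coeff, exponents in terms)


# Hypothetical classifier: 1 + 0.5*x0 + 2*x0*x1
classifier_terms: List[Term] = [(1.0, (0, 0)), (0.5, (1, 0)), (2.0, (1, 1))]

data_frames = partition([1.0, 3.0, 2.0, 2.0, 5.0], frame_len=2)
feature_frames = [extract_features(f) for f in data_frames]
outputs = [polynomial_discriminant(f, classifier_terms) for f in feature_frames]
```

Each entry of `outputs` is one classifier's output signal for one feature frame; running several such classifiers on the same feature frame yields the set of outputs that steps (h) through (l) rank, weight, sum, and compare.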
15. A speech-recognition system for identifying a spoken sound and having a plurality of classifiers, comprising:
receiving means for receiving a plurality of classifier output signals from the classifiers corresponding to an interval, each of the classifier output signals having been generated according to a polynomial discriminant function;
ranking means, associatively coupled to the receiving means, for ranking by magnitude the classifier output signals to produce a rank-order of classifier output signals corresponding to the interval;
weighting means, associatively coupled to the ranking means, for weighting each position in the rank-order to transform the classifier output signals into a plurality of weighted values;
summing means, associatively coupled to the weighting means, for respectively summing a plurality of weighted value sequences to generate a plurality of voting sums, each of the voting sums corresponding to a respective one of the plurality of classifiers; and
identifying means, associatively coupled to the summing means, for identifying the spoken sound by selecting from the plurality of voting sums a voting sum having a largest magnitude the subsequence to produce a voting sum for each of the plurality of classifiers, wherein the receiving means, the defining means, and the weighting means cooperatively function over a plurality of intervals to generate the plurality of weighted value sequences.
Dependent claims: 16, 17, 18, 19, 20, 21
Specification