Method and system for continuous speech recognition using voting techniques

US 5,638,486 A
Filed: 10/26/1994
Issued: 06/10/1997
Est. Priority Date: 10/26/1994
Status: Expired due to Term

4 Assignments

0 Petitions

Abstract

In a speech-recognition system having a plurality of classifiers, a voting window includes a sequence of outputs from each of the classifiers. For each classifier, a voting sum is generated corresponding to the voting window. A spoken sound is identified by determining which classifier corresponds to the greatest voting sum.
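
The voting scheme summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the function names (`rank_weights`, `voting_sums`, `identify`), the class labels, and the rank weights `(3.0, 2.0, 1.0)` are hypothetical, and "ranking by magnitude" is read here as ordering by raw output value, largest first.

```python
from typing import Dict, List, Sequence


def rank_weights(outputs: Dict[str, float], weights: Sequence[float]) -> Dict[str, float]:
    """Rank one interval's classifier outputs and map each rank position to a weight."""
    # Largest output first (assumption: rank by raw value, not absolute value).
    ranked = sorted(outputs, key=lambda label: outputs[label], reverse=True)
    # Rank positions beyond the supplied weights contribute zero.
    return {
        label: (weights[pos] if pos < len(weights) else 0.0)
        for pos, label in enumerate(ranked)
    }


def voting_sums(window: List[Dict[str, float]], weights: Sequence[float]) -> Dict[str, float]:
    """Sum each classifier's weighted values over every interval in the voting window."""
    sums: Dict[str, float] = {}
    for outputs in window:
        for label, weight in rank_weights(outputs, weights).items():
            sums[label] = sums.get(label, 0.0) + weight
    return sums


def identify(window: List[Dict[str, float]], weights: Sequence[float]) -> str:
    """Return the label of the classifier with the largest voting sum."""
    sums = voting_sums(window, weights)
    return max(sums, key=lambda label: sums[label])


# Hypothetical voting window: three intervals, three classifiers.
window = [
    {"a": 0.9, "b": 0.4, "c": 0.1},
    {"a": 0.3, "b": 0.7, "c": 0.2},
    {"a": 0.8, "b": 0.6, "c": 0.5},
]
winner = identify(window, (3.0, 2.0, 1.0))
```

With these made-up outputs, classifier `a` ranks first in two of the three intervals and second in the other, so its voting sum is 8.0 against 7.0 for `b` and 3.0 for `c`, and `winner` is `"a"`. Because the sum depends only on rank positions, a single interval with an unusually large output cannot dominate the decision.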

80 Citations

21 Claims

1. In a speech-recognition system having a plurality of classifiers, a method of identifying a spoken sound, comprising the following steps:
- (a) receiving a plurality of classifier output signals from the classifiers corresponding to an interval, each of the classifier output signals having been generated according to a polynomial discriminant function;
  
  (b) ranking by magnitude the classifier output signals to produce a rank-order of classifier output signals corresponding to the interval;
  
  (c) weighting each position in the rank-order to transform the classifier output signals into a plurality of weighted values;
  
  (d) repeating steps (a)-(c) for a plurality of intervals, whereby generating a plurality of weighted value sequences, each of the weighted value sequences corresponding to a respective one of the plurality of classifiers;
  
  (e) summing each of the weighted value sequences to generate a voting sum for each classifier; and
  
  (f) identifying the spoken sound by selecting the voting sum having a largest magnitude.
- - 2. The method of claim 1 further comprising the step of:
    - generating a system output which includes a class label representing the spoken sound.
  - 3. The method of claim 1 wherein the polynomial discriminant function has a form ##EQU3## wherein x_j represents a plurality of features;
    - wherein i, j, m and n are integers, y represents a classifier output signal;
      
      wherein w_i represents a coefficient;
      
      wherein g_ji represents an exponent.
  - 4. The method of claim 3, wherein the polynomial discriminant function has the form ##EQU4## wherein a₀ represents a zero-order coefficient, b_i represents a first-order coefficient, and c_ij represents a second-order coefficient.
  - 5. The method of claim 1 wherein the speech-recognition system identifies a plurality of spoken sounds from continuous speech.
  - 6. The method of claim 1 wherein steps (a)-(c) are repeated for three successive intervals.
  - 7. The method of claim 1 wherein the spoken sound is selected from the group consisting of word, syllable, and phoneme.
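
The `##EQU3##` and `##EQU4##` markers in claims 3 and 4 stand for equation images that are not reproduced in this text. Going only by the symbol definitions given in those claims (features x_j, coefficients w_i, exponents g_ji, integers i, j, m, n, and output y), one form consistent with them is a weighted sum of products of powers of the features. The summation and product limits below, including the starting index, are assumptions and are not taken from the claims.

```latex
% Assumed general form suggested by claim 3 (limits are assumptions):
y = \sum_{i=1}^{m} w_i \prod_{j=1}^{n} x_j^{\,g_{ji}}

% Assumed second-order form suggested by claim 4 (limits are assumptions):
y = a_0 + \sum_{i=1}^{n} b_i x_i + \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij}\, x_i x_j
```

Under this reading, the second expression is the special case of the first in which the exponents of every term sum to at most two, which would be consistent with claim 4 depending on claim 3.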

8. A method for recognizing a spoken sound from continuous speech, comprising the following steps:
- (a) receiving the continuous speech;
  
  (b) sampling the continuous speech, over time, to form a sequence of sample datum which represents the continuous speech;
  
  (c) partitioning the sequence of sample datum into a sequence of data frames, each of the sequence of data frames includes at least two of the sequence of sample datum;
  
  (d) extracting a plurality of features from the sequence of data frames;
  
  (f) forming a sequence of feature frames from the plurality of features;
  
  (g) distributing one of the sequence of feature frames to a plurality of classifiers, each of the classifiers generating a classifier output signal in response thereto according to a polynomial discriminant function, whereby producing a plurality of classifier output signals;
  
  (h) ranking by magnitude the classifier output signals to produce a rank-order of classifier output signals corresponding to the distributed feature frame;
  
  (i) weighting each position in the rank-order to transform the classifier output signals into a plurality of weighted values;
  
  (j) repeating steps (g)-(i) for each feature frame included in the sequence of feature frames, whereby generating a plurality of weighted value sequences, each of the weighted value sequences corresponding to a respective one of the classifiers;
  
  (k) summing each of the weighted value sequences to generate a voting sum for each classifier; and
  
  (l) identifying the spoken sound by selecting the voting sum having a largest magnitude.
- - 9. The method of claim 8, further comprising the step of:
    - generating a system output which includes a class label representing the spoken sound.
  - 10. The method of claim 8 wherein the polynomial discriminant function has a form ##EQU5## wherein x_j represents the features included in the distributed feature frame;
    - wherein i, j, m and n are integers, wherein y represents the classifier output signal;
      
      wherein w_i represents a coefficient;
      
      wherein g_ji represents an exponent.
  - 11. The method of claim 10, wherein the polynomial discriminant function has the form ##EQU6## wherein a₀ represents a zero-order coefficient, b_i represents a first-order coefficient, and c_ij represents a second-order coefficient.
  - 12. The method of claim 8 wherein the speech-recognition system recognizes a plurality of spoken sounds from the continuous speech.
  - 13. The method of claim 8 wherein the sequence of feature frames consists of three feature frames.
  - 14. The method of claim 8 wherein the spoken sound is selected from the group consisting of word, syllable, and phoneme.
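
The front-end steps of claim 8, partitioning the sample sequence into data frames and turning them into feature frames, can be sketched as follows. The claims do not say which features are extracted, so the two used here (mean energy and zero-crossing count) are placeholders, and the names `partition`, `extract_features`, and `feature_frames`, the frame length, and the hop size are all hypothetical. The sketch also assumes one feature frame per data frame.

```python
from typing import List, Sequence


def partition(samples: Sequence[float], frame_len: int, hop: int) -> List[List[float]]:
    """Split a sample sequence into data frames of frame_len samples each."""
    if frame_len < 2:
        # Claim 8 requires each data frame to include at least two samples.
        raise ValueError("each data frame must hold at least two samples")
    return [
        list(samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


def extract_features(frame: Sequence[float]) -> List[float]:
    """Placeholder features (assumption): mean energy and zero-crossing count."""
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [energy, float(crossings)]


def feature_frames(samples: Sequence[float], frame_len: int = 4, hop: int = 4) -> List[List[float]]:
    """Form one feature frame per data frame (assumption)."""
    return [extract_features(frame) for frame in partition(samples, frame_len, hop)]


# Eight hypothetical samples yield two non-overlapping data frames of four samples.
frames = feature_frames([1.0, -1.0, 1.0, -1.0, 2.0, 2.0, 2.0, 2.0])
```

Each resulting feature frame would then be distributed to every classifier, and the classifier outputs ranked, weighted, and summed as recited in steps (g) through (l).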

15. A speech-recognition system for identifying a spoken sound and having a plurality of classifiers, comprising:
- receiving means for receiving a plurality of classifier output signals from the classifiers corresponding to an interval, each of the classifier output signals having been generated according to a polynomial discriminant function;
  
  ranking means, associatively coupled to the receiving means, for ranking by magnitude the classifier output signals to produce a rank-order of classifier output signals corresponding to the interval;
  
  weighting means, associatively coupled to the ranking means, for weighting each position in the rank-order to transform the classifier output signals into a plurality of weighted values;
  
  summing means, associatively coupled to the weighting means, for respectively summing a plurality of weighted value sequences to generate a plurality of voting sums, each of the voting sums corresponding to a respective one of the plurality of classifiers; and
  
  identifying means, associatively coupled to the summing means, for identifying the spoken sound by selecting from the plurality of voting sums a voting sum having a largest magnitude the subsequence to produce a voting sum for each of the plurality of classifiers,
  
  wherein the receiving means, the defining means, and the weighting means cooperatively function over a plurality of intervals to generate the plurality of weighted value sequences.
- - 16. The speech-recognition system of claim 15 wherein the identifying means generates a system output which includes a class label representing the spoken sound.
  - 17. The speech-recognition system of claim 15, wherein the polynomial discriminant function has a form ##EQU7## wherein x_j represents a plurality of features;
    - wherein i, j, m and n are integers, y represents a classifier output signal;
      
      wherein w_i represents a coefficient;
      
      wherein g_ji represents an exponent.
  - 18. The method of claim 17, wherein the polynomial discriminant function has the form ##EQU8## wherein a₀ represents a zero-order coefficient, b_i represents a first-order coefficient, and c_ij represents a second-order coefficient.
  - 19. The speech-recognition system of claim 15 wherein the speech-recognition system identifies a plurality of spoken sounds from continuous speech.
  - 20. The speech-recognition system of claim 15 wherein the subsequence includes a sequence of three outputs from one of the plurality of classifiers.
  - 21. The speech-recognition system of claim 15 wherein the spoken sound is selected from the group consisting of word, syllable, and phoneme.
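
A classifier output of the kind recited in claims 17 and 18 could be computed as sketched below. The equation images behind the `##EQU7##` and `##EQU8##` markers are not reproduced in this text, so the sketch assumes a sum-of-products form, y = sum over i of w_i times the product over j of x_j raised to g_ji, inferred only from the symbol definitions in the claims. The function name and the example coefficients are hypothetical.

```python
from typing import Sequence


def polynomial_discriminant(
    x: Sequence[float],
    w: Sequence[float],
    g: Sequence[Sequence[int]],
) -> float:
    """Evaluate the assumed form y = sum_i w[i] * prod_j x[j] ** g[i][j].

    g is stored with one row per term i, so g[i][j] plays the role of g_ji.
    """
    y = 0.0
    for w_i, exponents in zip(w, g):
        term = w_i
        for x_j, g_ji in zip(x, exponents):
            term *= x_j ** g_ji
        y += term
    return y


# Hypothetical second-order example with two features:
# y = 1 + 2*x1 + 3*x1*x2, evaluated at x1 = 2, x2 = 5.
y = polynomial_discriminant(
    x=[2.0, 5.0],
    w=[1.0, 2.0, 3.0],
    g=[[0, 0], [1, 0], [1, 1]],
)
```

For this example the three terms are 1, 4, and 30, so `y` is 35.0. Restricting every row of `g` to exponents that sum to at most two gives a second-order polynomial with zero-order, first-order, and second-order coefficients, matching the structure named in claim 18.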

Current Assignee: Google Technology Holdings LLC (Alphabet Inc.)
Original Assignee: Motorola, Inc. (Motorola Solutions, Inc.)
Inventors: Lindsey, Michael K.; Wang, Shay-Ping T.
Primary Examiner(s): Knepper, David D.
Application Number: US08/329,394
Time in Patent Office: 958 Days
Field of Search: 395/2, 395/2.12, 395/2.13, 395/2.26, 395/2.45, 395/2.57, 395/2.62, 395/2.63, 395/2.41, 381/41-43
US Class Current: 704/236
CPC Class Codes:
G10L 15/02 Feature extraction for spee...
G10L 15/10 using distance or distortio...
