Correcting substitution errors during automatic speech recognition by accepting a second best when first best is confusable

US 8,600,760 B2
Filed: 11/28/2006
Issued: 12/03/2013
Est. Priority Date: 11/28/2006
Status: Expired due to Fees

16 Assignments

0 Petitions

Abstract

A speech recognition method includes the steps of receiving input speech containing vocabulary, processing the input speech with a grammar to obtain N-best hypotheses and associated parameter values, and determining whether a first-best hypothesis of the N-best hypotheses is confusable with any vocabulary within the grammar. The first-best hypothesis is accepted as recognized speech corresponding to the received input speech if it is not determined to be confusable with any vocabulary within the grammar. Where the first-best hypothesis is determined to be confusable, at least one parameter value of the first-best hypothesis can be compared to at least one threshold value, and the second-best hypothesis can be accepted as the recognized speech if its confidence score is within certain lower and upper threshold values and it is not confusable with the first-best hypothesis. The first-best hypothesis can be accepted as recognized speech corresponding to the received input speech if the parameter value of the first-best hypothesis is greater than the threshold value.
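
The first check described in the abstract (accept the first-best hypothesis outright when it is not confusable) might be sketched as follows. This is an illustration only, not the patented implementation: the `Hypothesis` type, the `CONFUSABLE` table, its "five"/"nine" entries, and both function names are hypothetical, and the confidence field merely stands in for the "associated parameter value".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Hypothesis:
    text: str          # candidate vocabulary item
    confidence: float  # stands in for the "associated parameter value"


# Hypothetical confusable vocabulary: each word maps to the words it is
# assumed to be mistaken for. The entries are illustrative only.
CONFUSABLE: Dict[str, Set[str]] = {
    "five": {"nine"},
    "nine": {"five"},
}


def is_confusable(hyp: Hypothesis) -> bool:
    """Return True if the hypothesis has at least one listed confusable word."""
    return bool(CONFUSABLE.get(hyp.text))


def accept_if_unambiguous(n_best: List[Hypothesis]) -> Optional[Hypothesis]:
    """Accept the first-best hypothesis when it is not confusable.

    Returns None when the first-best is confusable, meaning the further
    checks (threshold comparison, second-best inspection) would be needed.
    """
    first_best = n_best[0]
    return None if is_confusable(first_best) else first_best
```

With an N-best list headed by `Hypothesis("one", 0.6)` the function returns that hypothesis; with one headed by `Hypothesis("five", 0.6)` it returns `None`, deferring to the later steps.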

18 Citations

12 Claims

1. A speech recognition method comprising the steps of:
- (a) receiving input speech containing vocabulary via a microphone associated with an automatic speech recognition system;
  
  (b) processing the input speech with a grammar to obtain N-best hypotheses and associated parameter values using at least one processor associated with the automatic speech recognition system;
  
  (c) cross-referencing a first-best hypothesis of the N-best hypotheses against a list of known confusable vocabulary to determine whether the first-best hypothesis of the N-best hypotheses is confusable with any of the known confusable vocabulary;
  
  (d) accepting the first-best hypothesis as recognized speech corresponding to the received input speech, if the first-best hypothesis is not determined to be confusable with any of the known confusable vocabulary;
  
  (e) comparing at least one parameter value of the first-best hypothesis to at least one threshold value, if the first-best hypothesis is determined to be confusable with any of the known confusable vocabulary;
  
  (f) accepting the first-best hypothesis as recognized speech corresponding to the received input speech, if the at least one parameter value of the first-best hypothesis is greater than the at least one threshold value;
  
  (g) determining if a second-best hypothesis of the N-best hypotheses is confusable with the first-best hypothesis, if the at least one parameter value of the first-best hypothesis is not greater than the at least one threshold value;
  
  and (h) accepting the second-best hypothesis as recognized speech corresponding to the received input speech, if the second-best hypothesis is determined to be confusable with the first-best hypothesis;
  
  (h1) determining if a confidence score of the second-best hypothesis is between lower and upper threshold values;
  
  and (i) accepting the second-best hypothesis as recognized speech corresponding to the received input speech, if the confidence score is determined to be within the lower and upper threshold values.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8)
- - 2. The method of claim 1, further comprising the steps of:
    - (i1) setting the first-best hypothesis as recognized speech corresponding to the received input speech, if the second-best hypothesis is not determined to be confusable with the first-best hypothesis.
  - 3. The method of claim 2, further comprising the step of:
    - (i2) transmitting a pardon message after setting the first-best hypothesis as the recognized speech.
  - 4. The method of claim 3, further comprising the steps of:
    - (i4) presenting the first-best hypothesis to a user for confirmation;
      
      and (i5) accepting the first-best hypothesis as recognized speech corresponding to the received input speech, if the first-best hypothesis is confirmed by the user.
  - 5. The method of claim 1, wherein the at least one threshold value includes a plurality of threshold values, each corresponding with an individual hypothesis that is confusable with the first-best hypothesis.
  - 6. The method of claim 1, wherein the at least one threshold value includes threshold values for a given hypothesis that is confusable with the first-best hypothesis, wherein the threshold values vary depending on a grammar being used in the processing step.
  - 7. The method of claim 1, wherein the at least one threshold value includes threshold values for a given hypothesis that is confusable with the first-best hypothesis, wherein the threshold values vary by user.
  - 8. The method of claim 1, wherein the at least one parameter value is a confidence value.
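
One possible reading of the decision flow in claims 1 and 2, together with the variable thresholds of claims 5 to 7, is sketched below. Several points are assumptions rather than anything the claims state: steps (h), (h1) and (i) are combined into a single condition, "between" is treated as inclusive, the parameter value is taken to be a confidence value (as claim 8 permits), and every name, table entry and number (`Hyp`, `CONFUSABLE`, `THRESHOLDS`, `threshold_for`, `recognize`, the grammar name "digits", the user ID "user-42") is hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional, Set, Tuple


class Hyp(NamedTuple):
    text: str
    confidence: float  # the "parameter value", assumed to be a confidence


# Hypothetical list of known confusable vocabulary.
CONFUSABLE: Dict[str, Set[str]] = {"five": {"nine"}, "nine": {"five"}}

# Hypothetical thresholds keyed by (confusable word, grammar, user).
# A user of None means "any user" (claims 5 to 7 let thresholds vary
# by confusable hypothesis, by grammar, and by user).
THRESHOLDS: Dict[Tuple[str, str, Optional[str]], float] = {
    ("nine", "digits", None): 0.80,
    ("nine", "digits", "user-42"): 0.70,
}
DEFAULT_THRESHOLD = 0.75


def threshold_for(word: str, grammar: str, user: Optional[str] = None) -> float:
    """Look up the most specific threshold, falling back to a default."""
    for key in ((word, grammar, user), (word, grammar, None)):
        if key in THRESHOLDS:
            return THRESHOLDS[key]
    return DEFAULT_THRESHOLD


def recognize(n_best: List[Hyp], threshold: float, lower: float, upper: float) -> Hyp:
    first = n_best[0]
    # (c)/(d): accept the first-best if it is not on the confusable list.
    if first.text not in CONFUSABLE:
        return first
    # (e)/(f): accept the first-best if its value exceeds the threshold.
    if first.confidence > threshold:
        return first
    # (g): is the second-best confusable with the first-best?
    second = n_best[1] if len(n_best) > 1 else None
    if second is not None and second.text in CONFUSABLE[first.text]:
        # (h1)/(i): accept the second-best if its score is in the window.
        if lower <= second.confidence <= upper:
            return second
    # Claim 2 (i1): otherwise fall back to the first-best.
    return first
```

For example, `recognize([Hyp("five", 0.5), Hyp("nine", 0.45)], threshold_for("nine", "digits"), 0.4, 0.9)` returns the "nine" hypothesis under these assumptions. How several per-hypothesis thresholds would be combined is not specified in the text above, so the sketch leaves the choice of threshold to the caller.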

9. A speech recognition method comprising the steps of:
- (a) receiving input speech containing vocabulary via a microphone associated with an automatic speech recognition system;
  
  (b) processing the input speech with a grammar to obtain N-best hypotheses and associated parameter values using at least one processor associated with the automatic speech recognition system;
  
  (c) cross-referencing a first-best hypothesis of the N-best hypotheses against a list of known confusable vocabulary to determine whether the first-best hypothesis of the N-best hypotheses is confusable with any of the known confusable vocabulary;
  
  (d) accepting the first-best hypothesis as recognized speech corresponding to the received input speech, if the first-best hypothesis is not determined to be confusable with any of the known confusable vocabulary;
  
  (e) comparing at least one parameter value of the first-best hypothesis to at least one threshold value, if the first-best hypothesis is determined to be confusable with any of the known confusable vocabulary; and
  
  (f) accepting the first-best hypothesis as recognized speech corresponding to the received input speech, if the at least one parameter value of the first-best hypothesis is greater than the at least one threshold value;
  
  (g) determining if a second-best hypothesis of the N-best hypotheses is confusable with the first-best hypothesis, if the at least one parameter value of the first-best hypothesis is not greater than the at least one threshold value;
  
  (h) determining if a confidence score of the second-best hypothesis is between lower and upper threshold values; and
  
  (i) accepting the second-best hypothesis as recognized speech corresponding to the received input speech, if the confidence score is determined to be within the lower and upper threshold values.
- Dependent Claims (10, 11, 12)
- - 10. The method of claim 9, further comprising the steps of:
    - (m1) setting the first-best hypothesis as recognized speech corresponding to the received input speech, if the confidence score is not determined to be within the lower and upper threshold values.
  - 11. The method of claim 10, further comprising the step of:
    - (m2) transmitting a pardon message after setting the first-best hypothesis as recognized speech.
  - 12. The method of claim 11, further comprising the steps of:
    - (m4) presenting the first-best hypothesis to a user for confirmation; and
      
      (m5) accepting the first-best hypothesis as recognized speech corresponding to the received input speech, if the first-best hypothesis is confirmed by the user.
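
The tail of claim 9 and its dependent claims 10 to 12 (confidence window, then fallback to the first-best with a pardon message and user confirmation) could be sketched roughly as follows. The sketch starts after step (g) and does not model the earlier steps. The function name, the tuple layout, the prompt wording, the inclusive window, and the choice to return `None` when the user declines are all assumptions; the claims do not say what happens after a rejected confirmation.

```python
from typing import Callable, List, Optional, Tuple

Scored = Tuple[str, float]  # (text, confidence score)


def resolve_low_confidence(
    first: Scored,
    second: Scored,
    lower: float,
    upper: float,
    confirm: Callable[[str], bool],
) -> Tuple[Optional[str], List[str]]:
    """Return (accepted text or None, prompts sent to the user)."""
    prompts: List[str] = []
    # (h)/(i): accept the second-best if its score lies in the window.
    if lower <= second[1] <= upper:
        return second[0], prompts
    # (m1): otherwise set the first-best as the recognized speech.
    candidate = first[0]
    # (m2): transmit a pardon message (wording is hypothetical).
    prompts.append("Pardon?")
    # (m4): present the first-best to the user for confirmation.
    prompts.append(f"Did you say {candidate}?")
    # (m5): accept the first-best only if the user confirms it.
    if confirm(candidate):
        return candidate, prompts
    return None, prompts
```

Passing `confirm` as a callable keeps the sketch independent of any particular dialog system; a test can supply `lambda text: True` in place of a real user response.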

Current Assignee: GM Global Technology Operations LLC (General Motors Company)
Original Assignee: General Motors LLC (General Motors Company), GM Global Technology Operations LLC (General Motors Company)
Inventors: Grost, Timothy J.; Chengalvarayan, Rathinavelu; Clark, Jason W.; Abeska, Edward
Primary Examiner(s): Smits, Talivaldis Ivars
Assistant Examiner(s): Kazeminezhad, Farzad

Application Number: US11/563,835
Publication Number: US 20080126100A1
Time in Patent Office: 2,562 Days
Field of Search: 704/240
US Class Current: 704/275
CPC Class Codes:
G10L 15/08   Speech classification or se...
G10L 15/20   Speech recognition techniqu...
G10L 15/26   Speech to text systems G10L...
