Speech recognition with parallel recognition tasks

US 8,571,860 B2
Filed: 01/25/2013
Issued: 10/29/2013
Est. Priority Date: 07/02/2008
Status: Active Grant

Abstract

The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
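The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `fake_srs`, `Result`, and `recognize` are hypothetical names, the recognizers are simulated with `asyncio.sleep`, and the fallback to the most confident result when nothing meets the threshold is an assumption that the abstract does not state.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Result:
    text: str          # candidate transcription of the audio signal
    confidence: float  # recognizer's own estimate, assumed to lie in [0.0, 1.0]


async def fake_srs(text: str, confidence: float, delay: float) -> Result:
    """Hypothetical recognizer: sleeps to simulate decoding time."""
    await asyncio.sleep(delay)
    return Result(text, confidence)


async def recognize(jobs, threshold: float) -> Result:
    """Run all recognizers; stop as soon as one result meets the threshold."""
    tasks = [asyncio.create_task(job) for job in jobs]
    finished = []
    try:
        for next_done in asyncio.as_completed(tasks):
            result = await next_done
            finished.append(result)
            if result.confidence >= threshold:
                return result
        # Assumption: if nothing met the threshold, return the most
        # confident result among all completed tasks.
        return max(finished, key=lambda r: r.confidence)
    finally:
        # Abort recognizers that have not generated a result yet.
        for task in tasks:
            if not task.done():
                task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
```

Calling `asyncio.run(recognize([fake_srs("call home", 0.9, 0.01), fake_srs("call hone", 0.95, 2.0)], 0.8))` would return the fast result without waiting for the slow recognizer, which is cancelled in the `finally` block.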

20 Claims

1. A computer-implemented method comprising:
- receiving, at a computer system, an audio signal;
  
  initiating, by the computer system, a plurality of speech recognition tasks for the audio signal, the plurality of speech recognition tasks running on a plurality of speech recognition systems;
  
  detecting that a portion of the plurality of the speech recognition systems have completed their respective speech recognition tasks which comprise a completed portion of the plurality of speech recognition tasks, wherein a remaining portion of the plurality of the speech recognition systems have not completed their respective speech recognition tasks and are still processing their respective speech recognition tasks;
  
  obtaining recognition results and confidence values for the completed portion of the plurality of speech recognition tasks, wherein the recognition results identify one or more candidate representations of the audio signal and the confidence values identify one or more probabilities that the recognition results are correct;
  
  generating one or more combined confidence values for the recognition results based on the recognition results and the confidence values for the completed portion of the plurality of speech recognition tasks;
  
  determining, by the computer system, whether at least one of the one or more combined confidence values is greater than or equal to a threshold confidence value; and
  
  in response to determining that the at least one of the one or more combined confidence values is greater than or equal to the threshold confidence value and before the remaining portion of the plurality of the speech recognition systems have completed their respective speech recognition tasks, providing a final recognition result for the audio signal based on the recognition results and the one or more combined confidence values.
  - 2. The computer-implemented method of claim 1, wherein a particular combined confidence value from the one or more combined confidence values i) corresponds to a particular recognition result from the recognition results and ii) comprises a combination of two or more of the confidence values that correspond to the particular recognition result.
  - 3. The computer-implemented method of claim 2, wherein the combination of the two or more of the confidence values comprises an average of the two or more confidence values.
  - 4. The computer-implemented method of claim 2, further comprising:
    - weighting the combination of the two or more of the confidence values based on a frequency with which the particular recognition result occurs in the recognition results for the completed portion of the plurality of speech recognition tasks.
  - 5. The computer-implemented method of claim 4, wherein the combination of the two or more of the confidence values are weighted by a predetermined weighting factor that is selected, based on the frequency with which the particular recognition result occurs in the recognition results, from among a plurality of predetermined weighting factors.
  - 6. The computer-implemented method of claim 4, wherein the combination of the two or more of the confidence values are weighted further based on a distribution of the confidence values for one or more of the completed portion of the plurality of speech recognition tasks.
  - 7. The computer-implemented method of claim 4, wherein the combination of the two or more of the confidence values are weighted further based on one or more characteristics of one or more speech recognition tasks that generated the particular recognition result.
  - 8. The computer-implemented method of claim 7, wherein the one or more characteristics include one or more characteristics selected from the group consisting of:
    - one or more overall levels of accuracy for the one or more speech recognition tasks, one or more contextual levels of accuracy within a context for the audio signal for the one or more speech recognition tasks, and one or more temporal levels of accuracy for one or more periods of time for the one or more speech recognition tasks.
  - 9. The computer-implemented method of claim 4, wherein the combination of the two or more of the confidence values are weighted further based on a level of similarity between one or more speech recognition tasks that generated the particular recognition result.
  - 10. The computer-implemented method of claim 4, wherein the combination of the two or more of the confidence values are weighted further based on a rate with which one or more speech recognition tasks that generated the particular recognition result have correctly identified results for audio signals when i) the one or more speech recognition tasks have identified a same recognition result and ii) with confidence values that are within a threshold value of the two or more confidence values.
  - 11. The computer-implemented method of claim 1, further comprising:
    - aborting or pausing the remaining portion of the plurality of speech recognition systems in response to determining that the at least one of the one or more combined confidence values is greater than or equal to the threshold confidence value.
  - 12. The computer-implemented method of claim 11, wherein, in response to determining that the at least one of the one or more combined confidence values is greater than or equal to the threshold confidence value, causing the remaining portion of the plurality of speech recognition systems to abort their respective speech recognition tasks.
  - 13. The computer-implemented method of claim 11, wherein, in response to determining that the at least one of the one or more combined confidence values is greater than or equal to the threshold confidence value, causing the remaining portion of the plurality of speech recognition systems to pause their respective speech recognition tasks.
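The combination steps in claims 2 through 5 and the threshold test in claim 1 might be sketched as below. This is only one possible reading: the table `AGREEMENT_WEIGHTS`, its values, the clamp to 1.0, and the function names are all assumptions made for illustration, since the claims do not give concrete weighting factors.

```python
from collections import defaultdict

# Hypothetical predetermined weighting factors, keyed by how many completed
# tasks produced the same text (claim 5). The values are illustrative only.
AGREEMENT_WEIGHTS = {1: 0.8, 2: 1.0, 3: 1.1}


def combined_confidences(results):
    """Map each distinct text to a frequency-weighted average confidence.

    `results` is a list of (text, confidence) pairs from completed tasks.
    """
    by_text = defaultdict(list)
    for text, confidence in results:
        by_text[text].append(confidence)

    combined = {}
    for text, values in by_text.items():
        average = sum(values) / len(values)  # averaging, as in claim 3
        # Frequency-based factor (claims 4 and 5); counts beyond the table
        # reuse the largest factor, which is an assumption.
        weight = AGREEMENT_WEIGHTS.get(len(values), max(AGREEMENT_WEIGHTS.values()))
        combined[text] = min(1.0, average * weight)
    return combined


def pick_final(results, threshold):
    """Return the best (text, score) if it meets the threshold, else None."""
    combined = combined_confidences(results)
    if not combined:
        return None
    text, score = max(combined.items(), key=lambda item: item[1])
    return (text, score) if score >= threshold else None
```

With this table, three recognizers agreeing on one text at 0.7 each outscore a single recognizer reporting a different text at 0.75, so agreement can push a result over a threshold that no individual confidence value reaches.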

14. A computer system comprising:
- one or more computing devices;
  
  an interface of the one or more computing devices that is programmed to receive an audio signal; and
  
  a plurality of speech recognition systems that initiate a plurality of speech recognition tasks for the audio signal;
  
  a recognition managing module that is programmed to:
  
  detect that a portion of the plurality of speech recognition systems have completed their respective speech recognition tasks which comprise a completed portion of the plurality of speech recognition tasks, wherein a remaining portion of the plurality of speech recognition systems have not completed their respective speech recognition tasks and are still processing their respective speech recognition tasks,
  
  obtain recognition results and confidence values for the completed portion of the plurality of speech recognition tasks, wherein the recognition results identify one or more candidate representations of the audio signal and the confidence values identify one or more probabilities that the recognition results are correct,
  
  generate one or more combined confidence values for the recognition results based on the recognition results and the confidence values for the completed portion of the plurality of speech recognition tasks,
  
  determine whether at least one of the one or more combined confidence values is greater than or equal to a threshold confidence value, and
  
  in response to determining that the at least one of the one or more combined confidence values is greater than or equal to the threshold confidence value and before the remaining portion of the plurality of speech recognition systems have completed their respective speech recognition tasks, provide through the interface a final recognition result for the audio signal based on the recognition results and the one or more combined confidence values.
  - 15. The computer system of claim 14, wherein a particular combined confidence value from the one or more combined confidence values i) corresponds to a particular recognition result from the recognition results and ii) comprises a combination of two or more of the confidence values that correspond to the particular recognition result.
  - 16. The computer system of claim 15, wherein the combination of the two or more of the confidence values comprises an average of the two or more confidence values.
  - 17. The computer system of claim 15, wherein the recognition managing module is further programmed to:
    - weight the combination of the two or more of the confidence values based on a frequency with which the particular recognition result occurs in the recognition results for the completed portion of the plurality of speech recognition tasks.
  - 18. The computer system of claim 17, wherein the combination of the two or more of the confidence values are weighted by a predetermined weighting factor that is selected, based on the frequency with which the particular recognition result occurs in the recognition results, from among a plurality of predetermined weighting factors.
  - 19. The computer system of claim 17, wherein the combination of the two or more of the confidence values are weighted further based on a distribution of the confidence values for one or more of the plurality of speech recognition tasks in the completed portion of the plurality of speech recognition tasks.
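Claims 6 and 19 refer to weighting based on a distribution of confidence values but do not say how. One possible interpretation, offered purely as an assumption, is to look at the gap between a recognizer's top confidence and its runner-up: a wide gap suggests a clear winner. The functions `margin_weight` and `weighted_top_confidence`, the `floor` parameter, and the linear formula are all hypothetical.

```python
def margin_weight(nbest_confidences, floor=0.5):
    """Return a weight in [floor, 1.0] from one recognizer's n-best confidences.

    Assumption for illustration: confidences lie in [0.0, 1.0], and a large
    gap between the top score and the runner-up moves the weight toward 1.0,
    while a small gap pulls it toward `floor`.
    """
    ranked = sorted(nbest_confidences, reverse=True)
    if not ranked:
        raise ValueError("need at least one confidence value")
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    margin = ranked[0] - runner_up  # between 0.0 and 1.0
    return floor + (1.0 - floor) * margin


def weighted_top_confidence(nbest_confidences):
    """Scale the top confidence by the margin-based weight."""
    return max(nbest_confidences) * margin_weight(nbest_confidences)
```

Under this sketch, a recognizer whose two best candidates are tied at 0.5 has its top confidence halved, while one whose candidates score 0.9 and 0.1 keeps most of its top confidence.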

20. A computer program product embodied in a computer readable storage device storing instructions that, when executed, cause one or more computing devices to perform operations comprising:
- receiving an audio signal;
  
  initiating a plurality of speech recognition tasks for the audio signal, the plurality of speech recognition tasks running on a plurality of speech recognition systems;
  
  detecting that a portion of the plurality of the speech recognition systems have completed their respective speech recognition tasks and have yielded a completed portion of the plurality of speech recognition tasks, wherein a remaining portion of the plurality of the speech recognition systems have not completed their respective speech recognition tasks and are still processing their respective speech recognition tasks;
  
  obtaining recognition results and confidence values for the completed portion of the plurality of speech recognition tasks, wherein the recognition results identify one or more candidate representations of the audio signal and the confidence values identify one or more probabilities that the recognition results are correct;
  
  generating one or more combined confidence values for the recognition results based on the recognition results and the confidence values for the completed portion of the plurality of speech recognition tasks;
  
  determining whether at least one of the one or more combined confidence values is greater than or equal to a threshold confidence value; and
  
  in response to determining that the at least one of the one or more combined confidence values is greater than or equal to the threshold confidence value and before the remaining portion of the plurality of speech recognition systems have completed their respective speech recognition tasks, providing a final recognition result for the audio signal based on the recognition results and the one or more combined confidence values.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Strope, Brian; Beaufays, Francoise; Siohan, Olivier
Primary Examiner(s): Desir, Pierre-Louis
Assistant Examiner(s): Sirjani, Fariba
Application Number: US13/750,807
Publication Number: US 20130138440A1
Time in Patent Office: 277 Days
Field of Search: 704/240
US Class Current: 704/231
CPC Class Codes:
G10L 15/00   Speech recognition G10L17/0...
G10L 15/01   Assessment or evaluation of...
G10L 15/26   Speech to text systems G10L...
G10L 15/32   Multiple recognisers used i...