Speech recognition with parallel recognition tasks

US 9,373,329 B2
Filed: 10/28/2013
Issued: 06/21/2016
Est. Priority Date: 07/02/2008
Status: Active Grant

Abstract

The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
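The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the recognizers are hypothetical stand-ins (`make_fake_srs`) that sleep and return canned results, the names `Recognition` and `recognize` are invented for illustration, and picking the highest-confidence completed result is an assumption, since the abstract only says the output is based on at least one generated result.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Tuple


@dataclass(frozen=True)
class Recognition:
    srs_name: str      # which (hypothetical) SRS produced the result
    text: str          # candidate transcription
    confidence: float  # estimated probability that the text is correct


Recognizer = Callable[[bytes], Awaitable[Recognition]]


def make_fake_srs(name: str, delay: float, text: str, confidence: float) -> Recognizer:
    """Build a stand-in SRS that waits `delay` seconds, then returns a canned result."""
    async def run(audio: bytes) -> Recognition:
        await asyncio.sleep(delay)  # placeholder for real decoding work
        return Recognition(name, text, confidence)
    return run


async def recognize(
    audio: bytes, recognizers: List[Recognizer], threshold: float
) -> Tuple[Recognition, int]:
    """Return (final result, number of tasks aborted before they finished)."""
    if not recognizers:
        raise ValueError("at least one recognizer is required")
    # Start every recognition task on the same audio signal.
    pending = {asyncio.ensure_future(r(audio)) for r in recognizers}
    completed: List[Recognition] = []
    while pending:
        done, pending = await asyncio.wait(pending, return_when=asyncio.FIRST_COMPLETED)
        completed.extend(task.result() for task in done)
        # Stop waiting as soon as any completed result meets the threshold.
        if any(r.confidence >= threshold for r in completed):
            break
    aborted = len(pending)
    for task in pending:
        task.cancel()  # abort the tasks that have not produced a result
    await asyncio.gather(*pending, return_exceptions=True)
    # Assumption: the final result is the most confident completed one.
    return max(completed, key=lambda r: r.confidence), aborted


if __name__ == "__main__":
    srs = [
        make_fake_srs("fast", 0.01, "call bob", 0.90),
        make_fake_srs("slow", 5.0, "call rob", 0.99),
    ]
    print(asyncio.run(recognize(b"", srs, threshold=0.8)))
```

If no result ever reaches the threshold, the loop simply runs until every task has completed, and the most confident result is returned with zero tasks aborted.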

20 Claims

1. A computer-implemented method comprising:
- receiving, at a computer system, an audio signal;
  
  initiating, by the computer system, a plurality of speech recognition tasks for the audio signal, wherein the speech recognition tasks each use a different one of a plurality of language models;
  
  detecting that a portion of the plurality of speech recognition tasks have completed, wherein a remaining portion of the plurality of speech recognition tasks have not completed;
  
  obtaining recognition results and confidence values for each of the plurality of speech recognition tasks included in the portion, wherein the recognition results identify one or more candidate transcriptions of the audio signal, and the confidence values identify one or more probabilities that the recognition results are correct;
  
  determining, by the computer system, whether at least one of the one or more confidence values is greater than or equal to a threshold confidence value; and
  
  in response to determining that the at least one of the one or more confidence values is greater than or equal to the threshold confidence value and before all of the remaining portion of the plurality of speech recognition tasks have completed, providing a final recognition result for the audio signal based on the recognition results and the one or more confidence values.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
  - 2. The computer-implemented method of claim 1, wherein the language models are each associated with a different one of a plurality of languages.
  - 3. The computer-implemented method of claim 1, wherein the language models each have a different one of a plurality of levels of granularity.
  - 4. The computer-implemented method of claim 1, wherein the language models are each associated with a different one of a plurality of geographic locations.
  - 5. The computer-implemented method of claim 1, wherein the language models each have a different one of a plurality of architectures.
  - 6. The computer-implemented method of claim 1, wherein the language models were each generated based on a different one of a plurality of training procedures.
  - 7. The computer-implemented method of claim 1, wherein:
    - the final recognition result comprises a particular recognition result from the recognition results that was generated by a particular speech recognition task from the portion of the speech recognition tasks, the particular speech recognition task using a particular language model from the plurality of language models, and
    - information that identifies the particular speech recognition task or the particular language model is provided with the final recognition result.
  - 8. The computer-implemented method of claim 1, wherein the plurality of speech recognition tasks are initiated by and run on a plurality of speech recognition systems.
  - 9. The computer-implemented method of claim 1, further comprising:
    - in response to determining that the at least one of the one or more confidence values is greater than or equal to the threshold confidence value, aborting the remaining portion of the plurality of speech recognition tasks before the remaining portion of the plurality of speech recognition tasks have completed.
  - 10. The computer-implemented method of claim 1, further comprising:
    - in response to determining that the at least one of the one or more confidence values is greater than or equal to the threshold confidence value, pausing the remaining portion of the plurality of speech recognition tasks before the remaining portion of the plurality of speech recognition tasks have completed.
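The decision step in claims 1, 7, 9 and 10 can be sketched as a pure function over the tasks that have completed so far. This is only an illustration under stated assumptions: `TaskResult`, `Decision`, `RemainingAction` and `decide` are hypothetical names, the language model labels are made up, and choosing the highest-confidence qualifying result is one possible reading of "based on the recognition results and the one or more confidence values".

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class RemainingAction(Enum):
    CONTINUE = "continue"  # threshold not met: keep waiting for more tasks
    ABORT = "abort"        # claim 9: abort the tasks still running
    PAUSE = "pause"        # claim 10: pause the tasks still running


@dataclass(frozen=True)
class TaskResult:
    task_id: str          # identifies the speech recognition task (claim 7)
    language_model: str   # identifies the language model it used (claim 7)
    transcription: str    # candidate transcription of the audio signal
    confidence: float     # probability that the transcription is correct


@dataclass(frozen=True)
class Decision:
    final: Optional[TaskResult]        # None while no result qualifies
    remaining_action: RemainingAction  # what to do with unfinished tasks


def decide(
    completed: List[TaskResult],
    threshold: float,
    early_exit: RemainingAction = RemainingAction.ABORT,
) -> Decision:
    """Decide whether a final result can be provided before all tasks finish."""
    qualifying = [r for r in completed if r.confidence >= threshold]
    if not qualifying:
        return Decision(None, RemainingAction.CONTINUE)
    # Assumption: prefer the most confident qualifying result.
    best = max(qualifying, key=lambda r: r.confidence)
    return Decision(best, early_exit)


if __name__ == "__main__":
    done = [
        TaskResult("task-1", "model-a", "recognize speech", 0.62),
        TaskResult("task-2", "model-b", "wreck a nice beach", 0.91),
    ]
    print(decide(done, threshold=0.9))
```

Because the returned `TaskResult` carries `task_id` and `language_model`, the caller receives the identifying information described in claim 7 along with the transcription.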

11. A computer system comprising:
- one or more computing devices;
  
  an interface of the one or more computing devices that is programmed to receive an audio signal; and
  
  a plurality of speech recognition systems that initiate a plurality of speech recognition tasks for the audio signal, wherein the speech recognition tasks each use a different one of a plurality of language models;
  
  a recognition managing module that is programmed to:
  
  detect that a portion of the plurality of speech recognition tasks have completed, wherein a remaining portion of the plurality of speech recognition tasks have not completed,
  
  obtain recognition results and confidence values for each of the plurality of speech recognition tasks included in the portion, wherein the recognition results identify one or more candidate transcriptions of the audio signal, and the confidence values identify one or more probabilities that the recognition results are correct,
  
  determine whether at least one of the one or more confidence values is greater than or equal to a threshold confidence value, and
  
  in response to determining that the at least one of the one or more confidence values is greater than or equal to the threshold confidence value and before all of the remaining portion of the plurality of speech recognition tasks have completed, provide a final recognition result for the audio signal based on the recognition results and the one or more confidence values.
- Dependent claims (12, 13, 14, 15, 16, 17, 18, 19)
  - 12. The computer system of claim 11, wherein the language models are each associated with a different one of a plurality of languages.
  - 13. The computer system of claim 11, wherein the language models each have a different one of a plurality of levels of granularity.
  - 14. The computer system of claim 11, wherein the language models are each associated with a different one of a plurality of geographic locations.
  - 15. The computer system of claim 11, wherein the language models each have a different one of a plurality of architectures.
  - 16. The computer system of claim 11, wherein the language models were each generated based on a different one of a plurality of training procedures.
  - 17. The computer system of claim 11, wherein:
    - the final recognition result comprises a particular recognition result from the recognition results that was generated by a particular speech recognition task from the portion of the speech recognition tasks, the particular speech recognition task using a particular language model from the plurality of language models, and
    - information that identifies the particular speech recognition task or the particular language model is provided with the final recognition result.
  - 18. The computer system of claim 11, wherein the recognition managing module that is further programmed to:
    - in response to determining that the at least one of the one or more confidence values is greater than or equal to the threshold confidence value, abort the remaining portion of the plurality of speech recognition tasks before the remaining portion of the plurality of speech recognition tasks have completed.
  - 19. The computer system of claim 11, wherein the recognition managing module that is further programmed to:
    - in response to determining that the at least one of the one or more confidence values is greater than or equal to the threshold confidence value, pause the remaining portion of the plurality of speech recognition tasks before the remaining portion of the plurality of speech recognition tasks have completed.
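The system of claims 11 and 19 can be pictured as a manager object that runs several recognizers on threads and pauses the unfinished ones once a confident result arrives. The sketch below is an assumption-laden illustration, not the claimed system: `SpeechRecognitionSystem`, `RecognitionManager`, `resume` and the step-based fake decoding are all hypothetical, and pausing is cooperative (each worker checks a shared `threading.Event` between steps).

```python
import queue
import threading
import time
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Result:
    system: str
    text: str
    confidence: float


class SpeechRecognitionSystem:
    """Hypothetical recognizer that does its work in small, pausable steps."""

    def __init__(self, name: str, steps: int, text: str, confidence: float):
        self.name, self.steps = name, steps
        self.text, self.confidence = text, confidence
        self.steps_done = 0

    def run(self, audio: bytes, results: queue.Queue, may_run: threading.Event) -> None:
        for _ in range(self.steps):
            may_run.wait()       # blocks while the manager has paused the tasks
            time.sleep(0.005)    # placeholder for one unit of decoding work
            self.steps_done += 1
        results.put(Result(self.name, self.text, self.confidence))


class RecognitionManager:
    """Starts all systems, returns early on a confident result, pauses the rest."""

    def __init__(self, systems: List[SpeechRecognitionSystem], threshold: float):
        self.systems = systems
        self.threshold = threshold
        self.may_run = threading.Event()
        self.threads: List[threading.Thread] = []

    def recognize(self, audio: bytes) -> Result:
        results: queue.Queue = queue.Queue()
        self.may_run.set()
        self.threads = [
            threading.Thread(target=s.run, args=(audio, results, self.may_run), daemon=True)
            for s in self.systems
        ]
        for thread in self.threads:
            thread.start()
        completed: List[Result] = []
        while len(completed) < len(self.systems):
            completed.append(results.get())  # wait for the next finished task
            if completed[-1].confidence >= self.threshold:
                self.may_run.clear()         # pause whatever is still running
                break
        # Assumption: the final result is the most confident completed one.
        return max(completed, key=lambda r: r.confidence)

    def resume(self) -> None:
        """Let paused tasks continue, for example to refine the result later."""
        self.may_run.set()
```

A paused worker keeps its progress in `steps_done`, so calling `resume()` lets it pick up where it stopped rather than starting over, which is the practical difference from aborting.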

20. A computer program product embodied in a computer readable storage device storing instructions that, when executed, cause one or more computing devices to perform operations comprising:
- receiving an audio signal;
  
  initiating a plurality of speech recognition tasks for the audio signal, wherein the speech recognition tasks each use a different one of a plurality of language models;
  
  detecting that a portion of the plurality of speech recognition tasks have completed, wherein a remaining portion of the plurality of speech recognition tasks have not completed;
  
  obtaining recognition results and confidence values for each of the plurality of speech recognition tasks included in the portion, wherein the recognition results identify one or more candidate transcriptions of the audio signal, and the confidence values identify one or more probabilities that the recognition results are correct;
  
  determining whether at least one of the one or more confidence values is greater than or equal to a threshold confidence value; and
  
  in response to determining that the at least one of the one or more confidence values is greater than or equal to the threshold confidence value and before all of the remaining portion of the plurality of speech recognition tasks have completed, providing a final recognition result for the audio signal based on the recognition results and the one or more confidence values.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Strope, Brian; Beaufays, Francoise; Siohan, Olivier
Primary Examiner(s): Sirjani, Fariba
Application Number: US14/064,755
Publication Number: US 20140058728A1
Time in Patent Office: 967 Days
US Class Current: 1/1
CPC Class Codes:
G10L 15/00   Speech recognition G10L17/0...
G10L 15/01   Assessment or evaluation of...
G10L 15/26   Speech to text systems G10L...
G10L 15/32   Multiple recognisers used i...
