Speech recognition with parallel recognition tasks
First Claim
1. A computer-implemented method comprising:
providing particular audio data to each automated speech recognizer of a set of automated speech recognizers;
before all of the automated speech recognizers have output a respective hypothesis for the particular audio data, determining that a particular automated speech recognizer of the set of automated speech recognizers has output a hypothesis for the particular audio data, and that a confidence value associated with the hypothesis that is output by the particular automated speech recognizer satisfies a particular confidence value threshold; and
while at least one of the automated speech recognizers that has been provided the particular audio data is indicated as not yet finished generating a respective hypothesis for the particular audio data, and in response to determining that the particular automated speech recognizer of the set of automated speech recognizers has output the hypothesis for the particular audio data, and that the confidence value associated with the hypothesis that is output by the particular automated speech recognizer satisfies the particular confidence value threshold;
providing the hypothesis that is output by the particular automated speech recognizer, of the set of automated speech recognizers, as a top speech recognition hypothesis; and
transmitting a command to stop the at least one of the automated speech recognizers of the set of automated speech recognizers that has been provided the particular audio data and that is indicated as not yet finished generating the respective hypothesis for the particular audio data from finishing generating the respective hypothesis for the particular audio data.
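The steps of claim 1 can be sketched as a race between concurrent tasks. This is only an illustrative sketch, not the patented implementation: `Hypothesis`, `fake_recognizer`, `recognize_first_confident`, and `demo` are hypothetical names, the recognizers are simulated with timed sleeps, and task cancellation stands in for the "command to stop". The claim does not say what happens when no hypothesis satisfies the threshold, so the sketch simply returns `None` in that case.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Hypothesis:
    recognizer: str
    text: str
    confidence: float


async def fake_recognizer(name, delay, text, confidence, audio):
    """Hypothetical stand-in for a real recognizer: waits, then answers."""
    await asyncio.sleep(delay)
    return Hypothesis(name, text, confidence)


async def recognize_first_confident(audio, recognizers, threshold):
    """Return (top hypothesis or None, names of recognizers told to stop)."""
    # Provide the same audio data to every recognizer in the set.
    tasks = {asyncio.create_task(fn(audio)): name
             for name, fn in recognizers.items()}
    pending = set(tasks)
    top = None
    # Check hypotheses as they arrive, before all recognizers have finished.
    while pending and top is None:
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED)
        for task in done:
            hyp = task.result()
            if hyp.confidence >= threshold:
                top = hyp
                break
    # Stop every recognizer that has not yet finished (assumed: cancellation).
    stopped = sorted(tasks[t] for t in pending)
    for t in pending:
        t.cancel()
    if pending:
        await asyncio.gather(*pending, return_exceptions=True)
    return top, stopped


def demo():
    recognizers = {
        "fast": lambda a: fake_recognizer("fast", 0.01, "call bob", 0.9, a),
        "slow": lambda a: fake_recognizer("slow", 0.5, "call rob", 0.95, a),
    }
    return asyncio.run(
        recognize_first_confident(b"\x00\x01", recognizers, threshold=0.8))
```

In `demo`, the faster recognizer answers first with a confidence above the assumed threshold of 0.8, so its hypothesis is returned as the top hypothesis and the slower recognizer is cancelled before it finishes.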
Abstract
The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
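The decision step in the abstract (check the confidence values of the results generated so far, then abort the rest) can be illustrated with a small pure function. This is a hedged sketch under stated assumptions: `Result` and `settle` are hypothetical names, and picking the highest-confidence result among those meeting the threshold is one assumed reading of "based on at least one of the generated one or more speech results", not something the abstract prescribes.

```python
from typing import NamedTuple


class Result(NamedTuple):
    srs: str           # name of the speech recognition system
    text: str          # recognized speech
    confidence: float  # confidence in the correctness of the result


def settle(completed, outstanding, threshold):
    """Return (final result or None, names of SRS tasks to abort).

    completed   -- results from the SRS's that have already finished
    outstanding -- names of SRS's that have not yet produced a result
    """
    confident = [r for r in completed if r.confidence >= threshold]
    if not confident:
        # No result meets the threshold yet: keep waiting, abort nothing.
        return None, []
    # Assumed selection rule: take the most confident qualifying result.
    best = max(confident, key=lambda r: r.confidence)
    return best, list(outstanding)


final, to_abort = settle(
    completed=[Result("srs_a", "call bob", 0.62),
               Result("srs_b", "call rob", 0.91)],
    outstanding=["srs_c", "srs_d"],
    threshold=0.8,
)
```

Here `srs_b` is the only completed result above the assumed threshold, so it becomes the final result and the two unfinished tasks are marked for abort.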
17 Claims
1. A computer-implemented method comprising:
providing particular audio data to each automated speech recognizer of a set of automated speech recognizers;
before all of the automated speech recognizers have output a respective hypothesis for the particular audio data, determining that a particular automated speech recognizer of the set of automated speech recognizers has output a hypothesis for the particular audio data, and that a confidence value associated with the hypothesis that is output by the particular automated speech recognizer satisfies a particular confidence value threshold; and
while at least one of the automated speech recognizers that has been provided the particular audio data is indicated as not yet finished generating a respective hypothesis for the particular audio data, and in response to determining that the particular automated speech recognizer of the set of automated speech recognizers has output the hypothesis for the particular audio data, and that the confidence value associated with the hypothesis that is output by the particular automated speech recognizer satisfies the particular confidence value threshold;
providing the hypothesis that is output by the particular automated speech recognizer, of the set of automated speech recognizers, as a top speech recognition hypothesis; and
transmitting a command to stop the at least one of the automated speech recognizers of the set of automated speech recognizers that has been provided the particular audio data and that is indicated as not yet finished generating the respective hypothesis for the particular audio data from finishing generating the respective hypothesis for the particular audio data.
(Dependent claims: 2, 3, 4, 5, 6)
7. A system comprising:
one or more computing devices;
an interface of the one or more computing devices that is programmed to receive an audio signal;
a set of automated speech recognizers; and
a processor that is configured to perform operations comprising;
providing particular audio data to each automated speech recognizer of a set of automated speech recognizers;
before all of the automated speech recognizers have output a respective hypothesis for the particular audio data, determining that a particular automated speech recognizer of the set of automated speech recognizers has output a hypothesis for the particular audio data, and that a confidence value associated with the hypothesis that is output by the particular automated speech recognizer satisfies a particular confidence value threshold; and
while at least one of the automated speech recognizers that has been provided the particular audio data is indicated as not yet finished generating a respective hypothesis for the particular audio data, and in response to determining that the particular automated speech recognizer of the set of automated speech recognizers has output the hypothesis for the particular audio data, and that the confidence value associated with the hypothesis that is output by the particular automated speech recognizer satisfies the particular confidence value threshold;
providing the hypothesis that is output by the particular automated speech recognizer, of the set of automated speech recognizers, as a top speech recognition hypothesis; and
transmitting a command to stop the at least one of the automated speech recognizers of the set of automated speech recognizers that has been provided the particular audio data and that is indicated as not yet finished generating the respective hypothesis for the particular audio data from finishing generating the respective hypothesis for the particular audio data.
(Dependent claims: 8, 9, 10, 11, 12)
13. A non-transitory computer-readable medium storing instructions executable by one or more processors which, upon such execution, cause the one or more processors to perform operations comprising:
providing particular audio data to each automated speech recognizer of a set of automated speech recognizers;
before all of the automated speech recognizers have output a respective hypothesis for the particular audio data, determining that a particular automated speech recognizer of the set of automated speech recognizers has output a hypothesis for the particular audio data, and that a confidence value associated with the hypothesis that is output by the particular automated speech recognizer satisfies a particular confidence value threshold; and
while at least one of the automated speech recognizers that has been provided the particular audio data is indicated as not yet finished generating a respective hypothesis for the particular audio data, and in response to determining that the particular automated speech recognizer of the set of automated speech recognizers has output the hypothesis for the particular audio data, and that the confidence value associated with the hypothesis that is output by the particular automated speech recognizer satisfies the particular confidence value threshold;
providing the hypothesis that is output by the particular automated speech recognizer, of the set of automated speech recognizers, as a top speech recognition hypothesis; and
transmitting a command to stop the at least one of the automated speech recognizers of the set of automated speech recognizers that has been provided the particular audio data and that is indicated as not yet finished generating the respective hypothesis for the particular audio data from finishing generating the respective hypothesis for the particular audio data.
(Dependent claims: 14, 15, 16, 17)
Specification