Multiple recognizer speech recognition
Abstract
The subject matter of this specification can be embodied in, among other things, a method that includes receiving audio data that corresponds to an utterance and obtaining a first transcription of the utterance that was generated using a limited speech recognizer. The limited speech recognizer includes a language model trained over a limited speech recognition vocabulary that includes one or more terms from a voice command grammar but fewer than all terms of an expanded grammar. A second transcription of the utterance, generated using an expanded speech recognizer, is also obtained. The expanded speech recognizer includes a language model trained over an expanded speech recognition vocabulary that includes all of the terms of the expanded grammar. The utterance is classified based at least on a portion of the first transcription or the second transcription.
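As a rough illustration of the classification step described in the abstract, the sketch below assumes that each recognizer has already returned a plain-text transcription. The command terms, the class labels `voice_command` and `other`, and the function name `classify_utterance` are hypothetical and do not come from the patent, which does not specify how the classification is performed.

```python
from typing import Set

# Hypothetical voice command grammar; the source does not list its terms.
COMMAND_TERMS: Set[str] = {"call", "text", "open", "play"}


def classify_utterance(limited_transcription: str, expanded_transcription: str) -> str:
    """Classify an utterance from a portion (here, the first word) of either transcription.

    Assumption: an utterance counts as a voice command when the leading word of
    the limited or the expanded transcription is a known command term.
    """
    for transcription in (limited_transcription, expanded_transcription):
        words = transcription.lower().split()
        if words and words[0] in COMMAND_TERMS:
            return "voice_command"
    return "other"
```

For example, `classify_utterance("call <unknown>", "call Arnie Pye")` would return `voice_command` under these assumptions, because the limited recognizer recognized the command term even though it could not transcribe the name.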
19 Claims
1. A computer-implemented method comprising:
receiving (i) a first transcription of a particular utterance from a first computing device and (ii) a second transcription of the particular utterance from a second computing device;
determining a grammatical alignment between the first transcription and the second transcription based on a comparison between the first transcription and the second transcription;
associating each word or phrase within the first transcription and the second transcription with a measure respectively calculated for each word or phrase within the first transcription and the second transcription, the measure corresponding to a likelihood of relevance for each word or phrase within the first transcription and the second transcription;
comparing the measure associated with each word or phrase within the first transcription and the second transcription;
generating a combined transcription from the first transcription and the second transcription that represents the particular utterance based on the comparison of the measure associated with each word or phrase within the first transcription and the second transcription; and
providing the combined transcription as a speech recognizer output of the particular utterance.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
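The steps recited in claim 1 can be sketched in a few lines. This is a minimal illustration and not the patented method: it assumes each recognizer returns a list of `(word, score)` pairs, treats the score as the "measure corresponding to a likelihood of relevance", and substitutes a plain word-level alignment from Python's `difflib.SequenceMatcher` for the claimed grammatical alignment. The names `combine`, `mean_score`, and `keep_threshold` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# (word, hypothetical relevance measure in [0, 1])
Scored = List[Tuple[str, float]]


def mean_score(segment: Scored) -> float:
    """Average measure of a segment; 0.0 for an empty segment."""
    return sum(score for _, score in segment) / len(segment) if segment else 0.0


def combine(first: Scored, second: Scored, keep_threshold: float = 0.5) -> str:
    """Align two scored transcriptions word by word and merge them."""
    a = [word for word, _ in first]
    b = [word for word, _ in second]
    out: List[str] = []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        seg_a, seg_b = first[i1:i2], second[j1:j2]
        if tag == "equal":
            # Both transcriptions agree on this span.
            out.extend(word for word, _ in seg_a)
        elif tag == "replace":
            # The transcriptions disagree: keep the span with the higher mean measure.
            best = seg_a if mean_score(seg_a) >= mean_score(seg_b) else seg_b
            out.extend(word for word, _ in best)
        else:
            # "delete": span only in the first; "insert": span only in the second.
            seg = seg_a if tag == "delete" else seg_b
            if mean_score(seg) >= keep_threshold:
                out.extend(word for word, _ in seg)
    return " ".join(out)


first = [("call", 0.9), ("bob", 0.4), ("now", 0.8)]
second = [("call", 0.8), ("rob", 0.7), ("now", 0.9)]
print(combine(first, second))  # prints "call rob now"
```

Where the two transcriptions disagree, the sketch keeps whichever span has the higher average measure. Spans that appear in only one transcription are kept only if their average measure reaches `keep_threshold`, which is an arbitrary choice made for this example.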
9. A system comprising:
one or more processors and one or more storage devices storing instructions that are operable, when executed by the one or more processors, to cause the one or more processors to perform operations comprising:
receiving (i) a first transcription of a particular utterance from a first computing device and (ii) a second transcription of the particular utterance from a second computing device;
determining a grammatical alignment between the first transcription and the second transcription based on a comparison between the first transcription and the second transcription;
associating each word or phrase within the first transcription and the second transcription with a measure respectively calculated for each word or phrase within the first transcription and the second transcription, the measure corresponding to a likelihood of relevance for each word or phrase within the first transcription and the second transcription;
comparing the measure associated with each word or phrase within the first transcription and the second transcription;
generating a combined transcription from the first transcription and the second transcription that represents the particular utterance based on the comparison of the measure associated with each word or phrase within the first transcription and the second transcription; and
providing the combined transcription as a speech recognizer output of the particular utterance.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A non-transitory computer-readable medium storing instructions executable by one or more computers that, upon such execution, cause the one or more computers to perform operations comprising:
receiving (i) a first transcription of a particular utterance from a first computing device and (ii) a second transcription of the particular utterance from a second computing device;
determining a grammatical alignment between the first transcription and the second transcription based on a comparison between the first transcription and the second transcription;
associating each word or phrase within the first transcription and the second transcription with a measure respectively calculated for each word or phrase within the first transcription and the second transcription, the measure corresponding to a likelihood of relevance for each word or phrase within the first transcription and the second transcription;
comparing the measure associated with each word or phrase within the first transcription and the second transcription;
generating a combined transcription from the first transcription and the second transcription that represents the particular utterance based on the comparison of the measure associated with each word or phrase within the first transcription and the second transcription; and
providing the combined transcription as a speech recognizer output of the particular utterance.
Dependent claims: 18, 19
Specification