HYBRID SPEECH RECOGNITION
First Claim
1. A method comprising:
inputting speech data;
recognizing the speech data on a first recognizer into a first recognition result comprising recognized text;
evaluating confidence with respect to the first recognition result; and
if not confident as to at least part of the first recognition result, sending non-confident data to a second recognizer in conjunction with any confidently recognized text of the first recognition result, receiving a second recognition result from the second recognizer, and outputting text corresponding to the speech data based on the first recognition result and the second recognition result.
Abstract
Described is a technology by which speech is locally and remotely recognized in a hybrid way. Speech is input and recognized locally, with remote recognition invoked if locally recognized speech data was not confidently recognized. The part of the speech that was not confidently recognized is sent to the remote recognizer, along with any confidently recognized text, which the remote recognizer may use as context data in interpreting the part of the speech data that was sent. Alternative text candidates may be sent instead of corresponding speech to the remote recognizer.
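The flow described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Segment` type, the 0.8 threshold, and the `remote_recognize` callback (which stands in for the remote recognizer) are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Segment:
    """One span of input speech as seen by the local recognizer (hypothetical type)."""
    audio: bytes       # raw speech data for this span
    text: str          # local recognizer's best hypothesis
    confidence: float  # local confidence, assumed to lie in [0, 1]


# Hypothetical remote recognizer interface: receives the confidently recognized
# text as context (None marks a gap) plus the audio for each gap, and returns
# replacement text keyed by segment index.
RemoteRecognizer = Callable[[List[Optional[str]], Dict[int, bytes]], Dict[int, str]]


def hybrid_recognize(
    segments: List[Segment],
    remote_recognize: RemoteRecognizer,
    threshold: float = 0.8,  # assumed confidence cutoff
) -> str:
    # Indices of segments the local recognizer was not confident about.
    uncertain = [i for i, s in enumerate(segments) if s.confidence < threshold]

    # Everything was recognized confidently: no remote call is needed.
    if not uncertain:
        return " ".join(s.text for s in segments)

    # Send confident text as context, and speech data only for the uncertain parts.
    context = [s.text if s.confidence >= threshold else None for s in segments]
    audio = {i: segments[i].audio for i in uncertain}
    remote_text = remote_recognize(context, audio)

    # Merge: remote text fills the gaps, local text is kept elsewhere.
    return " ".join(remote_text.get(i, s.text) for i, s in enumerate(segments))
```

In this sketch the remote recognizer is invoked only when at least one segment falls below the threshold, and the final text combines both results. If the remote side returns nothing for a gap, the local hypothesis is kept as a fallback, which is a design choice of the example rather than something the abstract specifies.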
20 Claims
1. A method comprising:
inputting speech data;
recognizing the speech data on a first recognizer into a first recognition result comprising recognized text;
evaluating confidence with respect to the first recognition result; and
if not confident as to at least part of the first recognition result, sending non-confident data to a second recognizer in conjunction with any confidently recognized text of the first recognition result, receiving a second recognition result from the second recognizer, and outputting text corresponding to the speech data based on the first recognition result and the second recognition result.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A system comprising:
a memory; and
a processor which is operatively coupled to the memory and which executes code stored in the memory, the processor, in response to execution of the code, being configured to:
receive input data and to provide the input data for recognition by a local recognizer into local recognition results,
process the local recognition results to determine whether the local recognition results meet a confidence criterion, and if not, transmit a combination of confident data and non-confident data corresponding to the local recognition results to a remote recognizer to obtain remote recognition results, and to use the remote recognition results to output a final recognition result corresponding to the input data.
Dependent claims: 14, 15, 16, 17, 18
19. One or more processor-readable media having processor-executable instructions, which when executed perform steps, comprising:
receiving local recognition data from a local speech recognizer based upon input speech, the local recognition data including word sets, each word set comprising one or more words and having an associated confidence score;
processing the local recognition data into one or more confidently-recognized word sets and one or more non-confidently-recognized word sets;
sending text corresponding to the one or more confidently-recognized word sets, and speech data corresponding to the one or more non-confidently-recognized word sets, to a remote recognizer;
receiving remote recognition data from the remote recognizer; and
outputting a final recognition result based at least in part on the remote recognition data.
Dependent claims: 20
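The processing and sending steps of claim 19 might look like the sketch below. It assumes a hypothetical `WordSet` record, a hypothetical score threshold of 0.8, and a simple list-of-dicts request layout; none of these are given in the claim, which does not define how confidence is judged or how the request is encoded.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class WordSet:
    """One or more words with an associated confidence score (hypothetical type)."""
    words: List[str]
    score: float   # confidence score from the local recognizer
    audio: bytes   # speech data the words were recognized from


def partition(
    word_sets: List[WordSet], threshold: float = 0.8  # assumed cutoff
) -> Tuple[List[WordSet], List[WordSet]]:
    """Split word sets into (confidently recognized, non-confidently recognized)."""
    confident = [ws for ws in word_sets if ws.score >= threshold]
    non_confident = [ws for ws in word_sets if ws.score < threshold]
    return confident, non_confident


def build_remote_request(
    word_sets: List[WordSet], threshold: float = 0.8
) -> List[Dict[str, Union[int, str, bytes]]]:
    """Build an ordered request: text for confident sets, speech data for the rest."""
    request: List[Dict[str, Union[int, str, bytes]]] = []
    for position, ws in enumerate(word_sets):
        if ws.score >= threshold:
            # Confident: send only the recognized text, which can serve as context.
            request.append({"position": position, "text": " ".join(ws.words)})
        else:
            # Not confident: send the speech data so it can be re-recognized.
            request.append({"position": position, "speech": ws.audio})
    return request
```

Keeping the original position of each word set in the request is an assumption made here so that the remote recognition data could be aligned with the local results when the final recognition result is assembled.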
Specification