INTEGRATION OF EMBEDDED AND NETWORK SPEECH RECOGNIZERS
First Claim
1. A computer-implemented method comprising:
receiving a first audio data corresponding to a first user utterance;
obtaining, by a first speech recognizer, a transcription of the first user utterance and a speech recognition confidence value associated with the transcription of the first user utterance;
based on determining that the speech recognition confidence value fails to meet a threshold value, transmitting the first audio data to a server-based speech recognizer;
receiving, from a server, several search results associated with a second transcription of the first audio data, the second transcription of the first audio data being generated by the server-based speech recognizer;
presenting one or more of the search results to a user;
receiving a user selection of a particular search result from among the several search results; and
storing the transcription of the first user utterance in association with the data identifying the particular search result.
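The steps of claim 1 can be sketched in code as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the names `Recognition`, `VoiceSearchClient`, `local_recognizer`, `server_search`, the 0.6 threshold, and the in-memory dictionary used for storage are all hypothetical. The claim does not say what happens when the confidence value meets the threshold, so the sketch simply returns no server results in that case.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Recognition:
    """Output of the first (on-device) speech recognizer."""
    transcription: str
    confidence: float


@dataclass
class VoiceSearchClient:
    # Hypothetical callables standing in for the two recognizers.
    local_recognizer: Callable[[bytes], Recognition]
    # Sends audio to the server; returns search results that the server
    # derived from its own (second) transcription of the same audio.
    server_search: Callable[[bytes], List[str]]
    threshold: float = 0.6  # assumed value; the claim names no number
    # Maps a local transcription to the search result the user picked.
    learned: Dict[str, str] = field(default_factory=dict)

    def handle_utterance(self, audio: bytes) -> Tuple[Recognition, List[str]]:
        """Recognize locally; fall back to the server on low confidence."""
        local = self.local_recognizer(audio)
        if local.confidence >= self.threshold:
            # Assumption: no server round trip when the threshold is met.
            return local, []
        return local, self.server_search(audio)

    def record_selection(self, local: Recognition,
                         results: List[str], index: int) -> str:
        """Store the local transcription with the user's chosen result."""
        chosen = results[index]
        self.learned[local.transcription] = chosen
        return chosen
```

In use, a caller would present the list returned by `handle_utterance` to the user and pass the selected index to `record_selection`, which is the step that associates the first transcription with the chosen search result.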
Abstract
A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command, and displaying the first query result and the second query result on the client device.
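The flow described in the abstract can be illustrated with a short sketch. Everything named here is an assumption made for illustration: `perform_voice_command`, `local_translate`, `remote_query`, and `format_results` are hypothetical, the client database is modeled as a plain dictionary, and the two queries run one after the other because the abstract does not state their ordering.

```python
from typing import Callable, Dict, List


def perform_voice_command(
    audio: bytes,
    local_translate: Callable[[bytes], str],
    client_db: Dict[str, List[str]],
    remote_query: Callable[[bytes], List[str]],
) -> Dict[str, List[str]]:
    """Return the client-side and server-side query results for one command."""
    # First speech recognizer (on the client) produces the first
    # machine-readable voice command.
    first_command = local_translate(audio)
    # Query the client database with that command.
    first_result = client_db.get(first_command, [])
    # The same audio goes to the remote server, which runs the second
    # recognizer and returns the second query result.
    second_result = remote_query(audio)
    return {"client": first_result, "server": second_result}


def format_results(results: Dict[str, List[str]]) -> str:
    """Build the text that a client device might display for both results."""
    lines = []
    for source in ("client", "server"):
        for item in results[source]:
            lines.append(f"[{source}] {item}")
    return "\n".join(lines)
```

A real client would likely send the audio to the server without waiting for the local query to finish; the sequential version is kept here only to make the data flow easy to follow.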
18 Claims
1. A computer-implemented method comprising:
receiving a first audio data corresponding to a first user utterance;
obtaining, by a first speech recognizer, a transcription of the first user utterance and a speech recognition confidence value associated with the transcription of the first user utterance;
based on determining that the speech recognition confidence value fails to meet a threshold value, transmitting the first audio data to a server-based speech recognizer;
receiving, from a server, several search results associated with a second transcription of the first audio data, the second transcription of the first audio data being generated by the server-based speech recognizer;
presenting one or more of the search results to a user;
receiving a user selection of a particular search result from among the several search results; and
storing the transcription of the first user utterance in association with the data identifying the particular search result.
Dependent claims: 2, 3, 4, 5, 6.
7. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving a first audio data corresponding to a first user utterance;
obtaining, by a first speech recognizer, a transcription of the first user utterance and a speech recognition confidence value associated with the transcription of the first user utterance;
based on determining that the speech recognition confidence value fails to meet a threshold value, transmitting the first audio data to a server-based speech recognizer;
receiving, from a server, several search results associated with a second transcription of the first audio data, the second transcription of the first audio data being generated by the server-based speech recognizer;
presenting the several search results to a user;
receiving a user selection of a particular search result from among the several search results; and
storing the transcription of the first user utterance in association with the data identifying the particular search result.
Dependent claims: 8, 9, 10, 11, 12.
13. A computer-readable storage device storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
receiving a first audio data corresponding to a first user utterance;
obtaining, by a first speech recognizer, a transcription of the first user utterance and a speech recognition confidence value associated with the transcription of the first user utterance;
based on determining that the speech recognition confidence value fails to meet a threshold value, transmitting the first audio data to a server-based speech recognizer;
receiving, from a server, several search results associated with a second transcription of the first audio data, the second transcription of the first audio data being generated by the server-based speech recognizer;
presenting the several search results to a user;
receiving a user selection of a particular search result from among the several search results; and
storing the transcription of the first user utterance in association with the data identifying the particular search result.
Dependent claims: 14, 15, 16, 17, 18.
Specification