Integration of embedded and network speech recognizers
First Claim
1. A method, comprising:
receiving, at a client device, an audio stream that defines a voice command;
defining, using a first speech recognizer module stored at the client device, a first machine-readable voice command based at least in part on the audio stream;
receiving a first query result responsive to a first query sent to a client database, the first query including the first machine-readable voice command;
transmitting the audio stream to a remote server device such that the remote server device defines a second machine-readable voice command using a second speech recognizer module, the second machine-readable voice command being based at least in part on the audio stream;
receiving a second query result from the remote server device, the second query result being responsive to the transmitted audio stream;
displaying the first query result at a display of the client device, the displayed first query result including at least a first selectable result item;
displaying the second query result at the display of the client device, the displayed second query result including at least a second selectable result item, wherein the display of the first query result is not dependent upon the display of the second query result, and the display of the second query result is not dependent upon the display of the first query result;
storing at least a portion of the first query result and the second query result at a memory of the client device;
receiving, at the client device, a second audio stream that defines a subsequent voice command;
defining, using the first speech recognizer module, a third machine-readable voice command based at least in part on the subsequent voice command;
determining that the third machine-readable voice command is substantially similar to the first machine-readable voice command;
retrieving, from the memory of the client device, the stored first query result and the stored second query result when the third machine-readable voice command is determined to be substantially similar to the first machine-readable voice command;
transmitting the second audio stream associated with the subsequent voice command to the remote server device such that the remote server device defines a fourth machine-readable voice command using the second speech recognizer module, the fourth machine-readable voice command being based at least in part on the second audio stream;
receiving a third query result from the remote server device, the third query result being responsive to the transmitted second audio stream, and wherein the third query result is an updated version of the second query result; and
displaying the retrieved first query result, the retrieved second query result, and the third query result at the display of the client device.
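The storing, similarity-check, and retrieval steps of the claim can be sketched as follows. This is an illustrative sketch only, not the patented implementation: the claim does not say how "substantially similar" is measured, so the `difflib` ratio and the 0.8 threshold are assumptions, and `ResultCache`, `substantially_similar`, `handle_subsequent_command`, `fetch_remote`, and `display` are hypothetical names.

```python
from difflib import SequenceMatcher

# Assumed threshold; the claim does not define "substantially similar".
SIMILARITY_THRESHOLD = 0.8


def substantially_similar(a, b, threshold=SIMILARITY_THRESHOLD):
    """Return True when two machine-readable commands are close enough (assumed metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


class ResultCache:
    """Hypothetical client-side memory for earlier query results."""

    def __init__(self):
        self._entries = []  # (command, first_query_result, second_query_result)

    def store(self, command, first_result, second_result):
        self._entries.append((command, first_result, second_result))

    def lookup(self, command):
        """Return the stored (first, second) results for a similar command, or None."""
        for stored_command, first, second in self._entries:
            if substantially_similar(command, stored_command):
                return first, second
        return None


def handle_subsequent_command(cache, third_command, second_audio, fetch_remote, display):
    """Show cached results for a similar command, then the server's updated result.

    fetch_remote stands in for transmitting the second audio stream to the
    remote server device; display stands in for the client device's display.
    A cache miss would fall back to the full first-command flow (not shown).
    """
    cached = cache.lookup(third_command)
    if cached is not None:
        first, second = cached
        display("client (retrieved)", first)
        display("server (retrieved)", second)
    # The audio is still sent to the server, which returns the third query result.
    display("server (updated)", fetch_remote(second_audio))
```

In this sketch the retrieved results are shown immediately and the third query result is shown alongside them once the server responds, mirroring the final displaying step of the claim.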
2 Assignments
0 Petitions
Abstract
A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command, and displaying the first query result and the second query result on the client device.
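The flow described in the abstract can be sketched as two pipelines that run concurrently and report their results independently. This is a minimal sketch under stated assumptions: the recognizers, the client database, and the server call are hypothetical stubs (the "audio" is just text bytes), and every function name here is invented for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def local_recognize(audio):
    """Stub for the first (embedded) speech recognizer; the 'audio' is text bytes here."""
    return audio.decode("utf-8").lower()


def query_client_database(command):
    """Stub client database: match contacts on the first word of the command."""
    contacts = ["Barry Cage", "Barry Smith", "Alice Jones"]
    words = command.split()
    if not words:
        return []
    return [c for c in contacts if words[0] in c.lower()]


def remote_recognize_and_query(audio):
    """Stub for the remote server: second recognizer plus server-side query."""
    time.sleep(0.05)  # simulated network latency
    command = audio.decode("utf-8").lower()
    return ["web result for '" + command + "'"]


def local_pipeline(audio):
    return query_client_database(local_recognize(audio))


def perform_voice_command(audio, display):
    """Run both pipelines and display each result as soon as it is ready."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {
            pool.submit(local_pipeline, audio): "client",
            pool.submit(remote_recognize_and_query, audio): "server",
        }
        # Neither display call waits on the other pipeline finishing.
        for done in as_completed(futures):
            display(futures[done], done.result())
```

Calling `perform_voice_command(b"Barry Cage", print)` would print one line per source, in whichever order the two pipelines finish.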
46 Citations
14 Claims
1. A method, comprising the steps recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 12
6. A non-transitory processor-readable medium storing code representing instructions that when executed cause a processor of a client device to:
receive an audio stream that defines a voice command;
define, using a first speech recognizer module stored at the client device, a first machine-readable voice command based at least in part on the audio stream;
send a first query to a client database, the first query being based at least in part on the first machine-readable voice command;
receive a first query result responsive to the first query sent to the client database, the first query result including a list of M selectable result items, where M is a whole number;
transmit the audio stream to a remote server device such that the remote server device defines a second machine-readable voice command using a second speech recognizer module, the second machine-readable voice command being based at least in part on the audio stream;
receive a second query result from the remote server device, the second query result being responsive to the transmitted audio stream and including a list of N selectable result items, where N is a whole number;
output the first query result including the list of M selectable result items for display on the client device;
output the second query result including the list of N selectable result items for display on the client device, wherein the output of the first query result is not dependent upon the output of the second query result, and the output of the second query result is not dependent upon the output of the first query result;
initiate at least a portion of the first query result and the second query result to be stored at a memory of the client device;
receive a second audio stream that defines a subsequent voice command;
define, using the first speech recognizer module, a third machine-readable voice command based at least in part on the subsequent voice command;
determine that the third machine-readable voice command is substantially similar to the first machine-readable voice command;
retrieve, from the memory of the client device, the stored first query result and the stored second query result when the third machine-readable voice command is determined to be substantially similar to the first machine-readable voice command;
transmit the second audio stream associated with the subsequent voice command to the remote server device such that the remote server device defines a fourth machine-readable voice command using the second speech recognizer module, the fourth machine-readable voice command being based at least in part on the second audio stream;
receive a third query result from the remote server device, the third query result being responsive to the transmitted second audio stream and wherein the third query result is an updated version of the second query result and including a list of P selectable result items, where P is a whole number; and
output the third query result for display at the client device.
Dependent claims: 7, 8, 13
9. A system, comprising:
a first speech recognizer module, stored at a client device, configured to:
    receive an audio stream that defines a voice command;
    define a first machine-readable voice command based at least in part on the audio stream;
    receive a second audio stream that defines a subsequent voice command;
    define a third machine-readable voice command based at least in part on the subsequent voice command;
a client query manager configured to:
    receive a first query result including a first number of selectable result items responsive to sending a first query to a client database, the first query being based at least in part on the first machine-readable voice command;
    transmit the audio stream to a remote server device such that the remote server device defines a second machine-readable voice command using a second speech recognizer module, the second machine-readable voice command being based at least in part on the audio stream;
    receive a second query result including a second number of selectable result items from the remote server device, the second query result being responsive to the transmitted audio stream;
    determine that the third machine-readable voice command is substantially similar to the first machine-readable voice command;
    retrieve, from the memory of the client device, the stored first query result and the stored second query result from the storage device when the third machine-readable voice command is substantially similar to the first machine-readable voice command;
    transmit the second audio stream associated with the subsequent voice command to the remote server device such that the remote server device defines a fourth machine-readable voice command using the second speech recognizer module, the fourth machine-readable voice command being based at least in part on the second audio stream; and
    receive a third query result including a third number of selectable result items from the remote server device, the third query result being responsive to the transmitted second audio stream and wherein the third query result is an updated version of the second query result;
a display device configured to:
    display on the client device the first number of selectable result items corresponding to the first query result and display on the client device the second number of selectable result items corresponding to the second query result, wherein the display of the first number of selectable result items is not dependent upon the display of the second number of selectable result items, and the display of the second number of selectable result items is not dependent upon the display of the first number of selectable result items; and
    display on the client device the first number of selectable result items corresponding to the retrieved first query result, the second number of selectable result items corresponding to the retrieved second query result, and the third number of selectable result items corresponding to the third query result;
a microphone configured to receive the audio stream of the voice command and to provide the audio stream to the first speech recognizer module; and
a storage device configured to store at least a portion of the first query result and the second query result at a memory of the client device.
Dependent claims: 10, 11, 14
Specification