Multi-dimensional disambiguation of voice commands
First Claim
1. A computer-implemented method comprising:
obtaining two or more candidate transcriptions of a single voice command;
identifying one or more possible intended actions for each of the two or more candidate transcriptions of the single voice command, including identifying two or more possible intended actions for a particular transcription of the two or more candidate transcriptions of the single voice command;
providing information for display, the information identifying (i) the two or more candidate transcriptions of the single voice command, and (ii) the one or more possible intended actions for each of the two or more transcriptions of the single voice command, including the two or more possible intended actions for the particular transcription;
receiving, by one or more computers, data indicating a selection of a particular possible intended action from among the displayed one or more possible intended actions for each of the two or more transcriptions of the single voice command, and the displayed two or more possible intended actions for the particular transcription; and
invoking the selected particular possible intended action.
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing voice commands. In one aspect, a method includes receiving an audio signal at a server, performing, by the server, speech recognition on the audio signal to identify one or more candidate terms that match one or more portions of the audio signal, identifying one or more possible intended actions for each candidate term, providing information for display on a client device, the information specifying the candidate terms and the actions for each candidate term, receiving from the client device an indication of an action selected by a user, where the action was selected from among the actions included in the provided information, and invoking the action selected by the user.
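The server-side flow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Action` type, the `ACTION_KINDS` table, the function names, and the sample candidate terms are all hypothetical, and speech recognition itself is assumed to have already produced the candidate terms.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Action:
    """A possible intended action for one candidate term (hypothetical type)."""
    kind: str    # e.g. "call", "search", "navigate"
    target: str  # the candidate term the action applies to


# Hypothetical lookup table; a real system would derive actions from contacts,
# search indexes, and similar sources rather than a hard-coded dictionary.
ACTION_KINDS: Dict[str, List[str]] = {
    "call bill": ["call", "search"],
    "call phil": ["call"],
    "kabul": ["search", "navigate"],
}


def identify_actions(term: str) -> List[Action]:
    """Return one or more possible intended actions for a candidate term."""
    kinds = ACTION_KINDS.get(term, ["search"])  # assumed fallback: web search
    return [Action(kind, term) for kind in kinds]


def build_display_info(candidate_terms: List[str]) -> List[Tuple[str, Action]]:
    """Pair every candidate term with each of its actions for display."""
    return [
        (term, action)
        for term in candidate_terms
        for action in identify_actions(term)
    ]


def invoke(selection: Action, handlers: Dict[str, Callable[[str], str]]) -> str:
    """Invoke the action the user selected from the displayed list."""
    return handlers[selection.kind](selection.target)


# Assumed output of speech recognition for a single voice command.
candidates = ["call bill", "call phil", "kabul"]
display = build_display_info(candidates)

# Stand-in handlers that only describe what would happen.
handlers = {
    "call": lambda t: f"calling:{t}",
    "search": lambda t: f"searching:{t}",
    "navigate": lambda t: f"navigating:{t}",
}

# Simulate the client reporting that the user picked the second entry.
chosen = display[1][1]
result = invoke(chosen, handlers)
```

The flattened list of (term, action) pairs is what gives the disambiguation its two dimensions: the user resolves both which term was spoken and which action was meant with a single selection.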
139 Citations
19 Claims
1. A computer-implemented method comprising:
obtaining two or more candidate transcriptions of a single voice command;
identifying one or more possible intended actions for each of the two or more candidate transcriptions of the single voice command, including identifying two or more possible intended actions for a particular transcription of the two or more candidate transcriptions of the single voice command;
providing information for display, the information identifying (i) the two or more candidate transcriptions of the single voice command, and (ii) the one or more possible intended actions for each of the two or more transcriptions of the single voice command, including the two or more possible intended actions for the particular transcription;
receiving, by one or more computers, data indicating a selection of a particular possible intended action from among the displayed one or more possible intended actions for each of the two or more transcriptions of the single voice command, and the displayed two or more possible intended actions for the particular transcription; and
invoking the selected particular possible intended action.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A system comprising:
one or more computers; and
a computer-readable medium coupled to the one or more computers having instructions stored thereon which, when executed by the one or more computers, cause the one or more computers to perform operations comprising:
obtaining two or more candidate transcriptions of a single voice command;
identifying one or more possible intended actions for each of the two or more candidate transcriptions of the single voice command, including identifying two or more possible intended actions for a particular transcription of the two or more candidate transcriptions of the single voice command;
providing information for display, the information identifying (i) the two or more candidate transcriptions of the single voice command, and (ii) the one or more possible intended actions for each of the two or more transcriptions of the single voice command, including the two or more possible intended actions for the particular transcription;
receiving data indicating a selection of a particular possible intended action from among the displayed one or more possible intended actions for each of the two or more transcriptions of the single voice command, and the displayed two or more possible intended actions for the particular transcription; and
invoking the selected particular possible intended action.
Dependent claims: 13
14. A computer-implemented method comprising:
obtaining information specifying two or more displayed candidate transcriptions of a single voice command and one or more displayed possible intended actions for each of the two or more displayed candidate transcriptions of the single voice command, including two or more displayed possible intended actions for a particular transcription of the two or more displayed candidate transcriptions of the single voice command;
receiving data indicating a selection of a particular displayed possible intended action from among the one or more possible displayed intended actions for each of the two or more transcriptions of the single voice command and the two or more possible displayed intended actions for the particular transcription;
providing data indicating the selection of the particular displayed possible intended action to a server; and
invoking the selected particular displayed possible intended action.
Dependent claims: 15, 16
17. A non-transitory computer storage medium encoded with a computer program, the program comprising instructions that when executed by data processing apparatus cause the data processing apparatus to perform operations comprising:
obtaining information specifying two or more displayed candidate transcriptions of a single voice command and one or more displayed possible intended actions for each of the two or more displayed candidate transcriptions of the single voice command, including two or more displayed possible intended actions for a particular transcription of the two or more displayed candidate transcriptions of the single voice command;
receiving data indicating a selection of a particular displayed possible intended action from among the one or more possible displayed intended actions for each of the two or more transcriptions of the single voice command and the two or more possible displayed intended actions for the particular transcription;
providing data indicating the selection of the particular displayed possible intended action to a server; and
invoking the selected particular displayed possible intended action.
Dependent claims: 18, 19
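The client-side operations recited in claims 14 and 17 (obtain the displayed options, receive a selection, report it to a server, invoke the action) might look roughly like the sketch below. Everything here is an assumption made for illustration: the `DisplayedOption` type, the `handle_selection` function, the payload shape, and the injected `send_to_server` and `invoke_locally` callables are hypothetical and do not come from the patent.

```python
from typing import Callable, Dict, List, NamedTuple


class DisplayedOption(NamedTuple):
    """One displayed (transcription, action) pair (hypothetical type)."""
    transcription: str
    action: str


def handle_selection(
    options: List[DisplayedOption],
    selected_index: int,
    send_to_server: Callable[[Dict[str, str]], None],
    invoke_locally: Callable[[DisplayedOption], str],
) -> str:
    """Report the user's selection to the server, then invoke the action."""
    chosen = options[selected_index]
    # Assumed payload shape; the claims only require data indicating the selection.
    send_to_server({"transcription": chosen.transcription, "action": chosen.action})
    return invoke_locally(chosen)


# Two transcriptions of one voice command; the first has two possible actions.
options = [
    DisplayedOption("call bill", "call"),
    DisplayedOption("call bill", "search"),
    DisplayedOption("kabul", "search"),
]

sent: List[Dict[str, str]] = []  # stands in for a network call to the server
outcome = handle_selection(
    options,
    selected_index=0,
    send_to_server=sent.append,
    invoke_locally=lambda opt: f"{opt.action}:{opt.transcription}",
)
```

Passing the server call and the invocation in as callables keeps the sketch independent of any particular transport or platform API.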
Specification