Using context information to facilitate processing of commands in a virtual assistant
First Claim
1. A computer-implemented method for disambiguating user input to perform a task on a computing device having at least one processor, comprising:
at an output device, prompting a user for input;
at an input device, receiving spoken user input;
at a processor communicatively coupled to the output device and to the input device, receiving context information from a context source;
at the processor, generating a first plurality of candidate interpretations of the received spoken user input;
at the processor, disambiguating the intent of a word in the first plurality of candidate interpretations based on the context information to generate a second plurality of candidate interpretations, wherein the second plurality of candidate interpretations is a subset of the first plurality of candidate interpretations;
at the processor, sorting the second plurality of candidate interpretations by relevance based on the context information;
at the processor, deriving a representation of user intent based on the sorted second plurality of candidate interpretations;
at the processor, identifying at least one task and at least one parameter for the task, based at least in part on the derived representation of user intent;
at the processor, executing the at least one task using the at least one parameter, to derive a result;
at the processor, generating a dialog response based on the derived result; and
at the output device, outputting the generated dialog response.
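The steps recited in claim 1 can be read as a pipeline. The following is a minimal Python sketch of that flow, not the patented implementation: the `Interpretation` type, the function names, the toy candidate list, and the context keys `contacts` and `recent_calls` are all hypothetical, chosen only to illustrate how a first plurality of candidates is narrowed to a subset, sorted by relevance, and turned into a task, a parameter, and a dialog response.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpretation:
    """One candidate reading of the spoken input (hypothetical structure)."""
    text: str
    task: str
    parameter: str


def generate_candidates(spoken: str) -> list[Interpretation]:
    # Toy stand-in for speech recognition and language interpretation:
    # returns the "first plurality of candidate interpretations".
    if spoken == "call herb":
        return [
            Interpretation("call Herb", "call", "Herb"),
            Interpretation("call Herbert", "call", "Herbert"),
            Interpretation("call her", "call", "her"),
        ]
    return []


def disambiguate(candidates, context):
    # Keep only candidates whose parameter matches a known contact, so the
    # "second plurality" is a subset of the first.
    return [c for c in candidates if c.parameter in context["contacts"]]


def sort_by_relevance(candidates, context):
    # Assumed relevance signal: how often the contact appears in recent calls.
    recent = context["recent_calls"]
    return sorted(candidates, key=lambda c: recent.count(c.parameter), reverse=True)


def execute(task: str, parameter: str) -> str:
    # Placeholder task execution that only describes what would happen.
    if task == "call":
        return f"calling {parameter}"
    raise ValueError(f"unknown task: {task}")


def handle(spoken: str, context: dict) -> str:
    first = generate_candidates(spoken)
    second = disambiguate(first, context)
    ranked = sort_by_relevance(second, context)
    if not ranked:
        return "Sorry, I did not understand that."
    intent = ranked[0]                               # derived user intent
    result = execute(intent.task, intent.parameter)  # task and parameter
    return f"OK, {result}."                          # dialog response
```

With a context of `{"contacts": {"Herb", "Herbert"}, "recent_calls": ["Herbert", "Herb", "Herbert"]}`, calling `handle("call herb", context)` drops the "call her" reading, ranks "Herbert" first, and returns a response string that an output device could then speak or display.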
Abstract
A virtual assistant uses context information to supplement natural language or gestural input from a user. Context helps to clarify the user's intent and to reduce the number of candidate interpretations of the user's input, and reduces the need for the user to provide excessive clarification input. Context can include any available information that is usable by the assistant to supplement explicit user input to constrain an information-processing problem and/or to personalize results. Context can be used to constrain solutions during various phases of processing, including, for example, speech recognition, natural language processing, task flow processing, and dialog generation.
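As an illustration of how context can reduce both the number of candidate interpretations and the need for clarification, the sketch below resolves a pronoun against context. It is a hedged example only: the context keys `address_book`, `on_screen_contact`, and `last_mentioned_person`, and both function names, are assumptions made for this sketch and are not taken from the specification.

```python
PRONOUNS = {"her", "him", "them"}


def resolve_reference(word: str, context: dict) -> list[str]:
    """Return the candidate referents for a word, narrowed by context."""
    if word not in PRONOUNS:
        return [word]
    # Without further context, every known person remains a candidate.
    candidates = list(context.get("address_book", []))
    # Assumed context sources, checked from most to least specific.
    for key in ("on_screen_contact", "last_mentioned_person"):
        value = context.get(key)
        if value in candidates:
            return [value]
    return candidates


def needs_clarification(candidates: list[str]) -> bool:
    # The assistant has to ask the user only when context did not
    # narrow the referent down to exactly one candidate.
    return len(candidates) != 1
```

With only an address book of three people, "her" stays ambiguous and the assistant would have to ask a follow-up question. Once the context also records which contact is currently on screen, the same word resolves to a single person and no clarification is needed.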
3560 Citations
95 Claims
1. A computer-implemented method for disambiguating user input to perform a task on a computing device having at least one processor, comprising:
at an output device, prompting a user for input;
at an input device, receiving spoken user input;
at a processor communicatively coupled to the output device and to the input device, receiving context information from a context source;
at the processor, generating a first plurality of candidate interpretations of the received spoken user input;
at the processor, disambiguating the intent of a word in the first plurality of candidate interpretations based on the context information to generate a second plurality of candidate interpretations, wherein the second plurality of candidate interpretations is a subset of the first plurality of candidate interpretations;
at the processor, sorting the second plurality of candidate interpretations by relevance based on the context information;
at the processor, deriving a representation of user intent based on the sorted second plurality of candidate interpretations;
at the processor, identifying at least one task and at least one parameter for the task, based at least in part on the derived representation of user intent;
at the processor, executing the at least one task using the at least one parameter, to derive a result;
at the processor, generating a dialog response based on the derived result; and
at the output device, outputting the generated dialog response.
Dependent claims: 2-35
36. A computer program product for disambiguating user input to perform a task on a computing device having at least one processor, comprising:
a non-transitory computer-readable storage medium; and
computer program code, encoded on the medium, configured to cause at least one processor communicatively coupled to an output device and to an input device to perform the steps of:
causing the output device to prompt a user for input;
receiving spoken user input via the input device;
receiving context information from a context source;
generating a first plurality of candidate interpretations of the received spoken user input;
disambiguating the intent of a word in the first plurality of candidate interpretations based on the context information to generate a second plurality of candidate interpretations, wherein the second plurality of candidate interpretations is a subset of the first plurality of candidate interpretations;
at the processor, sorting the second plurality of candidate interpretations by relevance based on the context information;
at the processor, deriving a representation of user intent based on the sorted second plurality of candidate interpretations;
identifying at least one task and at least one parameter for the task, based at least in part on the derived representation of user intent;
executing the at least one task using the at least one parameter, to derive a result;
generating a dialog response based on the derived result; and
causing the output device to output the generated dialog response.
Dependent claims: 37-64
65. A system for disambiguating user input to perform a task, comprising:
an output device, configured to prompt a user for input;
an input device, configured to receive spoken user input;
at least one processor, communicatively coupled to the output device and to the input device, configured to perform the steps of:
receiving context information from a context source;
generating a first plurality of candidate interpretations of the received spoken user input;
disambiguating the intent of a word in the first plurality of candidate interpretations based on the context information to generate a second plurality of candidate interpretations, wherein the second plurality of candidate interpretations is a subset of the first plurality of candidate interpretations;
sorting the second plurality of candidate interpretations by relevance based on the context information;
deriving a representation of user intent based on the sorted second plurality of candidate interpretations;
identifying at least one task and at least one parameter for the task, based at least in part on the derived representation of user intent;
executing the at least one task using the at least one parameter, to derive a result; and
generating a dialog response based on the derived result.
Dependent claims: 66-95
Specification