System and method for disambiguating multiple intents in a natural language dialog system
First Claim
1. A method comprising:
receiving, via an interactive voice recognition system, a user utterance and converting the user utterance to text;
generating multiple intents based on the text;
establishing, via the interactive voice recognition system, a respective confidence score for each intent in the multiple intents, wherein the respective confidence score for each intent is based on how much training data corresponding to the each intent was used to train a spoken language understanding module, wherein more training data corresponds to a higher confidence;
identifying a first intent and a second intent having confidence scores above a threshold, wherein the first intent and the second intent have a highest two confidence scores in the multiple intents and wherein both the first intent and the second intent are meant to be implemented; and
disambiguating the first intent and the second intent by presenting a disambiguation sub-dialog, via the interactive voice recognition system, wherein a user is offered a choice of which intent to process first, and wherein the disambiguating further comprises concatenating a first prompt associated with the first intent and a second prompt associated with the second intent, wherein the first prompt and the second prompt are predefined prompts from predefined prompts associated with intents.
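The steps of claim 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the intent labels, the `PROMPTS` table, and the scoring rule (each intent's share of the training data) are hypothetical. The claim itself only requires that more training data corresponds to a higher confidence.

```python
from dataclasses import dataclass

# Hypothetical predefined prompts, one per intent label (labels are illustrative).
PROMPTS = {
    "check_balance": "check your account balance",
    "pay_bill": "pay a bill",
    "report_outage": "report an outage",
}


@dataclass(frozen=True)
class Intent:
    label: str
    training_examples: int  # amount of training data the SLU module saw for this intent


def confidence(intent, all_intents):
    """Assumed scoring rule: share of training data, so more data scores higher."""
    total = sum(i.training_examples for i in all_intents)
    return intent.training_examples / total if total else 0.0


def top_two_above(intents, threshold):
    """Return the two highest-scoring intents, or None unless both exceed the threshold."""
    scored = sorted(
        ((confidence(i, intents), i) for i in intents),
        key=lambda pair: pair[0],
        reverse=True,
    )
    if len(scored) < 2 or scored[1][0] <= threshold:
        return None
    return scored[0][1], scored[1][1]


def disambiguation_prompt(first, second):
    """Concatenate the two predefined prompts into one sub-dialog question."""
    return f"Would you like to {PROMPTS[first.label]} or {PROMPTS[second.label]} first?"
```

With three candidate intents trained on 500, 300, and 200 examples, the assumed scores are 0.5, 0.3, and 0.2, so a threshold of 0.25 selects the first two and the sub-dialog asks which of them to handle first.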
Abstract
The present invention addresses the deficiencies in the prior art by providing an improved dialog for disambiguating a user utterance containing more than one intent. The invention comprises methods, computer-readable media, and systems for engaging in a dialog. The method embodiment of the invention relates to a method of disambiguating a user utterance containing at least two user intents. The method comprises establishing a confidence threshold for spoken language understanding to encourage that multiple intents are returned, determining whether a received utterance comprises a first intent and a second intent and, if the received utterance contains the first intent and the second intent, disambiguating the first intent and the second intent by presenting a disambiguation sub-dialog wherein the user is offered a choice of which intent to process first, wherein the user is first presented with the intent of the first or second intents having the lowest confidence score.
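The ordering rule in the abstract (present the lower-confidence intent first, then let the user choose which intent to process first) might look like the following. This is a hedged sketch: the function names and the `(label, score)` tuple layout are assumptions made for illustration and do not come from the patent.

```python
def presentation_order(scored_intents):
    """Order the two detected intents for the sub-dialog, lowest confidence first."""
    if len(scored_intents) != 2:
        raise ValueError("expected exactly two (label, score) pairs")
    return sorted(scored_intents, key=lambda pair: pair[1])


def processing_order(scored_intents, chosen_label):
    """Put the intent the user picked first; the other one is handled afterwards."""
    labels = [label for label, _ in scored_intents]
    if chosen_label not in labels:
        raise ValueError(f"unknown choice: {chosen_label}")
    return [chosen_label] + [label for label in labels if label != chosen_label]
```

For example, given `[("pay_bill", 0.62), ("check_balance", 0.48)]`, the sub-dialog would mention `check_balance` first, and a user reply selecting `pay_bill` would schedule `pay_bill` ahead of `check_balance`.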
20 Claims
1. A method comprising:
receiving, via an interactive voice recognition system, a user utterance and converting the user utterance to text;
generating multiple intents based on the text;
establishing, via the interactive voice recognition system, a respective confidence score for each intent in the multiple intents, wherein the respective confidence score for each intent is based on how much training data corresponding to the each intent was used to train a spoken language understanding module, wherein more training data corresponds to a higher confidence;
identifying a first intent and a second intent having confidence scores above a threshold, wherein the first intent and the second intent have a highest two confidence scores in the multiple intents and wherein both the first intent and the second intent are meant to be implemented; and
disambiguating the first intent and the second intent by presenting a disambiguation sub-dialog, via the interactive voice recognition system, wherein a user is offered a choice of which intent to process first, and wherein the disambiguating further comprises concatenating a first prompt associated with the first intent and a second prompt associated with the second intent, wherein the first prompt and the second prompt are predefined prompts from predefined prompts associated with intents.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system comprising:
a processor; and
a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising;
receiving, via an interactive voice recognition system, a user utterance and converting the user utterance to text;
generating multiple intents based on the text;
establishing, via the interactive voice recognition system, a respective confidence score for each intent in the multiple intents, wherein the respective confidence score for each intent is based on how much training data corresponding to the each intent was used to train a spoken language understanding module, wherein more training data corresponds to a higher confidence;
identifying a first intent and a second intent having confidence scores above a threshold, wherein the first intent and the second intent have a highest two confidence scores in the multiple intents and wherein both the first intent and the second intent are meant to be implemented; and
disambiguating the first intent and the second intent by presenting a disambiguation sub-dialog, via the interactive voice recognition system, wherein a user is offered a choice of which intent to process first, and wherein the disambiguating further comprises concatenating a first prompt associated with the first intent and a second prompt associated with the second intent, wherein the first prompt and the second prompt are predefined prompts from predefined prompts associated with intents.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
receiving, via an interactive voice recognition system, a user utterance and converting the user utterance to text;
generating multiple intents based on the text;
establishing, via the interactive voice recognition system, a respective confidence score for each intent in the multiple intents, wherein the respective confidence score for each intent is based on how much training data corresponding to the each intent was used to train a spoken language understanding module, wherein more training data corresponds to a higher confidence;
identifying a first intent and a second intent having confidence scores above a threshold, wherein the first intent and the second intent have a highest two confidence scores in the multiple intents and wherein both the first intent and the second intent are meant to be implemented; and
disambiguating the first intent and the second intent by presenting a disambiguation sub-dialog, via the interactive voice recognition system, wherein a user is offered a choice of which intent to process first, and wherein the disambiguating further comprises concatenating a first prompt associated with the first intent and a second prompt associated with the second intent, wherein the first prompt and the second prompt are predefined prompts from predefined prompts associated with intents.
Dependent claims: 16, 17, 18, 19, 20
Specification