System and method for disambiguating multiple intents in a natural language dialog system

US 9,009,046 B1
Filed: 09/27/2005
Issued: 04/14/2015
Est. Priority Date: 09/27/2005
Status: Active Grant

Abstract

The present invention addresses the deficiencies in the prior art by providing an improved dialog for disambiguating a user utterance containing more than one intent. The invention comprises methods, computer-readable media, and systems for engaging in a dialog. The method embodiment of the invention relates to a method of disambiguating a user utterance containing at least two user intents. The method comprises establishing a confidence threshold for spoken language understanding to encourage that multiple intents are returned, determining whether a received utterance comprises a first intent and a second intent and, if the received utterance contains the first intent and the second intent, disambiguating the first intent and the second intent by presenting a disambiguation sub-dialog wherein the user is offered a choice of which intent to process first, wherein the user is first presented with the intent of the first or second intents having the lowest confidence score.
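The effect of the confidence threshold described in the abstract can be illustrated with a minimal sketch. The intent names, scores, and threshold values below are hypothetical and are not taken from the patent; the point is only that a lower threshold lets more than one intent through, which is what makes a disambiguation sub-dialog possible.

```python
def intents_above(scores: dict, threshold: float) -> list:
    """Return the names of intents whose confidence exceeds the threshold."""
    return [name for name, score in scores.items() if score > threshold]


# Hypothetical confidence scores for one utterance.
scores = {"billing": 0.72, "payment": 0.55, "agent": 0.20}

strict = intents_above(scores, 0.7)   # only one intent survives
lenient = intents_above(scores, 0.5)  # two intents survive, so disambiguation is needed
```

With the stricter value only `billing` is returned, while the more lenient value returns both `billing` and `payment`.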

18 Claims

1. A method comprising:
- receiving, via an interactive voice recognition system, a user utterance and converting the user utterance to text;
  
  generating multiple intents based on the text;
  
  establishing, via the interactive voice recognition system, a confidence score for each intent in the multiple intents, wherein the confidence score for each intent is based on how much training data corresponding to the each intent was used to train a spoken language understanding module, where more training data corresponds to a higher confidence;
  
  when only a single intent in the multiple intents has a confidence score above a threshold:
  
  identifying a plurality of call types associated with the multiple intents; and
  
  applying predefined precedence rules to respond to only a single call type in the plurality of call types, the single call type associated with the single intent; and
  
  when multiple intents have confidence scores above the threshold:
  
  identifying a first intent and a second intent based on the confidence scores for the multiple intents, wherein the first intent and the second intent have a highest two confidence scores in the multiple intents; and
  
  disambiguating the first intent and the second intent by presenting a disambiguation sub-dialog, via the interactive voice recognition system, wherein a user is offered a choice of which intent to process first, wherein the user is first presented with one of the first intent and the second intent having a lowest confidence score between the first intent and the second intent.
- Dependent Claims (2, 3, 4, 5, 6)
  - 2. The method of claim 1, wherein the disambiguation sub-dialog presents one of the first intent and second intent having a highest confidence score between the first intent and the second intent last.
  - 3. The method of claim 1, further comprising:
    - receiving a disambiguation utterance from the user clarifying which of the first intent and the second intent should be processed first.
  - 4. The method of claim 1, wherein when the received utterance comprises the first intent and the second intent, then disambiguating the first intent and the second intent further comprises concatenating prompts from a table of call types.
  - 5. The method of claim 1, wherein when the received utterance comprises a customer service representative request plus an intent, then disambiguating the received utterance further comprises concatenating prompts from a table of call types.
  - 6. The method of claim 5, wherein when the received utterance comprises a customer service representative request plus the first intent and the second intent, then disambiguating the first intent and the second intent further comprises concatenating prompts from the table, wherein one of the first intent and the second intent having the lowest confidence score between the first intent and the second intent is played first and one of the first intent and the second intent having a highest confidence score between the first intent and the second intent is played last.
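The branching in claim 1 can be sketched as follows. This is one possible reading under stated assumptions, not the patented implementation: the scoring rule (training-example count divided by the largest count), the `ScoredIntent` type, the `route` function, the call-type names, and the "reprompt" fallback are all hypothetical. The claim only requires that more training data corresponds to a higher confidence, that precedence rules pick one call type when a single intent clears the threshold, and that the two highest-scoring intents are offered lowest-confidence first when several clear it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredIntent:
    name: str
    call_types: tuple   # call types associated with this intent (hypothetical)
    confidence: float


def confidence_from_training(counts: dict) -> dict:
    """Assumed scoring: more training examples gives a higher score in (0, 1]."""
    top = max(counts.values())
    return {name: n / top for name, n in counts.items()}


def route(intents: list, threshold: float, precedence: list):
    above = [i for i in intents if i.confidence > threshold]
    if len(above) == 1:
        # Single-intent branch: precedence rules select one call type
        # among those associated with the confident intent.
        chosen = min(above[0].call_types, key=precedence.index)
        return ("respond", [chosen])
    if len(above) >= 2:
        # Multi-intent branch: take the two highest scores, then
        # present the lower-confidence one first.
        top_two = sorted(above, key=lambda i: i.confidence, reverse=True)[:2]
        low_first = sorted(top_two, key=lambda i: i.confidence)
        return ("disambiguate", [i.name for i in low_first])
    # No intent above the threshold: not covered by the claim (assumed fallback).
    return ("reprompt", [])


# Hypothetical training counts and precedence order.
scores = confidence_from_training({"billing": 900, "payment": 600, "agent": 100})
PRECEDENCE = ["billing_dispute", "billing_explain", "payment_make", "agent_request"]
intents = [
    ScoredIntent("billing", ("billing_explain", "billing_dispute"), scores["billing"]),
    ScoredIntent("payment", ("payment_make",), scores["payment"]),
    ScoredIntent("agent", ("agent_request",), scores["agent"]),
]
```

Calling `route(intents, 0.8, PRECEDENCE)` takes the single-intent branch, while `route(intents, 0.5, PRECEDENCE)` takes the disambiguation branch and lists `payment` before `billing`.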

7. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
- receiving, via an interactive voice recognition system, a user utterance and converting the user utterance to text;
  
  generating multiple intents based on the text;
  
  establishing, via the interactive voice recognition system, a confidence score for each intent in the multiple intents, wherein the confidence score for each intent is based on how much training data corresponding to the each intent was used to train a spoken language understanding module, where more training data corresponds to a higher confidence;
  
  when only a single intent in the multiple intents has a confidence score above a threshold:
  
  identifying a plurality of call types associated with the multiple intents; and
  
  applying predefined precedence rules to respond to only a single call type in the plurality of call types, the single call type associated with the single intent; and
  
  when multiple intents have confidence scores above the threshold:
  
  identifying a first intent and a second intent based on the confidence scores for the multiple intents, wherein the first intent and the second intent have highest two confidence scores in the multiple intents; and
  
  disambiguating the first intent and the second intent by presenting a disambiguation sub-dialog, via the interactive voice recognition system, wherein a user is offered a choice of which intent to process first, wherein the user is first presented with one of the first intent and the second intent having a lowest confidence score between the first intent and the second intent.
- Dependent Claims (8, 9, 10, 11, 12)
  - 8. The computer-readable storage device of claim 7, wherein the disambiguation sub-dialog presents one of the first intent and second intent having a highest confidence score between the first intent and the second intent last.
  - 9. The computer-readable storage device of claim 7 having additional instructions stored which, when executed by the computing device, result in operations comprising:
    - receiving a disambiguation utterance from the user clarifying which of the first intent and the second intent should be processed first.
  - 10. The computer-readable storage device of claim 7, wherein when the received utterance comprises the first intent and the second intent, then disambiguating the first intent and the second intent further comprises concatenating prompts from a table of call types.
  - 11. The computer-readable storage device of claim 7, when the received utterance comprises a customer service representative request plus an intent, then disambiguating the received utterance further comprises concatenating prompts from a table of call types.
  - 12. The computer-readable storage device of claim 11, wherein when the received utterance comprises a customer service representative request plus the first intent and the second intent, then disambiguating the first intent and the second intent further comprises concatenating prompts from the table, wherein one of the first intent and the second intent having the lowest confidence score between the first intent and the second intent is played first and one of the first intent and the second intent having a highest confidence score between the first intent and the second intent is played last.

13. A system comprising:
- a processor; and
  
  a computer-readable storage device having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
  
  receiving, via an interactive voice recognition system, a user utterance and converting the user utterance to text;
  
  generating multiple intents based on the text;
  
  establishing, via the interactive voice recognition system, a confidence score for each intent in the multiple intents, wherein the confidence score for each intent is based on how much training data corresponding to the each intent was used to train a spoken language understanding module, where more training data corresponds to a higher confidence;
  
  when only a single intent in the multiple intents has a confidence score above a threshold:
  
  identifying a plurality of call types associated with the multiple intents; and
  
  applying predefined precedence rules to respond to only a single call type in the plurality of call types, the single call type associated with the single intent; and
  
  when multiple intents have confidence scores above the threshold:
  
  identifying a first intent and a second intent based on the confidence scores for the multiple intents, wherein the first intent and the second intent have highest two confidence scores in the multiple intents; and
  
  disambiguating the first intent and the second intent by presenting a disambiguation sub-dialog, via the interactive voice recognition system, wherein a user is offered a choice of which intent to process first, wherein the user is first presented with one of the first intent and the second intent having a lowest confidence score.
- Dependent Claims (14, 15, 16, 17, 18)
  - 14. The system of claim 13, wherein the disambiguation sub-dialog presents one of the first intent and second intent having a highest confidence score between the first intent and the second intent last.
  - 15. The system of claim 13, the computer-readable storage device having additional instructions stored which, when executed by the processor, result in operations comprising:
    - receiving a disambiguation utterance from the user clarifying which of the first intent and the second intent should be processed first.
  - 16. The system of claim 13, wherein when the received utterance comprises the first intent and the second intent, then disambiguating the first intent and the second intent further comprises concatenating prompts from a table of call types.
  - 17. The system of claim 13, wherein when the received utterance comprises a customer service representative request, then disambiguating the received utterance further comprises concatenating prompts from a table of call types.
  - 18. The system of claim 17, wherein when the received utterance comprises a customer service representative request plus the first intent and the second intent, then disambiguating the first intent and the second intent further comprises concatenating prompts from the table, wherein one of the first intent and the second intent having the lowest confidence score between the first intent and the second intent is played first and one of the first intent and the second intent having a highest confidence score between the first intent and the second intent is played last.
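The prompt concatenation recited in the dependent claims can be sketched as a lookup into a table keyed by call type. Everything here is an illustrative assumption: the `PROMPT_TABLE` contents, the `OPENING` wording, the `build_prompt` function, and in particular the placement of the customer service representative fragment at the start, which the claims do not specify. The only behavior drawn from the claims is that the lower-confidence intent is played first and the higher-confidence intent is played last.

```python
# Hypothetical table of call types and their prompt fragments.
PROMPT_TABLE = {
    "billing": "a question about your bill",
    "payment": "making a payment",
    "agent": "I can get you to a representative.",
}
OPENING = "Which would you like to start with:"


def build_prompt(first, second, csr_requested=False):
    """Concatenate fragments for two (call_type, confidence) pairs."""
    # Sort ascending so the lower-confidence intent is played first.
    low, high = sorted([first, second], key=lambda pair: pair[1])
    parts = []
    if csr_requested:
        # Assumed placement: acknowledge the representative request up front.
        parts.append(PROMPT_TABLE["agent"])
    parts.append(OPENING)
    parts.append(PROMPT_TABLE[low[0]])
    parts.append("or")
    parts.append(PROMPT_TABLE[high[0]] + "?")
    return " ".join(parts)


prompt = build_prompt(("billing", 0.9), ("payment", 0.6), csr_requested=True)
```

Because the fragments come from a table, adding a call type only requires a new table entry rather than a newly authored prompt for every pair of intents.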

Current Assignee
Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee
AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors
Stewart, Osamuyimen Thompson
Primary Examiner(s)
Serrou, Abdelali

Application Number

US11/235,742
Time in Patent Office

3,486 Days
Field of Search

704/270, 704/270.1, 704/246, 704/257, 704/275, 704/251, 704/235, 704/247, 704/252
US Class Current

704/251
CPC Class Codes

G06F 40/30   Semantic analysis

G10L 15/08   Speech classification or se...

G10L 15/18   using natural language mode...

G10L 15/1815   Semantic context, e.g. disa...

G10L 15/22   Procedures used during a sp...

G10L 15/26   Speech to text systems G10L...
