Communicating context across different components of multi-modal dialog applications

US 9,361,884 B2
Filed: 03/11/2013
Issued: 06/07/2016
Est. Priority Date: 03/11/2013
Status: Active Grant

2 Assignments

0 Petitions

Abstract

A human-machine dialog system is described which has multiple computer-implemented dialog components. A user client delivers output prompts to a human user and receives dialog inputs including speech inputs from the human user. An automatic speech recognition (ASR) engine processes the speech inputs to determine corresponding sequences of representative text words. A natural language understanding (NLU) engine processes the text words to determine corresponding semantic interpretations. A dialog manager (DM) generates the output prompts and responds to the semantic interpretations so as to manage a dialog process with the human user. The dialog components share context information with each other using a common context sharing mechanism such that the operation of each dialog component reflects available context information.
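
The common context sharing mechanism described in the abstract can be illustrated with a minimal sketch. It assumes a single in-memory store that the dialog manager writes to and the NLU engine reads from. The class names, the `expectation` key, and the slot name `departure_time` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SharedContext:
    """Hypothetical store that all dialog components read from and write to."""
    values: Dict[str, str] = field(default_factory=dict)

    def update(self, context_type: str, context_value: str) -> None:
        self.values[context_type] = context_value

    def get(self, context_type: str, default: str) -> str:
        return self.values.get(context_type, default)


class DialogManager:
    def __init__(self, context: SharedContext) -> None:
        self.context = context

    def prompt(self, expectation: str, text: str) -> str:
        # Publish what kind of answer the prompt expects before emitting it.
        self.context.update("expectation", expectation)
        return text


class NLUEngine:
    def __init__(self, context: SharedContext) -> None:
        self.context = context

    def interpret(self, words: List[str]) -> Dict[str, str]:
        # A bare answer such as "nine" is ambiguous on its own; the shared
        # expectation decides which slot it fills.
        slot = self.context.get("expectation", "unknown")
        return {slot: " ".join(words)}


context = SharedContext()
dm = DialogManager(context)
nlu = NLUEngine(context)

before = nlu.interpret(["nine"])  # no context published yet
dm.prompt("departure_time", "What time do you want to leave?")
after = nlu.interpret(["nine"])   # same words, context now available
```

In this sketch the same recognized words map to a different semantic interpretation once the dialog manager has published its expectation, which is the behavior the abstract attributes to shared context.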

16 Citations

20 Claims

1. A method comprising:
- receiving, by a computing system, data generated based on spoken user responses to prompts generated by the computing system that are associated with a dialogue, the data comprising a portion corresponding to one or more of the spoken user responses spoken before a subsequently spoken response of the spoken user responses;
  
  generating, by at least one processor of the computing system, a list of natural language understanding (NLU)-ranked semantic interpretations for the subsequently spoken response;
  
  determining, by the at least one processor and based on the portion corresponding to the one or more of the spoken user responses spoken before the subsequently spoken response, a plurality of key-value pairs corresponding to different possible resolutions for an unresolved anaphora in the subsequently spoken response; and
  
  selecting, by the at least one processor, from amongst the plurality of key-value pairs, and based on a context of the dialogue determined from the one or more of the spoken user responses spoken before the subsequently spoken response, a key-value pair corresponding to a semantic interpretation in the list that resolves the unresolved anaphora in the subsequently spoken response.
- Dependent claims (2, 3, 4, 5, 6, 18):
  - 2. The method of claim 1, comprising determining, by the at least one processor, that the key-value pair corresponds to the semantic interpretation in the list that resolves the unresolved anaphora based on a determination that the key-value pair comprises:
    - a context type corresponding to the context of the dialogue; and
      
      a context value indicating that the semantic interpretation is context based.
  - 3. The method of claim 1, wherein determining the plurality of key-value pairs comprises determining, for each element of a plurality of elements of the list, a key-value pair comprising a context type and a context value.
  - 4. The method of claim 3, wherein determining the key-value pair comprises determining at least one of a state of the dialogue, an expectation of the dialogue, a focus of the dialogue, or a selection based on the one or more of the spoken user responses.
  - 5. The method of claim 3, comprising determining, by the at least one processor and for each key-value pair of the plurality of key-value pairs, a confidence score indicating a level of confidence that a possible resolution of the different possible resolutions corresponding to the key-value pair resolves the unresolved anaphora.
  - 6. The method of claim 5, comprising re-ranking, by the at least one processor, for each key-value pair of the plurality of key-value pairs, and based on its confidence score, an element of the plurality of elements corresponding to the key-value pair within the list.
  - 18. The method of claim 1, comprising determining, by the at least one processor, that the key-value pair corresponds to the semantic interpretation in the list that resolves the unresolved anaphora based on relative similarity of the semantic interpretations determined by the at least one processor.
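
The selection step of claims 1 and 2 can be sketched as follows. This is one possible reading, not the patented implementation: each entry of the NLU-ranked list carries a key-value pair (a context type and a context value), and the first entry whose type matches the current dialogue context and whose value marks it as context based is selected. The field names, the `"context_based"` marker string, and the example utterances are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Interpretation:
    """One entry of the NLU-ranked list (field names are illustrative)."""
    text: str
    context_type: str   # key of the key-value pair, e.g. "focus"
    context_value: str  # value of the pair, e.g. "context_based"


def select_interpretation(
    nbest: List[Interpretation], dialog_context_type: str
) -> Optional[Interpretation]:
    """Return the highest-ranked entry whose key-value pair matches the
    dialogue context and is flagged as context based, or None."""
    for interp in nbest:  # nbest is assumed to be in NLU rank order
        if (interp.context_type == dialog_context_type
                and interp.context_value == "context_based"):
            return interp
    return None


# Earlier turn: "Find flights to Boston". Later turn: "Book it".
# Each entry is a different possible resolution of the anaphor "it".
nbest = [
    Interpretation("book(hotel)", "selection", "context_based"),
    Interpretation("book(flight_to_boston)", "focus", "context_based"),
    Interpretation("book(unresolved)", "focus", "standalone"),
]

chosen = select_interpretation(nbest, "focus")
```

Here the dialogue context derived from the earlier turn is assumed to be a focus on the Boston flight, so the second entry is chosen even though the NLU engine ranked another reading first.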

7. A system comprising:
- at least one processor; and
  
  a memory comprising instructions that when executed by the at least one processor cause the system to:
  
  receive data generated based on spoken user responses to prompts generated by the system that are associated with a dialogue, the data comprising a portion corresponding to one or more of the spoken user responses spoken before a subsequently spoken response of the spoken user responses;
  
  generate a list of natural language understanding (NLU)-ranked semantic interpretations for the subsequently spoken response;
  
  determine, based on the portion corresponding to the one or more of the spoken user responses spoken before the subsequently spoken response, a plurality of key-value pairs corresponding to different possible resolutions for an unresolved anaphora in the subsequently spoken response; and
  
  select, from amongst the plurality of key-value pairs and based on a context of the dialogue determined from the one or more of the spoken user responses spoken before the subsequently spoken response, a key-value pair corresponding to a semantic interpretation in the list that resolves the unresolved anaphora in the subsequently spoken response.
- Dependent claims (8, 9, 10, 11, 12, 19):
  - 8. The system of claim 7, wherein the instructions, when executed by the at least one processor, cause the system to determine that the key-value pair corresponds to the semantic interpretation in the list that resolves the unresolved anaphora based on a determination that the key-value pair comprises:
    - a context type corresponding to the context of the dialogue; and
      
      a context value indicating that the semantic interpretation is context based.
  - 9. The system of claim 7, wherein the instructions, when executed by the at least one processor, cause the system to determine, for each element of a plurality of elements of the list, a key-value pair comprising a context type and a context value.
  - 10. The system of claim 9, wherein the instructions, when executed by the at least one processor, cause the system to determine the key-value pair based on at least one of a state of the dialogue, an expectation of the dialogue, a focus of the dialogue, or a selection based on the one or more of the spoken user responses.
  - 11. The system of claim 9, wherein the instructions, when executed by the at least one processor, cause the system to determine, for each key-value pair of the plurality of key-value pairs, a confidence score indicating a level of confidence that a possible resolution of the different possible resolutions corresponding to the key-value pair resolves the unresolved anaphora.
  - 12. The system of claim 11, wherein the instructions, when executed by the at least one processor, cause the system to re-rank, for each key-value pair of the plurality of key-value pairs and based on its confidence score, an element of the plurality of elements corresponding to the key-value pair within the list.
  - 19. The system of claim 7, wherein the instructions, when executed by the at least one processor, cause the system to determine that the key-value pair corresponds to the semantic interpretation in the list that resolves the unresolved anaphora based on relative similarity of the semantic interpretations determined by the at least one processor.
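
The confidence scoring and re-ranking of claims 11 and 12 can be sketched in a few lines. The claims do not say how the confidence score is combined with the NLU ranking, so the linear interpolation below, the `weight` parameter, and the example scores are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Each element: (interpretation, NLU score, context confidence score).
Element = Tuple[str, float, float]


def rerank(nbest: List[Element], weight: float = 0.5) -> List[str]:
    """Re-rank list elements by blending the NLU score with the confidence
    that the element's key-value pair resolves the anaphora.

    weight=0.0 keeps the NLU order; weight=1.0 uses confidence only.
    """
    ordered = sorted(
        nbest,
        key=lambda e: (1.0 - weight) * e[1] + weight * e[2],
        reverse=True,
    )
    return [e[0] for e in ordered]


# "Call him" after a turn that mentioned John.
nbest = [
    ("call(mom)", 0.6, 0.2),
    ("call(john)", 0.5, 0.9),
    ("call(office)", 0.3, 0.4),
]

reranked = rerank(nbest)
```

With equal weighting, the context-supported reading `call(john)` moves ahead of the reading the NLU engine originally ranked first.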

13. One or more non-transitory computer-readable media comprising instructions that when executed by at least one processor of a computing system cause the computing system to:
- receive data generated based on spoken user responses to prompts generated by the computing system that are associated with a dialogue, the data comprising a portion corresponding to one or more of the spoken user responses spoken before a subsequently spoken response of the spoken user responses;
  
  generate a list of natural language understanding (NLU)-ranked semantic interpretations for the subsequently spoken response;
  
  determine, based on the portion corresponding to the one or more of the spoken user responses spoken before the subsequently spoken response, a plurality of key-value pairs corresponding to different possible resolutions for an unresolved anaphora in the subsequently spoken response; and
  
  select, from amongst the plurality of key-value pairs and based on a context of the dialogue determined from the one or more of the spoken user responses spoken before the subsequently spoken response, a key-value pair corresponding to a semantic interpretation in the list that resolves the unresolved anaphora in the subsequently spoken response.
- Dependent claims (14, 15, 16, 17, 20):
  - 14. The one or more non-transitory computer-readable media of claim 13, wherein the instructions, when executed by the at least one processor, cause the computing system to determine that the key-value pair corresponds to the semantic interpretation in the list that resolves the unresolved anaphora based on a determination that the key-value pair comprises:
    - a context type corresponding to the context of the dialogue; and
      
      a context value indicating that the semantic interpretation is context based.
  - 15. The one or more non-transitory computer-readable media of claim 13, wherein the instructions, when executed by the at least one processor, cause the computing system to determine, for each element of a plurality of elements of the list, a key-value pair comprising a context type and a context value.
  - 16. The one or more non-transitory computer-readable media of claim 15, wherein the instructions, when executed by the at least one processor, cause the computing system to determine the key-value pair based on at least one of a state of the dialogue, an expectation of the dialogue, a focus of the dialogue, or a selection based on the one or more of the spoken user responses.
  - 17. The one or more non-transitory computer-readable media of claim 15, wherein the instructions, when executed by the at least one processor, cause the computing system to, for each key-value pair of the plurality of key-value pairs:
    - determine a confidence score indicating a level of confidence that a possible resolution of the different possible resolutions corresponding to the key-value pair resolves the unresolved anaphora; and
      
      re-rank, based on the confidence score, an element of the plurality of elements corresponding to the key-value pair within the list.
  - 20. The one or more non-transitory computer-readable media of claim 13, wherein the instructions, when executed by the at least one processor, cause the computing system to determine that the key-value pair corresponds to the semantic interpretation in the list that resolves the unresolved anaphora based on relative similarity of the semantic interpretations determined by the at least one processor.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Gandrabur, Simona; Buist, Eric; Hebert, Matthieu
Primary Examiner(s): Hang, Vu B
Application Number: US13/793,822
Publication Number: US 20140257793A1
Time in Patent Office: 1,184 Days
Field of Search: 704/2, 704/3, 704/4, 704/9, 704/10, 704/231, 704/232
US Class Current: 1/1
CPC Class Codes:
G06F 40/237   Lexical tools
G06F 40/253   Grammatical analysis; Style...
G06F 40/30   Semantic analysis
G06N 5/04   Inference or reasoning models
G10L 15/22   Procedures used during a sp...
G10L 2015/228   of application context
