Simultaneous dialogue state management using frame tracking

US 10,431,202 B2
Filed: 06/20/2017
Issued: 10/01/2019
Est. Priority Date: 10/21/2016
Status: Active Grant

Abstract

Examples of the present disclosure describe systems and methods relating to conversation state management using frame tracking. In an example, a frame may represent one or more constraints (e.g., parameters, variables, or other information) received from or generated as a result of interactions with a user. Consequently, each frame may represent one or more states of an ongoing conversation. When the user provides new or different information, a new frame may be created to represent the now-current state of the conversation. The previous frame may be retained for later access by what is referred to herein as a “dialog agent,” which is the portion of the system that can search and use previous state-related information. When an utterance is received, a frame to which the utterance relates may be identified. Thus, the dialog agent may track multiple states simultaneously, thereby enabling conversation features that were not previously possible.
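
The frame idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `Frame`, `FrameStore`, and `update`, the example slots, and the choice to copy the prior constraints into a new frame are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One conversation state: a set of slot constraints (hypothetical layout)."""
    frame_id: int
    constraints: dict = field(default_factory=dict)


class FrameStore:
    """Retains every frame so earlier states stay available to the dialog agent."""

    def __init__(self):
        self.frames = [Frame(frame_id=0)]
        self.current_id = 0

    @property
    def current(self):
        return self.frames[self.current_id]

    def update(self, slot, value):
        """Fill an empty slot in place; a differing value opens a new frame."""
        existing = self.current.constraints.get(slot)
        if existing is None or existing == value:
            self.current.constraints[slot] = value
            return self.current
        # The user changed a constraint: copy the state (an assumption of this
        # sketch) and keep the old frame for later access.
        new_frame = Frame(len(self.frames), {**self.current.constraints, slot: value})
        self.frames.append(new_frame)
        self.current_id = new_frame.frame_id
        return new_frame


store = FrameStore()
store.update("destination", "Paris")
store.update("budget", 1000)
store.update("destination", "Rome")  # differs from "Paris", so a new frame is created
```

After these three updates the store holds two frames: the earlier Paris state is retained unchanged, and the current frame carries the Rome destination with the same budget.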

29 Citations

20 Claims

1. A system comprising:
- at least one processor; and
  
  a memory storing instructions that when executed by the at least one processor perform a set of operations comprising:
  
  receiving an input utterance of a current frame of a conversation;
  
  generating, using natural language understanding, a predicted value and a predicted act for the input utterance;
  
  determining, using a first model trained to predict slot types, whether the predicted value is a new value for a slot having a pre-existing value in the current frame;
  
  when it is determined that the predicted value is a new value having a pre-existing value in the current frame, creating a new frame of the conversation;
  
  determining, using a second model trained to predict dialogue acts, whether the predicted act relates to a previous frame of the conversation;
  
  when it is determined that the predicted act relates to a previous frame, generating an association between the current frame and the previous frame of the conversation;
  
  determining whether the predicted act switches to the previous frame of the conversation; and
  
  when it is determined that the predicted act switches to the previous frame of the conversation, switching to the previous frame of the conversation;
  
  wherein at least two frames, selected from the group consisting of the new frame, the current frame, and the previous frame, are retained in memory, thereby tracking multiple states of the conversation simultaneously.
- Dependent claims (2, 3, 4, 5, 6, 7):
- - 2. The system of claim 1, wherein determining whether the predicted act relates to a previous frame of the conversation comprises determining that the predicted act relates to a plurality of previous frames of the conversation, and wherein generating the association comprises generating an association with each of the plurality of previous frames.
  - 3. The system of claim 1, wherein the first model and the second model are each a subpart of a third model.
  - 4. The system of claim 1, wherein the input utterance is received as part of an oral dialogue.
  - 5. The system of claim 1, wherein the set of operations further comprises:
    - when it is determined that the predicted value is not a new value, continuing the conversation based on the current frame of the conversation.
  - 6. The system of claim 1, wherein the set of operations further comprises:
    - when it is determined that the predicted act does not switch to the previous frame of the conversation, continuing the conversation based on the current frame of the conversation.
  - 7. The system of claim 1, wherein the input utterance is part of a text-based dialogue.
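
The operations recited in claim 1 can be sketched as a small control loop. This is a hedged illustration only: `toy_nlu`, `FrameTracker`, the `inform` and `switch_frame` act names, and the `ref=<id>` syntax are hypothetical, and simple rules stand in for the two trained models named in the claim.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    frame_id: int
    slots: dict = field(default_factory=dict)
    related: set = field(default_factory=set)  # ids of associated frames


def toy_nlu(utterance):
    """Hypothetical NLU stub: parses 'inform destination=Rome' or 'switch_frame ref=0'."""
    act, _, arg = utterance.partition(" ")
    key, _, value = arg.partition("=")
    return act, key, value


class FrameTracker:
    def __init__(self):
        self.frames = [Frame(0)]
        self.current = self.frames[0]

    def is_new_value(self, slot, value):
        """Rule-based stand-in for the first model: is the slot already filled differently?"""
        old = self.current.slots.get(slot)
        return old is not None and old != value

    def referenced_frame(self, key, value):
        """Rule-based stand-in for the second model: does the act point at a previous frame?"""
        if key == "ref" and value.isdigit() and int(value) < len(self.frames):
            frame = self.frames[int(value)]
            return frame if frame is not self.current else None
        return None

    def step(self, utterance):
        act, key, value = toy_nlu(utterance)
        if act == "inform":
            if self.is_new_value(key, value):
                # New value for an already-filled slot: create a new frame.
                new = Frame(len(self.frames), {**self.current.slots, key: value})
                self.frames.append(new)
                self.current = new
            else:
                self.current.slots[key] = value
        else:
            previous = self.referenced_frame(key, value)
            if previous is not None:
                # Associate the current frame with the previous frame.
                self.current.related.add(previous.frame_id)
                previous.related.add(self.current.frame_id)
                if act == "switch_frame":
                    self.current = previous
        return self.current


tracker = FrameTracker()
for text in ["inform destination=Paris", "inform destination=Rome", "switch_frame ref=0"]:
    tracker.step(text)
```

In this run the second utterance creates frame 1, and the third associates frame 1 with frame 0 and switches back to frame 0. Both frames stay in `tracker.frames`, which mirrors the claim's requirement that at least two frames are retained in memory.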

8. A method for dialogue state management, comprising:
- receiving, from a computing device, an input utterance of a current frame of a conversation;
  
  generating, using natural language understanding, a predicted value and a predicted act for the input utterance;
  
  determining, using a first model trained to predict slot types, whether the predicted value is a new value for a slot having a pre-existing value in the current frame;
  
  based on determining that the predicted value is a new value, creating a new frame of the conversation;
  
  determining, using a second model trained to predict dialogue acts, whether the predicted act relates to a previous frame of the conversation;
  
  based on determining that the predicted act relates to a previous frame, generating an association between the current frame and the previous frame of the conversation;
  
  determining whether the predicted act switches to the previous frame of the conversation;
  
  based on determining that the predicted act switches to the previous frame of the conversation, switching to the previous frame of the conversation;
  
  retaining at least two frames in memory, selected from the group consisting of the new frame, the current frame, and the previous frame, thereby tracking multiple states of the conversation simultaneously;
  
  generating, based on the predicted value and the predicted act, a response to the received input utterance; and
  
  providing the generated response to the computing device.
- Dependent claims (9, 10, 11, 12, 13):
- - 9. The method of claim 8, wherein determining whether the predicted act relates to a previous frame of the conversation comprises determining that the predicted act relates to a plurality of previous frames of the conversation, and wherein generating the association comprises generating an association with each of the plurality of previous frames.
  - 10. The method of claim 8, further comprising:
    - based on determining that the predicted value is not a new value, continuing the conversation based on the current frame of the conversation.
  - 11. The method of claim 8, further comprising:
    - based on determining that the predicted act does not switch to the previous frame of the conversation, continuing the conversation based on the current frame of the conversation.
  - 12. The method of claim 8, wherein the first model and the second model are subparts of the same model.
  - 13. The method of claim 8, wherein the input utterance is part of a text-based dialogue.
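
Claim 8 adds two steps to the frame-tracking operations: generating a response from the predicted value and predicted act, and providing it to the computing device. A minimal sketch of those two steps follows. The template wording, the act names, and the `send` callback are assumptions made for illustration, not details taken from the patent.

```python
def generate_response(act, slot, value, frame_slots):
    """Template-based reply built from the predicted act and value (hypothetical templates)."""
    if act == "inform":
        return f"Okay, {slot} is set to {value}."
    if act == "switch_frame":
        summary = ", ".join(f"{k}={v}" for k, v in sorted(frame_slots.items()))
        return f"Back to the earlier option ({summary})."
    return "Sorry, could you rephrase that?"


def handle_request(prediction, frame_slots, send):
    """Generate a response, then provide it to the device through the `send` callback."""
    act, slot, value = prediction
    reply = generate_response(act, slot, value, frame_slots)
    send(reply)
    return reply


# A list stands in for the channel back to the computing device.
outbox = []
handle_request(("inform", "budget", "1000"), {"budget": "1000"}, outbox.append)
handle_request(
    ("switch_frame", "ref", "0"),
    {"destination": "Paris", "budget": "1000"},
    outbox.append,
)
```

Passing `send` as a callback keeps the sketch independent of any particular transport, so the same function could reply to a text-based or an oral dialogue.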

14. A method for dialogue state management, comprising:
- receiving an input utterance of a current frame of a conversation;
  
  generating, using natural language understanding, a predicted value and a predicted act for the input utterance;
  
  determining, using a first model trained to predict slot types, whether the predicted value is a new value for a slot having a pre-existing value in the current frame;
  
  based on determining that the predicted value is a new value, creating a new frame of the conversation;
  
  determining, using a second model trained to predict dialogue acts, whether the predicted act relates to a previous frame of the conversation;
  
  based on determining that the predicted act relates to a previous frame, generating an association between the current frame and the previous frame of the conversation;
  
  determining whether the predicted act switches to the previous frame of the conversation; and
  
  when it is determined that the predicted act switches to the previous frame of the conversation, switching to the previous frame of the conversation, wherein at least two frames, selected from the group consisting of the new frame, the current frame, and the previous frame, are retained in memory, thereby tracking multiple states of the conversation simultaneously.
- Dependent claims (15, 16, 17, 18, 19, 20):
- - 15. The method of claim 14, wherein determining whether the predicted act relates to a previous frame of the conversation comprises determining that the predicted act relates to a plurality of previous frames of the conversation, and wherein generating the association comprises generating an association with each of the plurality of previous frames.
  - 16. The method of claim 14, wherein the first model and the second model are each part of the same model.
  - 17. The method of claim 14, wherein the input utterance is received as part of an oral dialogue.
  - 18. The method of claim 14, further comprising:
    - based on determining that the predicted value is not a new value, continuing the conversation based on the current frame of the conversation.
  - 19. The method of claim 14, further comprising:
    - based on determining that the predicted act does not switch to the previous frame of the conversation, continuing the conversation based on the current frame of the conversation.
  - 20. The method of claim 14, wherein the input utterance is part of a text-based dialogue.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Inventors: Harris, Justin; El Asri, Layla; Fine, Emery; Mehrotra, Rahul; Schulz, Hannes; Sharma, Shikhar; Zumer, Jeremie
Primary Examiner(s): Thomas-Homescu, Anne L
Application Number: US15/628,421
Publication Number: US 20180226066A1
Time in Patent Office: 833 Days
CPC Class Codes

G06F 16/90332   Natural language query form...

G06F 40/279   Recognition of textual enti...

G06F 40/35   Discourse or dialogue repre...

G06N 20/00   Machine learning

G06N 5/027   Frames

G10L 15/02   Feature extraction for spee...
