Communication system

US 9,767,800 B2
Filed: 09/08/2016
Issued: 09/19/2017
Est. Priority Date: 12/09/2014
Status: Active Grant

1 Assignment

0 Petitions

Abstract

Systems and methods for responding to spoken language input or multi-modal input are described herein. More specifically, one or more user intents are determined or inferred from the spoken language input or multi-modal input to determine one or more user goals via a dialogue belief tracking system. The systems and methods disclosed herein utilize the dialogue belief tracking system to perform actions based on the determined one or more user goals and allow a device to engage in human like conversation with a user over multiple turns of a conversation. Preventing the user from having to explicitly state each intent and desired goal while still receiving the desired goal from the device, improves a user's ability to accomplish tasks, perform commands, and get desired products and/or services. Additionally, the improved response to spoken language inputs from a user improves user interactions with the device.

8 Citations

20 Claims

1. A method for controlling the response to spoken language input, comprising:
- receiving user data from a device;
  
  receiving a first spoken language input from the device;
  
  identifying tags within the first spoken language input;
  
  searching a knowledge base framework based on the tags and the user data, wherein the knowledge base framework is a database that includes a plurality of entities, attributes, and relationships between the entities and the attributes;
  
  identifying entities, attributes, and relationship within the knowledge base framework that match at least one of the tags and the user data;
  
  creating a state graph based on a portion of the knowledge base framework that includes any matched entities, matched attributes, and identified relationships and based on the tags, wherein the state graph is created at least in part by transforming the portion of the knowledge base framework into a probabilistic model graph by replacing the identified relationships with weighted connections and by assigning a confidence indicator to each node of the state graph;
  
  determining at least one goal based on the state graph; and
  
  sending instructions to perform an action to the device based on the at least one goal, the weighted connections, and the confidence indicators.
- Dependent claims (2 to 14):
  - 2. The method of claim 1, further comprising:
    - receiving a second spoken language input;
      
      identifying additional tags within the second spoken language input;
      
      searching the knowledge base framework based on the additional tags;
      
      identifying additional entities, additional attributes, and additional relationships within the knowledge base framework that match at least some of the additional tags;
      
      updating the state graph based on a second portion of the knowledge base framework that includes any matched additional entities, matched additional attributes, and identified additional relationships, and based on the additional tags, wherein the state graph updates the weighted connections and the confidence indicators based on the second portion of the knowledge base framework and the additional tags to form updated weighted connections and updated confidence indicators;
      
      determining at least one additional goal based on an updated state graph; and
      
      sending additional instructions to perform another action to the device based on the at least one additional goal, the updated weighted connections, and the updated confidence indicators.
  - 3. The method of claim 1, further comprising comparing the confidence indicators and the weighted connections to a threshold.
  - 4. The method of claim 3, further comprising:
    - wherein the action is a request for user feedback about the at least one goal when the confidence indicators and the weighted connections do not meet the threshold;
      
      receiving user feedback about the at least one goal from the device; and
      
      identifying feedback tags within the user feedback in view of the tags identified for the first spoken language input;
      
      searching the knowledge base framework based on the feedback tags;
      
      identifying feedback entities, feedback attributes, and feedback relationships within the knowledge base framework that match at least some of the feedback tags;
      
      updating the state graph based on a second portion of the knowledge base framework that includes any matched feedback entities, matched feedback attributes, and identified feedback relationships, and based on the feedback tags, wherein the state graph updates the weighted connections and the confidence indicators based on the second portion of the knowledge base framework and the feedback tags to form updated weighted connections and updated confidence indicators;
      
      determining at least one additional goal based on an updated state graph; and
      
      sending additional instructions to perform an additional action to the device based on the at least one additional goal, the updated weighted connections, and the updated confidence indicators.
  - 5. The method of claim 3, wherein the action is providing the at least one goal to a user when the confidence indicator for the at least one goal meets the threshold.
  - 6. The method of claim 5, wherein the action is providing information to the user via a spoken language output.
  - 7. The method of claim 1, wherein the determining the at least one goal based on the state graph comprises:
    - classifying patterns of the confidence indicators utilizing machine learning models.
  - 8. The method of claim 1, wherein the user data includes a location of the device and user preferences.
  - 9. The method of claim 1, wherein at least one of the tags include a user intent and a contradictory tag.
  - 10. The method of claim 1, wherein the device is at least one of:
    - a mobile telephone;
      
      a smart phone;
      
      a tablet;
      
      a smart watch;
      
      a wearable computer;
      
      a personal computer;
      
      a desktop computer;
      
      a gaming system; and
      
      a laptop computer.
  - 11. The method of claim 1, wherein the portion includes two separate sections of the knowledge base framework and the state graph includes two separate probabilistic model graphs.
  - 12. The method of claim 1, wherein the state graph includes evidence nodes and edge nodes.
  - 13. The method of claim 1, further comprising:
    - ranking nodes of the state graph based at least on the confidence indicators, wherein the determining the at least one goal is based on the ranking of the nodes.
  - 14. The method of claim 1, wherein the confidence indicator considers a confidence level of a related tag and known user preferences.
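
The steps of claim 1, together with the threshold test of claims 3 to 5, can be illustrated with a minimal Python sketch. Everything in it is hypothetical and not taken from the patent: the toy `KNOWLEDGE_BASE` triples, the `RELATION_WEIGHT` table, the noisy-OR rule used to derive an entity's confidence, the treatment of user data as fully trusted evidence, and the 0.7 threshold. For brevity it matches tags against attributes only and compares only the goal's confidence indicator to the threshold.

```python
from dataclasses import dataclass, field
from math import prod

# Hypothetical knowledge base: (entity, relationship, attribute) triples.
KNOWLEDGE_BASE = [
    ("Luigi's", "serves", "italian"),
    ("Luigi's", "located_in", "downtown"),
    ("Taco Hut", "serves", "mexican"),
    ("Taco Hut", "located_in", "uptown"),
]

# Assumed weight that replaces each relationship type in the state graph.
RELATION_WEIGHT = {"serves": 0.9, "located_in": 0.6}


@dataclass
class StateGraph:
    confidence: dict = field(default_factory=dict)  # node -> confidence indicator
    weights: dict = field(default_factory=dict)     # (attribute, entity) -> weight


def build_state_graph(tags, user_data):
    """tags maps a tag to its recognition confidence; user_data is a set of values."""
    # User data is treated as fully trusted evidence (an assumption).
    evidence = {**{value: 1.0 for value in user_data}, **tags}
    graph = StateGraph()
    for entity, relation, attribute in KNOWLEDGE_BASE:
        if attribute in evidence:
            graph.confidence[attribute] = evidence[attribute]
            # The relationship is replaced by a weighted connection.
            graph.weights[(attribute, entity)] = RELATION_WEIGHT[relation]
    # Entity confidence: noisy-OR over its matched attributes (an assumption).
    entities = {entity for _, entity in graph.weights}
    for entity in entities:
        misses = [1.0 - w * graph.confidence[a]
                  for (a, e), w in graph.weights.items() if e == entity]
        graph.confidence[entity] = 1.0 - prod(misses)
    return graph, entities


def choose_action(graph, entities, threshold=0.7):
    """Pick the most confident entity as the goal, or ask for feedback."""
    if not entities:
        return ("request_feedback", None)
    goal = max(entities, key=lambda e: graph.confidence[e])
    if graph.confidence[goal] >= threshold:
        return ("perform", goal)
    return ("request_feedback", goal)


graph, entities = build_state_graph({"italian": 0.8}, {"downtown"})
print(choose_action(graph, entities))
```

In this sketch a confidently recognized tag combined with matching user data clears the threshold, while a weakly recognized tag on its own falls back to a feedback request.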

15. A system comprising:
- a computing device including a processing unit and a memory, the processing unit implementing a spoken language system and a dialogue state belief tracking system, the spoken language system is operable to:
  
  receive a spoken language input, identify tags within the spoken language input, and communicate with the dialogue state belief tracking system; and
  
  wherein the dialogue state belief tracking system is operable to:
  
  communicate with the spoken language system, search a knowledge base framework based on the tags identified by the spoken language system;
  
  identify entities, attributes, and relationships within the knowledge base framework that match at least some of the tags;
  
  create a state graph based on a portion of the knowledge base framework that includes any matched entities, matched attributes, and identified relationships, wherein the state graph is formed by transforming the portion into a probabilistic model graph, and wherein the state graph includes a confidence indicator for each node of the state graph;
  
  rank nodes of the state graph;
  
  determine at least one goal based on the rank of the nodes of the state graph; and
  
  send instructions to perform an action based on the at least one goal.
- Dependent claims (16 to 19):
  - 16. The system of claim 15, wherein the action is to perform the at least one goal.
  - 17. The system of claim 15, wherein the action is to request user feedback about the at least one goal.
  - 18. The system of claim 15, wherein the dialogue state belief tracking system is further operable to:
    - receive user data, search the knowledge base framework based on the user data;
      
      identify at least one additional entity, additional attribute, and additional relationship that match the user data;
      
      identify a second portion of the knowledge base framework that includes any matched additional entity, matched additional attribute, and matched additional relationship to the user data; and
      
      update the state graph based on the second portion of the knowledge base framework, wherein the second portion is clamped on to the state graph by aligning common nodes.
  - 19. The system of claim 18, wherein the user data is a location of a user device.
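
Claim 15 ranks the nodes of the state graph, and claim 18 updates the graph by clamping a second portion on to it at the common nodes. A rough Python sketch of both operations follows. The dictionary layout, the function names `clamp` and `rank_nodes`, the sample values, and the rule of keeping the higher confidence for a shared node are all assumptions made for illustration; the claims do not specify them.

```python
def clamp(state_graph, portion):
    """Merge `portion` into `state_graph`, aligning nodes that share an identifier."""
    nodes = dict(state_graph["nodes"])
    for node, confidence in portion["nodes"].items():
        # A common node is kept once; taking the higher confidence is an assumption.
        nodes[node] = max(confidence, nodes.get(node, 0.0))
    edges = {**state_graph["edges"], **portion["edges"]}
    return {"nodes": nodes, "edges": edges}


def rank_nodes(graph):
    """Return node names ordered by descending confidence (ties broken by name)."""
    return sorted(graph["nodes"], key=lambda n: (-graph["nodes"][n], n))


# Hypothetical state graph built from the tag "italian".
state_graph = {
    "nodes": {"italian": 0.8, "Luigi's": 0.72},
    "edges": {("italian", "Luigi's"): 0.9},
}
# Hypothetical second portion found from user data (device location).
location_portion = {
    "nodes": {"downtown": 1.0, "Luigi's": 0.6},
    "edges": {("downtown", "Luigi's"): 0.6},
}

updated = clamp(state_graph, location_portion)
print(rank_nodes(updated))
```

Because the two graphs share the node `Luigi's`, the merged graph holds three nodes and two weighted connections rather than four nodes.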

20. A computer-readable storage device including computer-executable instructions stored thereon which, when executed by a computing system in a distributed network, cause the computing system to perform a method comprising:
- receiving user data from a device;
  
  receiving a second spoken language input from the device;
  
  identifying tags within the second spoken language input in view of previously determined tags from a first spoken language input in a conversation between a user and the device;
  
  searching a knowledge base framework based on the tags and the user data;
  
  identifying entities, attributes, and relationship within the knowledge base framework that match at least one of the tags and the user data;
  
  creating an updated state graph by aligning a portion of the knowledge base framework that includes any matched entities, matched attributes, and identified relationships with a stored state graph;
  
  determining at least one user goal based on the updated state graph; and
  
  sending instructions to perform an action to the device based on the at least one user goal and a confidence indicator for the at least one user goal.
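
Claim 20 identifies tags in a later input "in view of previously determined tags" from the same conversation. One way to picture that step is the Python sketch below, which carries tags across turns. The slot names, the `merge_turn` helper, the rule that a repeated value reinforces confidence, and the rule that a contradictory value replaces the earlier one are all assumptions for illustration and are not drawn from the patent text.

```python
def merge_turn(stored, new):
    """Combine tags from a new turn with stored tags.

    Both arguments map a slot name to a (value, confidence) pair.
    """
    updated = dict(stored)
    for slot, (value, confidence) in new.items():
        if slot in stored and stored[slot][0] == value:
            # Same value heard again: reinforce the confidence (assumed rule).
            previous = stored[slot][1]
            updated[slot] = (value, 1.0 - (1.0 - previous) * (1.0 - confidence))
        else:
            # New slot, or a contradictory value that overrides the old one.
            updated[slot] = (value, confidence)
    return updated


# Turn 1: the user mentions a cuisine.
turn_1 = {"cuisine": ("italian", 0.6)}
# Turn 2: the cuisine is repeated and an area is added.
turn_2 = merge_turn(turn_1, {"cuisine": ("italian", 0.5), "area": ("downtown", 0.9)})
# Turn 3: the user changes the cuisine; the area tag is carried over unchanged.
turn_3 = merge_turn(turn_2, {"cuisine": ("mexican", 0.7)})
print(turn_3)
```

The merged tags would then drive the knowledge base search and the alignment with the stored state graph described in the claim.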

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Inventors: Crook, Paul; Sarikaya, Ruhi
Primary Examiner(s): Hang, Vu B
Application Number: US15/259,904
Publication Number: US 20160379637A1
Time in Patent Office: 376 Days

CPC Class Codes

G06F 40/30   Semantic analysis

G10L 15/02   Feature extraction for spee...

G10L 15/065   Adaptation

G10L 15/08   Speech classification or se...

G10L 15/10   using distance or distortio...

G10L 15/12   using dynamic programming t...

G10L 15/1815   Semantic context, e.g. disa...

G10L 15/183   using context dependencies,...

G10L 15/22   Procedures used during a sp...

G10L 15/26   Speech to text systems G10L...

G10L 15/30   Distributed recognition, e....

G10L 2015/0635   updating or merging of old ...

G10L 2015/0638   Interactive procedures

G10L 2015/088   Word spotting

G10L 2015/223   Execution procedure of a sp...

G10L 2015/225   Feedback of the input speech
