Contextual language understanding for multi-turn language tasks

US 9,690,776 B2
Filed: 12/01/2014
Issued: 06/27/2017
Est. Priority Date: 12/01/2014
Status: Active Grant

2 Assignments

0 Petitions

Abstract

Methods and systems are provided for contextual language understanding. A natural language expression may be received at a single-turn model and a multi-turn model for determining an intent of a user. For example, the single-turn model may determine a first prediction of at least one of a domain classification, intent classification, and slot type of the natural language expression. The multi-turn model may determine a second prediction of at least one of a domain classification, intent classification, and slot type of the natural language expression. The first prediction and the second prediction may be combined to produce a final prediction relative to the intent of the natural language expression. An action may be performed based on the final prediction of the natural language expression.
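The combination step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the two predictions are combined, so the linear interpolation below, the `Prediction` type, the `multi_weight` parameter, and the intent labels are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Prediction:
    """Probability-like scores for candidate intent labels (hypothetical type)."""
    scores: Dict[str, float]


def combine(single: Prediction, multi: Prediction, multi_weight: float = 0.5) -> Prediction:
    """Linearly interpolate the two models' scores.

    Interpolation is an assumed combination rule; the source only says the
    predictions "may be combined".
    """
    labels = set(single.scores) | set(multi.scores)
    combined = {
        label: (1.0 - multi_weight) * single.scores.get(label, 0.0)
        + multi_weight * multi.scores.get(label, 0.0)
        for label in labels
    }
    return Prediction(combined)


def final_label(prediction: Prediction) -> str:
    """Pick the label with the highest combined score."""
    return max(prediction.scores, key=prediction.scores.get)


# Hypothetical scores for a follow-up such as "how about tomorrow" that
# arrives after a weather question. In isolation the expression is ambiguous;
# with context, the multi-turn model favors the weather intent.
single_turn = Prediction({"calendar.check": 0.5, "weather.check": 0.3})
multi_turn = Prediction({"weather.check": 0.8, "calendar.check": 0.1})

final = combine(single_turn, multi_turn)
print(final_label(final))
```

With equal weights, the context-aware score outweighs the isolated one, so the final prediction follows the multi-turn model. A real system would tune or learn the weighting rather than fix it at 0.5.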

34 Citations

20 Claims

1. A system comprising:
- at least one processor; and
  
  a memory encoding computer executable instructions which, when executed by at least one processor, perform a method for contextual language understanding, comprising:
  
  receiving at least a first natural language expression and a second natural language expression, wherein each of the first natural language expression and the second natural language expression include at least one of words, terms, and phrases;
  
  determining, using a single-turn model, a first weighted prediction of at least one of a domain classification, intent classification, and slot type of the first natural language expression;
  
  determining, using a multi-turn model, a second weighted prediction of at least one of a domain classification, intent classification, and slot type of the second natural language expression using at least one of the first natural language expression and contextual information; and
  
  performing an action based on the second weighted prediction of the second natural language expression.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10):
  - 2. The system of claim 1, further comprising combining the first weighted prediction and the second weighted prediction to produce a final prediction relative to an intent of the second natural language expression.
  - 3. The system of claim 1, wherein the first natural language expression and the second natural language expression are at least one of a spoken language input and a textual input.
  - 4. The system of claim 1, wherein determining the first weighted prediction comprises evaluating the first natural language expression in isolation.
  - 5. The system of claim 4, wherein evaluating the first natural language expression in isolation comprises at least:
    - classifying the first natural language expression into a supported domain of the single-turn model;
      
      classifying the first natural language expression into a supported intent of the single-turn model; and
      
      extracting at least one semantic word from the first natural language expression and filling at least one supported slot type of the single-turn model with the at least one semantic word.
  - 6. The system of claim 1, wherein evaluating the second natural language expression using contextual information comprises at least:
    - classifying the second natural language expression into a supported domain of the single-turn model using contextual information;
      
      classifying the second natural language expression into a supported intent of the single-turn model using contextual information; and
      
      extracting at least one semantic word from the second natural language expression and filling at least one supported slot type of the multi-turn model with the at least one semantic word using contextual information.
  - 7. The system of claim 1, wherein the contextual information includes at least one of information extracted from the first received natural language expression, a response to the first received natural language expression, client context, and knowledge content.
  - 8. The system of claim 1, wherein determining the first weighted prediction comprises calculating a first score indicative of a probability of the first weighted prediction being correct.
  - 9. The system of claim 8, wherein determining the second weighted prediction comprises calculating a second score indicative of a probability of the second weighted prediction being correct.
  - 10. The system of claim 9, wherein combining the first weighted prediction and the second weighted prediction to produce a final prediction comprises combining the first score and the second score.

11. A system comprising:
- a statistical model for receiving at least a first natural language expression and a second natural language expression during a conversational session, wherein each of the first natural language expression and the second natural language expression include at least one of words, terms, and phrases;
  
  a single-turn model for determining a first prediction of at least one of a domain classification, intent classification, and slot type of each of the first natural language expression and the second natural language expression;
  
  a multi-turn model for determining a second prediction of at least one of a domain classification, intent classification, and slot type of each of the first natural language expression and the second natural language expression;
  
  a combination model for combining the first prediction and the second prediction of each of the first natural language expression and the second natural language expression to produce a final prediction relative to an intent of at least the second natural language expression; and
  
  a final model for performing an action based on the final prediction of at least the second natural language expression.
- Dependent claims (12, 13, 14, 15, 16, 17):
  - 12. The system of claim 11, wherein performing an action based on the final prediction comprises responding to the second natural language expression.
  - 13. The system of claim 12, wherein responding to the second natural language expression includes an answer to the second natural language expression based on the final prediction of at least the second natural language expression.
  - 14. The system of claim 12, wherein responding to the second natural language expression includes at least one of asking a question and performing a task.
  - 15. The system of claim 11, wherein determining a first prediction for the first natural language expression and the second natural language expression comprises evaluating the first natural language expression and the second natural language expression in isolation.
  - 16. The system of claim 11, wherein determining a second prediction for the first natural language expression and the second natural language expression comprises evaluating the first natural language expression and the second natural language expression using contextual information.
  - 17. The system of claim 16, wherein evaluating the second natural language expression using contextual information comprises evaluating a combination of the first natural language expression, the first prediction for the at least first and second natural language expressions, client context, and knowledge content.

18. One or more computer-readable storage media, having computer-executable instructions which, when executed by at least one processor, perform a method for building a statistical model for contextual language understanding, comprising:
- receiving a first natural language expression, wherein the first natural language expression includes at least one of words, terms, and phrases;
  
  performing a first action based on a first prediction determined by a single-turn model and a second prediction determined by a multi-turn model;
  
  receiving a second natural language expression, wherein the second natural language expression includes at least one of words, terms, and phrases;
  
  evaluating at least the first natural language expression, the first action, the first prediction, the second prediction, and the second natural language expression to generate contextual information;
  
  aggregating the contextual information into the multi-turn model; and
  
  performing a second action based on evaluating at least the first natural language expression, the first action, the first prediction, the second prediction, and the second natural language expression.
- Dependent claims (19, 20):
  - 19. The computer-readable storage media of claim 18, wherein the second action is a response to at least one of the first natural language expression and the second natural language expression.
  - 20. The computer-readable storage media of claim 18, wherein the first natural language expression and the second natural language expression are at least one of a spoken language input and a textual input.
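
The turn-by-turn flow recited in claim 18 (act on a first expression, then fold that turn into the context used for the second) might be sketched as follows. This is a toy illustration, not the claimed implementation: the keyword lookup, the `Turn` and `DialogueContext` types, the carry-over rule, and the action strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical keyword-to-domain table standing in for a trained single-turn model.
KEYWORDS = {"weather": "weather", "forecast": "weather", "meeting": "calendar"}


@dataclass
class Turn:
    expression: str
    single_turn_prediction: Optional[str]
    multi_turn_prediction: Optional[str]
    action: str


@dataclass
class DialogueContext:
    """Contextual information aggregated over the session (hypothetical type)."""
    turns: List[Turn] = field(default_factory=list)

    def aggregate(self, turn: Turn) -> None:
        self.turns.append(turn)

    def previous_domain(self) -> Optional[str]:
        return self.turns[-1].multi_turn_prediction if self.turns else None


def single_turn_domain(expression: str) -> Optional[str]:
    """Classify the expression in isolation using the keyword table."""
    for word in expression.lower().split():
        if word in KEYWORDS:
            return KEYWORDS[word]
    return None


def handle(expression: str, context: DialogueContext) -> Turn:
    """Predict, act, and aggregate the turn into the context."""
    single = single_turn_domain(expression)
    # Assumed carry-over rule: fall back to the previous turn's domain
    # when the expression alone gives no signal.
    multi = single or context.previous_domain()
    turn = Turn(expression, single, multi, action=f"respond:{multi or 'unknown'}")
    context.aggregate(turn)
    return turn


context = DialogueContext()
first = handle("what is the weather in Seattle", context)
second = handle("how about tomorrow", context)
print(second.action)
```

Here the second expression carries no domain keyword, so the isolated prediction is empty and the action is chosen from the context built up by the first turn.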

Current Assignee
Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee
Microsoft Technology Licensing LLC (Microsoft Corporation)
Inventors
Sarikaya, Ruhi; Xu, Puyang; Rochette, Alexandre; Celikyilmaz, Asli
Primary Examiner(s)
Lerner, Martin

Application Number

US14/556,874
Publication Number

US 20160154792A1
Time in Patent Office

939 Days
Field of Search

704/9, 704/257, 704/275
CPC Class Codes

G06F 16/90332   Natural language query form...

G06F 40/30   Semantic analysis

G06F 40/35   Discourse or dialogue repre...
