System and method for dialog modeling

US 9,129,601 B2
Filed: 11/26/2008
Issued: 09/08/2015
Est. Priority Date: 11/26/2008
Status: Active Grant

First Claim

1. A method comprising:

training a plurality of hierarchical, parsed-based dialog models comprising a shift-reduce model, a start-complete model, and a connection path model, wherein the plurality of hierarchical, parsed-based dialog models operate incrementally from left to right and only analyze an immediately preceding dialog context;

parsing, via a processor, spoken dialogs with a hierarchical, parse-based dialog model from the plurality of hierarchical, parsed-based dialog models, to yield parsed spoken dialogs, wherein the spoken dialogs are annotated to indicate dialog acts, feature vectors, and task/subtask information;

constructing a functional task structure of the parsed spoken dialogs, wherein the functional task structure does not comprise a rhetorical structure of the parsed spoken dialogs;

predicting a likely next dialog act using the functional task structure, the feature vectors, and the hierarchical, parsed-based dialog model; and

selecting a language model for a next utterance based on the likely next dialog act.
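
The last two steps of the claim (predicting a likely next dialog act, then selecting a language model for the next utterance) can be illustrated with a minimal sketch. It is not the patented method: a simple frequency count over the immediately preceding turn stands in for the trained parse-based models and feature vectors, and every label and language model name (`Request(Item)`, `lm_catalog_items`, and so on) is a hypothetical placeholder.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Dict, List, Tuple

# A turn is (subtask label, dialog act). All labels here are hypothetical.
Turn = Tuple[str, str]

# Hypothetical registry mapping a predicted dialog act to a language model name.
LANGUAGE_MODELS: Dict[str, str] = {
    "Request(Item)": "lm_catalog_items",
    "Provide(Address)": "lm_addresses",
}
DEFAULT_LANGUAGE_MODEL = "lm_general"


def train_next_act_counts(dialogs: List[List[Turn]]) -> DefaultDict[Turn, Counter]:
    """Count which dialog act follows each (subtask, dialog act) context."""
    counts: DefaultDict[Turn, Counter] = defaultdict(Counter)
    for dialog in dialogs:
        # Pair every turn with the turn that immediately follows it.
        for context, (_, next_act) in zip(dialog, dialog[1:]):
            counts[context][next_act] += 1
    return counts


def predict_next_act(counts: DefaultDict[Turn, Counter], context: Turn,
                     fallback: str = "Unknown") -> str:
    """Return the most frequent next act for the preceding context."""
    seen = counts.get(context)
    if not seen:
        return fallback
    return seen.most_common(1)[0][0]


def select_language_model(predicted_act: str) -> str:
    """Pick the language model registered for the predicted act."""
    return LANGUAGE_MODELS.get(predicted_act, DEFAULT_LANGUAGE_MODEL)
```

In this sketch a recognizer would load the model returned by `select_language_model` before decoding the next utterance, falling back to a general model when the predicted act has no dedicated one.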

1 Assignment

0 Petitions

Abstract

Disclosed herein are systems, computer-implemented methods, and computer-readable media for dialog modeling. The method includes receiving spoken dialogs annotated to indicate dialog acts and task/subtask information, parsing the spoken dialogs with a hierarchical, parse-based dialog model which operates incrementally from left to right and which only analyzes a preceding dialog context to generate parsed spoken dialogs, and constructing a functional task structure of the parsed spoken dialogs. The method can further either interpret user utterances with the functional task structure of the parsed spoken dialogs or plan system responses to user utterances with the functional task structure of the parsed spoken dialogs. The parse-based dialog model can be a shift-reduce model, a start-complete model, or a connection path model.
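
As a rough illustration of what a functional task structure could look like in code, the sketch below groups annotated utterances into subtasks under a single task. The class names, the two-level layout, and the grouping rule are assumptions made for illustration; the abstract does not specify a data layout, and a real structure may nest subtasks more deeply.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Utterance:
    speaker: str
    text: str
    dialog_act: str  # annotated dialog act label (hypothetical values)
    subtask: str     # annotated subtask label (hypothetical values)


@dataclass
class Subtask:
    label: str
    utterances: List[Utterance] = field(default_factory=list)


@dataclass
class Dialog:
    task: str
    subtasks: List[Subtask] = field(default_factory=list)


def build_task_structure(task: str, utterances: List[Utterance]) -> Dialog:
    """Group utterances into subtasks, reading strictly left to right.

    Only the most recently opened subtask is consulted, so the structure
    can be extended one utterance at a time as a dialog progresses.
    """
    dialog = Dialog(task)
    for utt in utterances:
        # Open a new subtask whenever the label differs from the current one.
        if not dialog.subtasks or dialog.subtasks[-1].label != utt.subtask:
            dialog.subtasks.append(Subtask(utt.subtask))
        dialog.subtasks[-1].utterances.append(utt)
    return dialog
```

The structure records what each stretch of dialog is for (its task and subtask), not how utterances relate rhetorically.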

46 Citations

20 Claims

1. A method comprising:
- training a plurality of hierarchical, parsed-based dialog models comprising a shift-reduce model, a start-complete model, and a connection path model, wherein the plurality of hierarchical, parsed-based dialog models operate incrementally from left to right and only analyze an immediately preceding dialog context;
  
  parsing, via a processor, spoken dialogs with a hierarchical, parse-based dialog model from the plurality of hierarchical, parsed-based dialog models, to yield parsed spoken dialogs, wherein the spoken dialogs are annotated to indicate dialog acts, feature vectors, and task/subtask information;
  
  constructing a functional task structure of the parsed spoken dialogs, wherein the functional task structure does not comprise a rhetorical structure of the parsed spoken dialogs;
  
  predicting a likely next dialog act using the functional task structure, the feature vectors, and the hierarchical, parsed-based dialog model; and
  
  selecting a language model for a next utterance based on the likely next dialog act.
- - 2. The method of claim 1, wherein the shift-reduce model has a stack and a tree which (a) shifts each utterance onto the stack, (b) inspects the stack, and (c) based on the stack inspection, performs a reduce action that creates subtrees in the tree.
  - 3. The method of claim 1, wherein the start-complete model uses a stack to maintain a global parse state and produces a dialog task structure directly without producing an equivalent tree.
  - 4. The method of claim 1, wherein the connection path model does not use a stack to maintain a global parse state, and wherein the connection path model (a) directly predicts a connection path from a root to a terminal for each received spoken dialog, and (b) creates a parse tree representing the connection path for each received spoken dialog.
  - 5. The method of claim 1, further comprising:
    - incrementally receiving user utterances as a dialog progresses;
      
      assigning a dialog act to a current user utterance based on the functional task structure of the parsed spoken dialogs;
      
      assigning a subtask label to the current user utterance based on the functional task structure of the parsed spoken dialogs;
      
      predicting a system subtask label for a next system utterance based on the functional task structure of the parsed spoken dialogs;
      
      predicting a system dialog act for a next system utterance based on the functional task structure of the parsed spoken dialogs;
      
      predicting a next subtask label for a next user utterance based on the functional task structure of the parsed spoken dialogs; and
      
      predicting a next dialog act for a next user utterance based on the functional task structure of the parsed spoken dialogs.
  - 6. The method of claim 5, wherein interpreting and predicting are modeled as maximum entropy classifiers which select dialog acts or subtask labels from a pre-selected list.
  - 7. The method of claim 5, further comprising measuring dialog efficiency at different dialog stages.
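
The shift, inspect, and reduce cycle described in claim 2 can be sketched as follows. This is a toy under stated assumptions: the reduce decision is a fixed rule (reduce when the subtask label changes) standing in for a trained model, each utterance is reduced to a hypothetical `(subtask, dialog_act)` pair, and the "tree" is a flat list of subtrees.

```python
from typing import List, Tuple

Utt = Tuple[str, str]             # (subtask label, dialog act), hypothetical labels
Subtree = Tuple[str, List[str]]   # (subtask label, dialog acts it covers)


def shift_reduce_parse(utterances: List[Utt]) -> List[Subtree]:
    """Build subtask subtrees incrementally, left to right."""
    stack: List[Utt] = []
    tree: List[Subtree] = []
    for utt in utterances:
        # (a) Shift the incoming utterance onto the stack.
        stack.append(utt)
        # (b) Inspect the stack: did the new utterance change subtask?
        if len(stack) > 1 and stack[-2][0] != utt[0]:
            # (c) Reduce everything below the new utterance into a subtree.
            below = stack[:-1]
            del stack[:-1]
            tree.append((below[0][0], [act for _, act in below]))
    # End of dialog: reduce whatever is still on the stack.
    if stack:
        tree.append((stack[0][0], [act for _, act in stack]))
    return tree
```

Because the decision uses only what is already on the stack, the parser never looks ahead at utterances that have not yet arrived.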

8. A system comprising:
- a processor; and
  
  a non-transitory computer-readable storage medium having instructions stored, which when executed on the processor, cause the processor to perform operations comprising;
  
  training a plurality of hierarchical, parsed-based dialog models comprising a shift-reduce model, a start-complete model, and a connection path model, wherein the plurality of hierarchical, parsed-based dialog models operate incrementally from left to right and only analyze an immediately preceding dialog context;
  
  parsing, via a processor, spoken dialogs with a hierarchical, parse-based dialog model from the plurality of hierarchical, parsed-based dialog models, to yield parsed spoken dialogs, wherein the spoken dialogs are annotated to indicate dialog acts, feature vectors, and task/subtask information;
  
  constructing a functional task structure of the parsed spoken dialogs, wherein the functional task structure does not comprise a rhetorical structure of the parsed spoken dialogs;
  
  predicting a likely next dialog act using the functional task structure, the feature vectors, and the hierarchical, parsed-based dialog model; and
  
  selecting a language model for a next utterance based on the likely next dialog act.
- - 9. The system of claim 8, wherein the shift-reduce model has a stack and a tree which (a) shifts each utterance onto the stack, (b) inspects the stack, and (c) based on the stack inspection, performs a reduce action that creates subtrees in the tree.
  - 10. The system of claim 8, wherein the start-complete model uses a stack to maintain a global parse state and produces a dialog task structure directly without producing an equivalent tree.
  - 11. The system of claim 8, wherein the connection path model does not use a stack to maintain a global parse state, and wherein the connection path model (a) directly predicts a connection path from a root to a terminal for each received spoken dialog, and (b) creates a parse tree representing the connection path for each received spoken dialog.
  - 12. The system of claim 8, the non-transitory computer-readable storage medium having additional instructions stored which, when executed by the processor, result in operations comprising:
    - incrementally receiving user utterances as a dialog progresses;
      
      assigning a dialog act to a current user utterance based on the functional task structure of the parsed spoken dialogs;
      
      assigning a subtask label to the current user utterance based on the functional task structure of the parsed spoken dialogs;
      
      predicting a system subtask label for a next system utterance based on the functional task structure of the parsed spoken dialogs;
      
      predicting a system dialog act for a next system utterance based on the functional task structure of the parsed spoken dialogs;
      
      predicting a next subtask label for a next user utterance based on the functional task structure of the parsed spoken dialogs; and
      
      predicting a next dialog act for a next user utterance based on the functional task structure of the parsed spoken dialogs.
  - 13. The system of claim 12, the non-transitory computer-readable storage medium having additional instructions stored which, when executed by the processor, cause the processor to perform further operations comprising modeling predictions as maximum entropy classifiers which select dialog acts or subtask labels from a pre-selected list.
  - 14. The system of claim 12, the non-transitory computer-readable storage medium having additional instructions stored which, when executed by the processor, cause the processor to perform further operations comprising measuring dialog efficiency at different dialog stages.
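
Claim 10 describes a start-complete model that keeps the global parse state on a stack and emits the task structure directly, without an intermediate tree. A minimal sketch of that idea follows. The action inventory (`start:<label>` and `complete`) and the span output format are assumptions made for illustration, and the actions are supplied as input here, whereas a trained model would predict them.

```python
from typing import List, Tuple

# (subtask label, first utterance index, last utterance index, nesting depth)
Span = Tuple[str, int, int, int]


def start_complete_parse(steps: List[List[str]]) -> List[Span]:
    """Turn per-utterance actions into labeled subtask spans.

    Each entry of `steps` lists the actions attached to one utterance.
    An empty list means the utterance continues the open subtask.
    """
    stack: List[Tuple[str, int]] = []  # open subtasks: (label, start index)
    spans: List[Span] = []
    for i, actions in enumerate(steps):
        for action in actions:
            if action.startswith("start:"):
                # Open a new subtask beginning at this utterance.
                stack.append((action[len("start:"):], i))
            elif action == "complete":
                if not stack:
                    raise ValueError(f"complete with no open subtask at utterance {i}")
                label, first = stack.pop()
                # Emit the finished subtask directly; no tree is built.
                spans.append((label, first, i, len(stack)))
            else:
                raise ValueError(f"unknown action: {action!r}")
    return spans
```

Spans come out in completion order, so an enclosing task appears after the subtasks it contains.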

15. A computer-readable device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
- training a plurality of hierarchical, parsed-based dialog models comprising a shift-reduce model, a start-complete model, and a connection path model, wherein the plurality of hierarchical, parsed-based dialog models operate incrementally from left to right and only analyze an immediately preceding dialog context;
  
  parsing, via a processor, spoken dialogs with a hierarchical, parse-based dialog model from the plurality of hierarchical, parsed-based dialog models, to yield parsed spoken dialogs, wherein the spoken dialogs are annotated to indicate dialog acts, feature vectors, and task/subtask information;
  
  constructing a functional task structure of the parsed spoken dialogs, wherein the functional task structure does not comprise a rhetorical structure of the parsed spoken dialogs;
  
  predicting a likely next dialog act using the functional task structure, the feature vectors, and the hierarchical, parsed-based dialog model; and
  
  selecting a language model for a next utterance based on the likely next dialog act.
- - 16. The computer-readable device of claim 15, wherein the shift-reduce model has a stack and a tree which (a) shifts each utterance onto the stack, (b) inspects the stack, and (c) based on the stack inspection, performs a reduce action that creates subtrees in the tree.
  - 17. The computer-readable device of claim 15, wherein the start-complete model uses a stack to maintain a global parse state and produces a dialog task structure directly without producing an equivalent tree.
  - 18. The computer-readable device of claim 15, wherein the connection path model does not use a stack to maintain a global parse state, and wherein the connection path model (a) directly predicts a connection path from a root to a terminal for each received spoken dialog, and (b) creates a parse tree representing the connection path for each received spoken dialog.
  - 19. The computer-readable device of claim 15, having additional instructions stored which, when executed by the processor, cause the processor to perform further operations comprising:
    - incrementally receiving user utterances as a dialog progresses;
      
      assigning a dialog act to a current user utterance based on the functional task structure of the parsed spoken dialogs;
      
      assigning a subtask label to the current user utterance based on the functional task structure of the parsed spoken dialogs;
      
      predicting a system subtask label for a next system utterance based on the functional task structure of the parsed spoken dialogs;
      
      predicting a system dialog act for a next system utterance based on the functional task structure of the parsed spoken dialogs;
      
      predicting a next subtask label for a next user utterance based on the functional task structure of the parsed spoken dialogs; and
      
      predicting a next dialog act for a next user utterance based on the functional task structure of the parsed spoken dialogs.
  - 20. The computer-readable device of claim 19, having additional instructions stored which, when executed by the processor, cause the processor to perform further operations comprising measuring dialog efficiency at different dialog stages.
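
Claim 18 describes a connection path model that keeps no stack and instead predicts, for each utterance, a path from the root to the terminal. The sketch below shows one way such paths could be assembled into a parse tree. The path encoding (a list of `(label, opens_new)` pairs) and the attachment rule are assumptions made for illustration, and the paths are supplied as input rather than predicted by a trained model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def attach(root: Node, path: List[Tuple[str, bool]], terminal: str) -> None:
    """Attach one utterance to the tree along its connection path.

    `path` lists the subtask labels between the root and the utterance.
    `opens_new` is True when that step starts a new subtask instead of
    continuing the rightmost existing one.
    """
    node = root
    for label, opens_new in path:
        last = node.children[-1] if node.children else None
        if opens_new or last is None or last.label != label:
            # Start a fresh subtask node at this level.
            last = Node(label)
            node.children.append(last)
        node = last
    # The terminal is the utterance itself, here reduced to its dialog act.
    node.children.append(Node(terminal))


def to_nested(node: Node):
    """Render the tree as nested tuples for inspection."""
    if not node.children:
        return node.label
    return (node.label, [to_nested(child) for child in node.children])
```

Only the rightmost branch of the tree is ever extended, which is what lets the sketch work without a separate stack.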

Current Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Stent, Amanda; Bangalore, Srinivas
Primary Examiner(s): Desir, Pierre-Louis
Assistant Examiner(s): Kovacek, David M
Application Number: US12/324,340
Publication Number: US 20100131274A1
Time in Patent Office: 2,477 Days
Field of Search: 704/1-10, 704/200, 704/231-236, 704/250-257, 704/270-275, 704/E11.001-E11.007, 379/52, 379/67.1-88.28, 706/12-14, 706/45-61, 707/600-606, 707/790-812
US Class Current: 1/1
CPC Class Codes:
G06F 40/12   Use of codes for handling t...
G06F 40/137   Hierarchical processing, e....
G06F 40/154   Tree transformation for tre...
G10L 15/005   Language recognition
G10L 15/04   Segmentation; Word boundary...
G10L 15/063   Training
G10L 15/08   Speech classification or se...
G10L 15/18   using natural language mode...
G10L 15/183   using context dependencies,...
G10L 15/22   Procedures used during a sp...
G10L 2015/0638   Interactive procedures
G10L 25/12   the extracted parameters be...
