System and Method for Automated Testing of Complicated Dialog Systems

US 20090306995A1
Filed: 06/04/2008
Published: 12/10/2009
Est. Priority Date: 06/04/2008
Status: Active Grant


Abstract

Embodiments of an automated dialog system testing method and component are described. This automated testing method and system supplements real human-based testing with simulated user input and incorporates a set of evaluation measures that focus on three basic aspects of task-oriented dialog systems, namely, understanding ability, efficiency, and the appropriateness of system actions. These measures are first applied to a corpus generated between a dialog system and a group of human users to validate the measures against the human users' satisfaction levels. Results generally show that these measures are significantly correlated with these satisfaction levels. A regression model is then built to predict the user satisfaction scores using these evaluation measures. The regression model is applied to a simulated dialog corpus trained from the above real user corpus, and the results show that the user satisfaction scores estimated from the simulated dialogs do not differ significantly from the real users' satisfaction scores. These evaluation measures can then be used to assess the system performance based on the estimated user satisfaction.

21 Claims

1. A method of predicting user satisfaction of a dialog system, comprising:
- defining an understanding ability measure of a set of measures, corresponding to the dialog system understanding of a user input compared to the user understanding;
  
  defining an efficiency measure of the set of measures, corresponding to the number of dialog turns required to perform an action defined by a dialog between the user and the dialog system;
  
  defining an action appropriateness measure of the set of measures, corresponding to an appropriateness of one or more responses of the dialog system during each dialog turn in the dialog;
  
  assigning weights to each measure of the set of measures; and
  
  combining the weighted measures in a defined combinatorial equation to compute a user satisfaction score.
- - 2. The method of claim 1 wherein the weights and defined combinatorial equation are based on a specific dialog application.
  - 3. The method of claim 2 further comprising:
    - receiving the user input as spoken utterances in a spoken language unit of the dialog system; and
      
      generating semantic representations of the user input in a dialog manager coupled to the spoken language unit.
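
The final step of claim 1 (combining weighted measures into a user satisfaction score) can be illustrated with a minimal Python sketch. The claim only requires "a defined combinatorial equation"; the linear weighted sum used here is an assumption, and the names `Measures`, `satisfaction_score`, and the example weights are hypothetical, not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Measures:
    """The three measures named in claim 1, each assumed to lie in [0, 1]."""
    understanding: float
    efficiency: float
    appropriateness: float


def satisfaction_score(m: Measures, weights: dict, intercept: float = 0.0) -> float:
    """Combine the weighted measures into one score.

    Assumption: the combinatorial equation is a linear weighted sum.
    The claim itself does not fix the form of the equation.
    """
    return (
        intercept
        + weights["understanding"] * m.understanding
        + weights["efficiency"] * m.efficiency
        + weights["appropriateness"] * m.appropriateness
    )


# Hypothetical weights, chosen only for illustration.
example_weights = {"understanding": 2.0, "efficiency": 1.0, "appropriateness": 1.0}
example_score = satisfaction_score(Measures(0.8, 0.5, 1.0), example_weights)
```

Per claim 2, both the weights and the equation would be chosen for a specific dialog application, so a different application could swap in other weights or a non-linear form.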

4. The method of claim 3 wherein the semantic representations include a plurality of defined slots with a structure for different aspects of the spoken utterances.
- - 5. The method of claim 4 wherein the set of measures comprise one or more comparisons of the semantic representations based on the dialog system understanding of the user input, and the user understanding of a dialog system response to the user input.
  - 6. The method of claim 5 wherein the understanding ability measure compares values of constraints specified by the user and corresponding values as understood by the dialog system.
  - 7. The method of claim 6 wherein the understanding ability measure is calculated by averaging a percent agreement after each dialog turn of the dialog.
  - 8. The method of claim 5 wherein the efficiency measure comprises a ratio between the number of turns in a perfect understanding case and the number of actual turns for the dialog.
  - 9. The method of claim 5 wherein the action appropriateness measure is based on a predefinition of inappropriate responses programmed into the dialog system.
  - 10. The method of claim 5 wherein the action appropriateness measure is based on one or more responses by the dialog system to misunderstood user requirements.
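
Claims 6 to 8 describe two of the measures concretely enough to sketch. The following Python example is illustrative only: it assumes constraints are stored as slot-to-value dictionaries, that percent agreement is computed over the slots the user has specified, and that an empty constraint set counts as full agreement. The function names and the restaurant-style slots are hypothetical.

```python
def percent_agreement(user_constraints: dict, system_constraints: dict) -> float:
    """Fraction of user-specified slot values that the system understood identically.

    Assumption: the denominator is the number of slots the user specified.
    """
    if not user_constraints:
        return 1.0  # assumption: nothing to misunderstand yet
    matches = sum(
        1
        for slot, value in user_constraints.items()
        if system_constraints.get(slot) == value
    )
    return matches / len(user_constraints)


def understanding_ability(turns: list) -> float:
    """Average the percent agreement measured after each dialog turn (claim 7)."""
    if not turns:
        raise ValueError("a dialog needs at least one turn")
    return sum(percent_agreement(user, system) for user, system in turns) / len(turns)


def efficiency(perfect_turns: int, actual_turns: int) -> float:
    """Ratio of turns in the perfect-understanding case to actual turns (claim 8)."""
    if actual_turns <= 0:
        raise ValueError("actual_turns must be positive")
    return perfect_turns / actual_turns


# Hypothetical two-turn dialog: the system mishears the price in turn two.
example_turns = [
    ({"cuisine": "thai"}, {"cuisine": "thai"}),
    ({"cuisine": "thai", "price": "cheap"}, {"cuisine": "thai", "price": "expensive"}),
]
example_understanding = understanding_ability(example_turns)
example_efficiency = efficiency(perfect_turns=4, actual_turns=8)
```

In this sketch the first turn agrees fully and the second agrees on one of two slots, so the averaged understanding ability is 0.75, and a dialog that took 8 turns where 4 would have sufficed has an efficiency of 0.5.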

11. A method of testing a dialog system comprising:
- providing simulated user input to the dialog system, the user input comprising spoken utterances consisting of a task to be performed by the dialog system;
  
  measuring a level of understanding of the dialog system of the user input;
  
  measuring an efficiency of response by the dialog system to the user input; and
  
  measuring an appropriateness of response by the dialog system to the user input.
- - 12. The method of claim 11 wherein the user input is part of a dialog with the dialog system, wherein the dialog consists of a number of input/response turns between the simulated user and the dialog system.
  - 13. The method of claim 12 wherein the understanding ability measure is calculated by averaging a percent agreement after each dialog turn of the dialog.
  - 14. The method of claim 12 wherein the efficiency measure comprises a ratio between the number of turns in a perfect understanding case and the number of actual turns for the dialog.
  - 15. The method of claim 12 wherein the action appropriateness measure is based on a predefinition of inappropriate responses programmed into the dialog system.
  - 16. The method of claim 12 wherein the action appropriateness measure is based on one or more responses by the dialog system to misunderstood user requirements.
  - 17. The method of claim 12 further comprising generating semantic representations of the user input, wherein the semantic representations include a plurality of defined slots for different aspects of the spoken utterances.
  - 18. The method of claim 17 wherein the set of measures comprise one or more comparisons of the semantic representations based on the dialog system understanding of the simulated user input, and the user understanding of a dialog system response to the user input.
  - 19. The method of claim 18 wherein the semantic representations include a speech act and action from the user utterance.
  - 20. The method of claim 19 further comprising assigning values to each of the measured level of understanding, measured efficiency of response, and measured appropriateness of response; and
    - combining the assigned values in a predefined combinatorial equation to derive a predicted user satisfaction for the simulated user input.
  - 21. The method of claim 20 further comprising deriving weights for each of the assigned values based on a defined regression model.
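
Claim 21 derives the weights from "a defined regression model". As a rough sketch of how such weights could be obtained, the Python example below fits an ordinary least-squares linear model using only the standard library. The choice of ordinary least squares, the function name `fit_weights`, and all data values are assumptions made for illustration; the data is synthetic, generated from known weights, and is not from the patent or any real user study.

```python
def fit_weights(rows, scores):
    """Ordinary least squares via the normal equations.

    rows: one (understanding, efficiency, appropriateness) tuple per dialog.
    scores: observed user satisfaction scores, one per dialog.
    Returns [intercept, w_understanding, w_efficiency, w_appropriateness].
    """
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 models the intercept
    k = len(X[0])
    # Augmented matrix for (X^T X) w = X^T y
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, scores))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("measures are collinear; weights are not unique")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic data: scores generated exactly from these hypothetical weights.
true_weights = [1.0, 2.0, 0.5, 1.5]
example_rows = [
    (0.9, 0.8, 1.0),
    (0.5, 0.4, 0.0),
    (0.7, 1.0, 1.0),
    (0.2, 0.5, 0.0),
    (1.0, 0.6, 0.5),
]
example_scores = [
    true_weights[0] + sum(w * v for w, v in zip(true_weights[1:], row))
    for row in example_rows
]
fitted = fit_weights(example_rows, example_scores)
```

Because the synthetic scores contain no noise, the fit recovers the generating weights up to floating-point error. With real satisfaction ratings the fit would only approximate them, and the fitted weights would then be applied to measures computed on simulated dialogs to predict satisfaction.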

Current Assignee: Robert Bosch GmbH
Original Assignee: Robert Bosch GmbH
Inventors: Ai, Hua; Weng, Fuliang
Granted Patent: US 8,296,144 B2
US Class Current: 705/301
CPC Class Codes:
G06Q 10/103 Workflow collaboration or p...
G10L 15/01 Assessment or evaluation of...
