System and method for automated testing of complicated dialog systems

US 8,296,144 B2
Filed: 06/04/2008
Issued: 10/23/2012
Est. Priority Date: 06/04/2008
Status: Active Grant

Abstract

Embodiments of an automated dialog system testing method and component are described. This automated testing method and system supplements real human-based testing with simulated user input and incorporates a set of evaluation measures that focus on three basic aspects of task-oriented dialog systems, namely, understanding ability, efficiency, and the appropriateness of system actions. These measures are first applied on a corpus generated between a dialog system and a group of human users to demonstrate the validity of these measures with the human users' satisfaction levels. Results generally show that these measures are significantly correlated with these satisfaction levels. A regression model is then built to predict the user satisfaction scores using these evaluation measures. The regression model is applied on a simulated dialog corpus trained from the above real user corpus, and the results show that the user satisfaction scores estimated from the simulated dialogs do not significantly differ from the real users' satisfaction scores. These evaluation measures can then be used to assess the system performance based on the estimated user satisfaction.
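As an illustration of the regression step described in the abstract, the following is a minimal sketch, not the patented implementation. It assumes a linear model with an intercept fitted by ordinary least squares; the source states only that a regression model is built from the three measures and the survey scores. The function names (`fit_weights`, `predict_satisfaction`, `solve_linear_system`) and all numbers are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest pivot candidate into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_weights(measures, scores):
    """Least-squares fit of score ~ intercept + w . measures (normal equations)."""
    rows = [[1.0] + list(ms) for ms in measures]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * s for r, s in zip(rows, scores)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_satisfaction(weights, measure):
    """Combine the weighted measures into one estimated satisfaction score."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], measure))


# Made-up toy data, NOT taken from the patent: one row per human-user dialog,
# columns are (understanding ability, efficiency, action appropriateness).
human_measures = [
    (0.9, 0.8, 1.0),
    (0.6, 0.5, 0.7),
    (0.8, 0.4, 0.9),
    (0.5, 0.9, 0.6),
    (0.7, 0.6, 0.5),
]
survey_scores = [4.6, 3.4, 3.9, 3.5, 3.5]  # hypothetical survey results

weights = fit_weights(human_measures, survey_scores)

# Apply the fitted model to measures computed on a (hypothetical) simulated dialog.
simulated_estimate = predict_satisfaction(weights, (0.8, 0.7, 0.9))
```

In this sketch the weights fitted on the human-user corpus are reused unchanged on the simulated corpus, which mirrors the abstract's comparison of estimated and surveyed satisfaction scores. A real evaluation would also need a significance test, which is omitted here.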

10 Claims

1. A method of predicting user satisfaction of a dialog system, comprising:
- defining an understanding ability measure of a set of measures, corresponding to the dialog system understanding of a user input compared to the user understanding;
- defining an efficiency measure of the set of measures, corresponding to the number of dialog turns required to perform an action defined by a dialog between the user and the dialog system;
- defining an action appropriateness measure of the set of measures, corresponding to an appropriateness of one or more responses of the dialog system during each dialog turn in the dialog;
- applying the set of measures on a test dialog corpus generated between the dialog system and a group of human users;
- assigning weights to each measure of the set of measures to generate weighted measures, wherein the weight values are based on a defined regression model which is generated by a validation of the set of measures using user satisfaction scores obtained through user satisfaction surveys for the test dialog corpus;
- combining the weighted measures in a defined combinatorial equation to compute a user satisfaction score;
- building a simulated user that maintains a list of goals and agenda items to complete the goals by generating a simulated dialog corpus trained from the human-user generated test dialog corpus;
- applying the regression model to the simulated dialog corpus to generate an evaluation set of measures; and
- using the evaluation set of measures to validate the user satisfaction score.

2. The method of claim 1 wherein the weights and defined combinatorial equation based on a specific dialog application.

3. The method of claim 2 further comprising:
- receiving the user input as spoken utterances in a spoken language unit of the dialog system; and
- generating semantic representations of the user input in a dialog manager coupled to the spoken language unit.

4. The method of claim 3 wherein the semantic representations include a plurality of defined slots with a structure for different aspects of the spoken utterances.

5. The method of claim 4 wherein the set of measures comprise one or more comparisons of the semantic representations based on the dialog system understanding of the user input, and the user understanding of a dialog system response to the user input.

6. The method of claim 5 wherein the understanding ability measure compares values of constraints specified by the user and corresponding values as understood by the dialog system.

7. The method of claim 6 wherein the understanding ability measure is calculated by averaging a percent agreement after each dialog turn of the dialog.

8. The method of claim 5 wherein the efficiency measure comprises a ratio between the number of turns in a perfect understanding case and the number of actual turns for the dialog.

9. The method of claim 5 wherein the action appropriateness measure is based on a predefinition of inappropriate responses programmed into the dialog system.

10. The method of claim 5 wherein the action appropriateness measure is based on one or more responses by the dialog system to misunderstood user requirements.
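The three measures recited in claims 6 through 10 can be sketched roughly as follows. This is an illustrative reading only: the `Turn` structure, the slot names, the treatment of a turn with no constraints, and the scoring of action appropriateness as the share of unflagged turns are all assumptions, since the claims do not fix those details.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One dialog turn (hypothetical structure, not defined in the patent)."""
    user_constraints: dict        # constraint values specified by the user so far
    system_constraints: dict      # the same constraints as understood by the system
    inappropriate: bool = False   # system response flagged as inappropriate


def percent_agreement(user, system):
    """Fraction of user-specified constraint values the system understood identically."""
    if not user:
        return 1.0  # assumption: nothing specified yet counts as full agreement
    matches = sum(1 for slot, value in user.items() if system.get(slot) == value)
    return matches / len(user)


def understanding_ability(turns):
    """Claim 7: average the percent agreement after each dialog turn."""
    agreements = [
        percent_agreement(t.user_constraints, t.system_constraints) for t in turns
    ]
    return sum(agreements) / len(agreements)


def efficiency(perfect_case_turns, turns):
    """Claim 8: turns in a perfect-understanding case over actual turns."""
    return perfect_case_turns / len(turns)


def action_appropriateness(turns):
    """Assumed scoring: share of turns whose response was not flagged inappropriate."""
    return sum(1 for t in turns if not t.inappropriate) / len(turns)


# Hypothetical three-turn dialog; the slots and values are invented for illustration.
dialog = [
    Turn({"cuisine": "thai"}, {"cuisine": "thai"}),
    Turn(
        {"cuisine": "thai", "area": "downtown"},
        {"cuisine": "thai", "area": "uptown"},  # system misheard the area
        inappropriate=True,
    ),
    Turn(
        {"cuisine": "thai", "area": "downtown"},
        {"cuisine": "thai", "area": "downtown"},
    ),
]

measures = (
    understanding_ability(dialog),
    efficiency(perfect_case_turns=2, turns=dialog),
    action_appropriateness(dialog),
)
```

Each function returns a value between 0 and 1 under these assumptions, so the resulting tuple can be fed directly into a weighted combination of the kind recited in claim 1.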

Current Assignee: Robert Bosch GmbH
Original Assignee: Robert Bosch GmbH
Inventors: Weng, Fuliang; Ai, Hua
Primary Examiner(s): Saint Cyr, Leonard
Application Number: US12/133,123
Publication Number: US 20090306995A1
Time in Patent Office: 1,602 Days
Field of Search: None
US Class Current: 704/270
CPC Class Codes: G06Q 10/103 Workflow collaboration or p...; G10L 15/01 Assessment or evaluation of...
