System and Method for Automated Testing of Complicated Dialog Systems
First Claim
1. A method of predicting user satisfaction of a dialog system, comprising:
defining an understanding ability measure of a set of measures, corresponding to the dialog system understanding of a user input compared to the user understanding;
defining an efficiency measure of the set of measures, corresponding to the number of dialog turns required to perform an action defined by a dialog between the user and the dialog system;
defining an action appropriateness measure of the set of measures, corresponding to an appropriateness of one or more responses of the dialog system during each dialog turn in the dialog;
assigning weights to each measure of the set of measures; and
combining the weighted measures in a defined combinatorial equation to compute a user satisfaction score.
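The last two steps of the claim (assigning weights and combining the weighted measures) can be sketched as follows. This is a minimal illustration, assuming the "defined combinatorial equation" is a linear weighted sum with an optional intercept; the claim itself does not fix that form. The function name `satisfaction_score`, the measure names, and all numeric values are hypothetical.

```python
from typing import Mapping


def satisfaction_score(
    measures: Mapping[str, float],
    weights: Mapping[str, float],
    intercept: float = 0.0,
) -> float:
    """Combine weighted measures into one score.

    Assumption: the combination is a linear weighted sum. The claim only
    requires "a defined combinatorial equation", so other forms are possible.
    """
    missing = set(weights) - set(measures)
    if missing:
        raise KeyError(f"no value supplied for measures: {sorted(missing)}")
    return intercept + sum(weights[name] * measures[name] for name in weights)


# Hypothetical values, invented for illustration only.
measures = {"understanding": 0.8, "efficiency": 6.0, "appropriateness": 0.9}
weights = {"understanding": 2.0, "efficiency": -0.1, "appropriateness": 1.5}
score = satisfaction_score(measures, weights, intercept=1.0)
```

A negative weight on the efficiency measure reflects the assumption that more dialog turns lower satisfaction; the sign and size of each weight would come from whatever procedure assigns the weights.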
Abstract
Embodiments of an automated dialog system testing method and component are described. The automated testing method and system supplements real human-based testing with simulated user input and incorporates a set of evaluation measures that focus on three basic aspects of task-oriented dialog systems: understanding ability, efficiency, and the appropriateness of system actions. These measures are first applied to a corpus generated between a dialog system and a group of human users, to validate the measures against the human users' satisfaction levels. Results generally show that the measures are significantly correlated with these satisfaction levels. A regression model is then built to predict user satisfaction scores from the evaluation measures. The regression model is applied to a simulated dialog corpus trained from the real user corpus, and the results show that the user satisfaction scores estimated from the simulated dialogs do not differ significantly from the real users' satisfaction scores. These evaluation measures can then be used to assess system performance based on the estimated user satisfaction.
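The regression step described in the abstract can be illustrated with a short sketch. It assumes an ordinary least-squares linear model solved through the normal equations; the abstract says only "a regression model", so the model type is an assumption. The helper names `fit_linear_regression` and `predict` are hypothetical, and the training numbers are synthetic values invented for illustration, not data from the patent.

```python
from typing import List, Sequence


def fit_linear_regression(
    X: Sequence[Sequence[float]], y: Sequence[float]
) -> List[float]:
    """Ordinary least squares via the normal equations (A^T A) b = A^T y.

    Returns [intercept, w_1, ..., w_k]. Assumption: a linear model is used.
    """
    rows = [[1.0, *map(float, r)] for r in X]  # prepend an intercept column
    n = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("measures are collinear; the model cannot be fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict(coeffs: Sequence[float], measures: Sequence[float]) -> float:
    """Estimate a satisfaction score for one dialog from its measures."""
    return coeffs[0] + sum(w * m for w, m in zip(coeffs[1:], measures))


# Synthetic training set: (understanding, efficiency in turns, appropriateness)
# paired with a satisfaction rating. All numbers are invented for illustration.
real_user_measures = [
    (0.9, 5, 0.8),
    (0.6, 9, 0.5),
    (0.8, 6, 0.9),
    (0.4, 12, 0.3),
    (0.7, 7, 0.6),
]
real_user_ratings = [3.5, 2.05, 3.35, 1.05, 2.6]

coeffs = fit_linear_regression(real_user_measures, real_user_ratings)
# Apply the fitted model to the measures of one simulated dialog.
estimated = predict(coeffs, (0.75, 8, 0.7))
```

In this sketch the model is fitted once on measures from real-user dialogs and then reused unchanged on simulated dialogs, which mirrors the order of steps in the abstract.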
21 Claims
1. A method of predicting user satisfaction of a dialog system, comprising:
defining an understanding ability measure of a set of measures, corresponding to the dialog system understanding of a user input compared to the user understanding;
defining an efficiency measure of the set of measures, corresponding to the number of dialog turns required to perform an action defined by a dialog between the user and the dialog system;
defining an action appropriateness measure of the set of measures, corresponding to an appropriateness of one or more responses of the dialog system during each dialog turn in the dialog;
assigning weights to each measure of the set of measures; and
combining the weighted measures in a defined combinatorial equation to compute a user satisfaction score.
4. The method of claim 4 wherein the semantic representations include a plurality of defined slots with a structure for different aspects of the spoken utterances.
11. A method of testing a dialog system comprising:
providing simulated user input to the dialog system, the user input comprising spoken utterances consisting of a task to be performed by the dialog system;
measuring a level of understanding of the dialog system of the user input;
measuring an efficiency of response by the dialog system to the user input; and
measuring an appropriateness of response by the dialog system to the user input.
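The three measurements named in claim 11 can be sketched over a logged dialog. This is an illustrative reading, not the claimed implementation: it assumes each turn records the slots the simulated user intended, the slots the system extracted, and a label saying whether the system response was judged appropriate. The `Turn` record, the `evaluate_dialog` function, and the scoring rules (exact slot match per turn, turn count as efficiency) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Turn:
    user_intent: Dict[str, str]   # slots the simulated user meant to convey
    system_parse: Dict[str, str]  # slots the dialog system extracted
    response_ok: bool             # label: was the system response appropriate?


def evaluate_dialog(turns: List[Turn]) -> Dict[str, float]:
    """Compute the three measures for one dialog (assumed definitions)."""
    if not turns:
        raise ValueError("a dialog needs at least one turn")
    understood = sum(t.user_intent == t.system_parse for t in turns)
    appropriate = sum(t.response_ok for t in turns)
    return {
        # Share of turns where the system's parse matches the user's intent.
        "understanding": understood / len(turns),
        # Number of dialog turns used; fewer turns means higher efficiency.
        "efficiency": float(len(turns)),
        # Share of turns whose system response was labeled appropriate.
        "appropriateness": appropriate / len(turns),
    }


# Hypothetical three-turn dialog for a restaurant-search task.
dialog = [
    Turn({"cuisine": "thai"}, {"cuisine": "thai"}, True),
    Turn({"area": "downtown"}, {"area": "uptown"}, False),
    Turn({"area": "downtown"}, {"area": "downtown"}, True),
]
result = evaluate_dialog(dialog)
```

The resulting dictionary has the same three keys a weighted combination or regression model would consume, so per-dialog measures from simulated input can be scored in the same way as those from real users.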
Specification