Method and system of reviewing the behavior of an interactive speech recognition application
Abstract
A method and apparatus are provided for improving the performance of an interactive speech application. The interactive speech application is developed and deployed for use by one or more callers. During execution, the interactive speech application stores, in a log, event information that describes each task carried out by the interactive speech application in response to interaction with the one or more callers. The application also stores one or more sets of audio information, in which each of the sets of audio information is associated with one or more utterances by one of the callers. Each of the sets of audio information is associated with one of the tasks represented in the log. After the log is established, an analytical report is displayed. The report describes selective actions taken by the interactive speech application while executing, and selective actions taken by one or more callers while interacting with the interactive speech application. Information in the analytical report is selected so as to identify one or more potential performance problems in the interactive speech application. While the analytical report is displayed, when the analytical report reaches a point at which the audio information was previously recorded and stored, the audio information may be replayed and analyzed. The interactive speech application is modified based on the analytical report. Accordingly, the interactive speech application may be improved based upon its actual performance, and its actual performance may be evaluated in detail based on specific call events and caller responses to application actions.
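The logging and reporting flow summarized above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not something drawn from the patent: the `|`-separated `KEY=value` log format, the token names (`CALL`, `EVNT`, `NAME`, `AUDIO`, `RESULT`, `CONF`), and the helper names `parse_log` and `format_report`. It shows one way a program might parse stored event tokens and list a call's events in sequence, marking utterances that have recorded audio available for replay.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical log format (not taken from the patent): one event per line,
# written as "|"-separated KEY=value tokens.
SAMPLE_LOG = """\
CALL=1|EVNT=prompt|NAME=welcome
CALL=1|EVNT=utterance|AUDIO=utt_0001.wav
CALL=1|EVNT=reaction|RESULT=checking|CONF=0.91
CALL=2|EVNT=prompt|NAME=welcome
"""


@dataclass
class Event:
    call_id: str            # which call the event belongs to
    kind: str               # "prompt", "utterance", or "reaction" (assumed names)
    tokens: Dict[str, str]  # remaining KEY=value tokens for the event


def parse_log(text: str) -> List[Event]:
    """Parse each non-empty log line into an Event."""
    events = []
    for line in text.splitlines():
        if not line.strip():
            continue
        tokens = dict(field.split("=", 1) for field in line.split("|"))
        events.append(Event(tokens.pop("CALL"), tokens.pop("EVNT"), tokens))
    return events


def format_report(events: List[Event], call_id: str) -> List[str]:
    """List the events of one call in order, flagging replayable utterances."""
    selected = (e for e in events if e.call_id == call_id)
    lines = []
    for step, ev in enumerate(selected, start=1):
        details = ", ".join(f"{k}={v}" for k, v in ev.tokens.items())
        line = f"{step}. {ev.kind}: {details}"
        if ev.kind == "utterance" and "AUDIO" in ev.tokens:
            line += " [replay]"  # stands in for a replay control in a real UI
        lines.append(line)
    return lines


if __name__ == "__main__":
    for row in format_report(parse_log(SAMPLE_LOG), "1"):
        print(row)
```

In this sketch the `[replay]` marker is only a text placeholder; an actual report viewer would attach a control that plays the stored audio file.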
22 Claims
-
1. A method of reviewing the behavior of an interactive speech recognition application that plays prompts to a caller, that interprets utterance responses from the caller, and that reacts in response, the method comprising the acts of:
-
causing the speech recognition application to store, in a computer-readable file, event information, including token information identifying prompts played to the caller by the speech recognition application, token information about the utterance responses from the caller, and token information about the reactions of the application to the utterance responses;
in response to a user input, using a computer program to parse the computer readable file to detect token information in the file and to display a formatted report therefrom, wherein the format of the report illustrates a sequence of events occurring during the call and includes controls allowing a user to replay an utterance response so that the user may review the behavior of the speech recognition application including reviewing the utterance response interpreted by the application and including reviewing the reactions thereto by the application.
in response to a user selection of a call from the call listing, causing the filtering act to be performed for said selected call.
-
4. The method of claim 1 wherein the speech recognition application is constructed from one or more modules of software logic and wherein the event information includes token information identifying the modules used by the speech recognition application when handling a call, and wherein the report illustrates when the modules were used in the sequence of events.
-
5. The method of claim 1 wherein the token information about the reactions of the application includes information identifying a recognition decision about the meaning of an utterance response as interpreted by the application.
-
6. The method of claim 1 wherein the execution of speech recognition applications is characterized by a call flow descriptive of the path of logic operations actually executed by the application from a plurality of such paths, and wherein the event information includes token information about call flow decisions.
-
7. The method of claim 1 wherein the interpretation of an utterance response by the speech recognition application is characterized by context role information, including information identifying whether the application is collecting an original utterance, confirmation context and disambiguation context, and wherein the event information includes tokens about the role information of the speech application.
-
8. The method of claim 1 wherein the event information includes token information identifying whether the speech recognition application is identifying a barge-in utterance response, wherein a barge-in utterance response is an utterance made by a user before a preceding prompt is completely played to the user, and wherein the report illustrates that the response was a barge-in response.
-
9. The method of claim 1 wherein the speech recognition application identifies N-best results when interpreting an utterance response, and wherein the event information includes token information about the N-best results and wherein the report includes information indicative of the N-best results.
-
10. The method of claim 9 wherein the token information about the N-best results includes match confidence information indicating the confidence that the interpretation is correct.
-
11. The method of claim 1 wherein the event information about the reactions to the utterance responses includes token information about an utterance decision made by the speech recognition application in reacting to the utterance, wherein an utterance decision token information identifies whether the speech application interpreted the utterance with or without high confidence, and whether the user accepted or rejected a confirmation prompt.
-
-
12. A system for reviewing the behavior of an interactive speech recognition application that plays prompts to a caller, that interprets utterance responses from the caller, and that reacts in response, comprising:
-
logic to cause the speech recognition application to store, in a computer-readable file, event information, including token information identifying prompts played to the caller by the speech recognition application, token information about the utterance responses from the caller, and token information about the reactions of the application to the utterance responses;
logic, responsive to a user input, to parse the computer-readable file to detect token information in the file and to display a formatted report therefrom, wherein the format of the report illustrates a sequence of events occurring during the call and includes controls allowing a user to replay an utterance response so that the user may review the behavior of the speech recognition application including reviewing the utterance response interpreted by the application and including reviewing the reactions thereto by the application.
logic to display a call listing descriptive of a plurality of calls handled by the speech recognition application, wherein each of said calls has caused corresponding event information to be stored in said file, and logic, responsive to a user selection of a call from the call listing, to cause the filtering logic to filter information about the selected call.
-
15. The system of claim 12 wherein the speech recognition application is constructed from one or more modules of software logic and wherein the event information includes token information identifying the modules used by the speech recognition application when handling a call, and wherein the report illustrates when the modules were used in the sequence of events.
-
16. The system of claim 12 wherein the token information about the reactions of the application includes information identifying a recognition decision about the meaning of an utterance response as interpreted by the application.
-
17. The system of claim 12 wherein the execution of speech recognition applications is characterized by a call flow descriptive of the path of logic operations actually executed by the application from a plurality of such paths, and wherein the event information includes token information about call flow decisions.
-
18. The system of claim 12 wherein the interpretation of an utterance response by the speech recognition application is characterized by context role information, including information identifying whether the application is collecting an original utterance, confirmation context and disambiguation context, and wherein the event information includes tokens about the role information of the speech application.
-
19. The system of claim 12 wherein the event information includes token information identifying whether the speech recognition application is identifying a barge-in utterance response, wherein a barge-in utterance response is an utterance made by a user before a preceding prompt is completely played to the user, and wherein the report illustrates that the response was a barge-in response.
-
20. The system of claim 12 wherein the speech recognition application identifies N-best results when interpreting an utterance response, and wherein the event information includes token information about the N-best results and wherein the report includes information indicative of the N-best results.
-
21. The system of claim 20 wherein the token information about the N-best results includes match confidence information indicating the confidence that the interpretation is correct.
-
22. The system of claim 12 wherein the event information about the reactions to the utterance responses includes token information about an utterance decision made by the speech recognition application in reacting to the utterance, wherein an utterance decision token information identifies whether the speech application interpreted the utterance with or without high confidence, and whether the user accepted or rejected a confirmation prompt.
-
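Several of the dependent claims above describe token information about barge-in responses, N-best results with match confidence, and utterance decisions. The following is a minimal sketch of how such values might be derived before being logged. The threshold `HIGH_CONFIDENCE`, the decision labels, and the function names are all hypothetical choices made for illustration; the claims do not specify them.

```python
from typing import List, Optional, Tuple

# Hypothetical threshold; the claims do not state a numeric value.
HIGH_CONFIDENCE = 0.8


def is_barge_in(prompt_start: float, prompt_duration: float,
                utterance_start: float) -> bool:
    """True if the utterance began before the prompt finished playing."""
    return prompt_start <= utterance_start < prompt_start + prompt_duration


def n_best(hypotheses: List[Tuple[str, float]], n: int) -> List[Tuple[str, float]]:
    """Return the n hypotheses with the highest match confidence."""
    return sorted(hypotheses, key=lambda h: h[1], reverse=True)[:n]


def utterance_decision(confidence: float,
                       confirmed: Optional[bool] = None) -> str:
    """Classify a recognition result using assumed decision labels.

    `confirmed` is None when no confirmation prompt was answered,
    otherwise True (caller accepted) or False (caller rejected).
    """
    if confidence >= HIGH_CONFIDENCE:
        return "high_confidence"
    if confirmed is None:
        return "low_confidence"
    return "confirm_accepted" if confirmed else "confirm_rejected"


if __name__ == "__main__":
    results = n_best([("savings", 0.4), ("checking", 0.9), ("credit", 0.1)], 2)
    print(results)
    print(is_barge_in(0.0, 3.0, 1.5))
    print(utterance_decision(results[0][1]))
```

Times here are in seconds from the start of the call, which is also an assumption; any consistent clock would serve.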
Specification