Tuning reusable software components in a speech application

US 8,386,248 B2
Filed: 09/22/2006
Issued: 02/26/2013
Est. Priority Date: 09/22/2006
Status: Active Grant

3 Assignments

0 Petitions

Abstract

A method of tuning reusable dialog components within a speech application can include detecting speech recognition events generated from a plurality of recognitions performed for a field of a reusable dialog component. The speech recognition events can be generated over a plurality of interactive voice response sessions. The method also can include automatically computing a suggested value for a tuning parameter corresponding to the field of the reusable dialog component according, at least in part, to the speech recognition events.

43 Citations

18 Claims

1. A method of tuning reusable dialog components within a speech application comprising:
- detecting speech recognition events generated from a plurality of speech recognitions, the plurality of speech recognitions performed by a speech recognition engine for a reusable dialog component that does not include any speech recognition engine, the reusable dialog component including a field and a confidence threshold that is associated with the field and specifies a minimally acceptable confidence score for any recognition result provided for the field, the field corresponding to a piece of information for which the speech application is configured to prompt a user, the speech recognition events being generated over a plurality of interactive voice response sessions;
  
  re-prompting the user for the piece of information if a confidence score associated with a recognition result is below the confidence threshold, the recognition result being generated from a speech recognition performed by the speech recognition engine on a user utterance uttered in response to the speech application prompting the user for the piece of information, the confidence score being generated by the speech recognition engine, being associated with a speech recognition event generated from the speech recognition performed on the user utterance and indicating a confidence in an accuracy of the recognition result; and
  
  automatically computing a suggested value for the confidence threshold by applying a statistical processing technique to confidence scores associated with a plurality of the speech recognition events, wherein a majority of the confidence scores for the plurality of the speech recognition events fall within a range having a low value and a high value, and wherein automatically computing the suggested value comprises computing a suggested value that is substantially equal to the low value for the range.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8)
- - 2. The method of claim 1, wherein applying the statistical processing technique comprises determining an average of the confidence scores.
  - 3. The method of claim 1, further comprising selecting a particular speech recognition event type to be detected, wherein only speech recognition events of the particular type are used to determine the suggested value.
  - 4. The method of claim 1, further comprising:
    - storing the suggested value for the confidence threshold within a model; and
      
      upon a further execution of the reusable dialog component, initializing the confidence threshold using the suggested value from the model.
  - 5. The method of claim 1, further comprising:
    - storing the suggested value for the confidence threshold within a model; and
      
      providing an interface that facilitates real-time access to the model, wherein the interface facilitates observance of the confidence threshold as the confidence threshold is dynamically updated as operation of the interactive voice response system continues.
  - 6. The method of claim 1, further comprising computing a plurality of suggested values for the confidence threshold as operation of the interactive voice response system continues and storing the plurality of suggested values.
  - 7. The method of claim 1, further comprising counting each instantiation of the reusable dialog component over the plurality of interactive voice response sessions.
  - 8. The method of claim 1, wherein each of the plurality of speech recognitions is performed by at least one speech recognition engine, and wherein the reusable dialog component is a separate component from the at least one speech recognition engine.
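
Claim 1 can be illustrated with a minimal sketch. The claim does not say how the range holding a majority of the confidence scores is chosen, so the sketch assumes the narrowest window of sorted scores that contains a strict majority; that choice, and the names `should_reprompt`, `majority_range`, and `suggest_threshold`, are hypothetical and not taken from the patent.

```python
def should_reprompt(confidence: float, threshold: float) -> bool:
    """Re-prompt when the engine's confidence score is below the field's threshold."""
    return confidence < threshold


def majority_range(scores: list[float]) -> tuple[float, float]:
    """Return (low, high) for the narrowest range holding a strict majority of scores.

    Assumption: "narrowest" is one possible reading; the claim only requires that
    a majority of the scores fall within the range.
    """
    if not scores:
        raise ValueError("no confidence scores collected")
    ordered = sorted(scores)
    k = len(ordered) // 2 + 1  # smallest count that is a strict majority
    # Slide a window of k consecutive sorted scores and keep the narrowest one.
    start = min(
        range(len(ordered) - k + 1),
        key=lambda i: ordered[i + k - 1] - ordered[i],
    )
    return ordered[start], ordered[start + k - 1]


def suggest_threshold(scores: list[float]) -> float:
    """Suggested confidence threshold: the low value of the majority range."""
    low, _high = majority_range(scores)
    return low


# Confidence scores gathered for one field over several IVR sessions (made-up data).
scores = [0.20, 0.58, 0.60, 0.62, 0.64, 0.80, 0.95]
suggested = suggest_threshold(scores)
```

With these made-up scores, four of the seven values lie between 0.58 and 0.64, so the sketch suggests 0.58; a later recognition scoring 0.50 would then trigger a re-prompt.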

9. A system for tuning reusable dialog components within a speech application comprising:
- at least one hardware processor that executes:
  
  at least one reusable dialog component that includes a field and a confidence threshold that is associated with the field and specifies a minimally acceptable confidence score for any recognition result provided for the field, the field corresponding to a piece of information for which the speech application is configured to prompt a user;
  
  a listener configured to detect speech recognition events generated during execution of the reusable dialog component, wherein the speech recognition events have a specific type and are associated with the field of the reusable dialog component, and configured to calculate a suggested value for the confidence threshold by applying a statistical processing technique to confidence scores associated with a plurality of the speech recognition events, wherein each of the plurality of speech recognition events comprises a recognition result generated by a speech recognition engine and a confidence score indicating a confidence in an accuracy of the recognition result, wherein a majority of the confidence scores for the plurality of the speech recognition events fall within a range having a low value and a high value, and wherein the listener is configured to calculate the suggested value by computing a suggested value that is substantially equal to the low value for the range; and
  
  a model configured to store the suggested value for the confidence threshold, wherein the speech application is configured to re-prompt the user for the piece of information if a confidence score associated with a recognition result is below the confidence threshold, the recognition result being provided by a speech recognition performed on a user utterance uttered in response to the speech application prompting the user for the piece of information, the confidence score being associated with a speech recognition event generated from the speech recognition performed on the user utterance.
- Dependent Claims (10, 11)
- - 10. The system of claim 9, wherein the listener is configured to calculate the suggested value for the confidence threshold using only the speech recognition events that are of a specified type.
  - 11. The system of claim 9, wherein the reusable dialog component, upon execution, initializes the confidence threshold using the suggested value stored in the model.
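
The listener, model, and component arrangement of claims 9 through 11 might be wired together roughly as follows. Every class, field, and event-type name here is hypothetical. The statistic is a plainly swapped-in stand-in: it drops the lowest and highest fifth of the scores, so the remaining range always holds a majority, and returns the low end of that range.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class RecognitionEvent:
    field_name: str    # field of the reusable dialog component
    event_type: str    # hypothetical type label, e.g. "match" or "no_match"
    confidence: float  # confidence score reported by the recognition engine


@dataclass
class TuningModel:
    """Stores the latest suggested confidence threshold for each field."""
    suggested: dict[str, float] = field(default_factory=dict)


def central_low(scores: list[float]) -> float:
    """Low end of the range left after trimming the lowest and highest fifth."""
    ordered = sorted(scores)
    return ordered[len(ordered) // 5]


class Listener:
    """Collects scores for one field and one event type, then updates the model."""

    def __init__(self, model: TuningModel, field_name: str, event_type: str,
                 statistic: Callable[[list[float]], float] = central_low) -> None:
        self.model = model
        self.field_name = field_name
        self.event_type = event_type
        self.statistic = statistic
        self._scores: list[float] = []

    def on_event(self, event: RecognitionEvent) -> None:
        # Claim 10: ignore events of any other type (or for any other field).
        if event.field_name != self.field_name or event.event_type != self.event_type:
            return
        self._scores.append(event.confidence)
        self.model.suggested[self.field_name] = self.statistic(self._scores)


@dataclass
class DialogComponent:
    field_name: str
    confidence_threshold: float

    @classmethod
    def from_model(cls, field_name: str, model: TuningModel,
                   default: float = 0.5) -> "DialogComponent":
        # Claim 11: initialize the threshold from the stored suggestion, if any.
        # The 0.5 fallback is an arbitrary illustrative default.
        return cls(field_name, model.suggested.get(field_name, default))
```

Keeping the statistic as a constructor argument is a design choice of the sketch, so that a different statistical processing technique can be substituted without changing the listener.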

12. A tangible computer-readable medium, having stored thereon a computer program having a plurality of code sections for tuning reusable dialog components within a speech application, the computer-readable medium comprising:
- code for detecting speech recognition events generated from a plurality of speech recognitions, the plurality of speech recognitions performed by a speech recognition engine for a reusable dialog component that does not include any speech recognition engine, the reusable dialog component including a field and a confidence threshold that is associated with the field and specifies a minimally acceptable confidence score for any recognition result provided for the field, the field corresponding to a piece of information for which the speech application is configured to prompt a user, the speech recognition events being generated over a plurality of interactive voice response sessions;
  
  code for re-prompting the user for the piece of information if a confidence score associated with a recognition result is below the confidence threshold, the recognition result being generated from a speech recognition performed by the speech recognition engine on a user utterance uttered in response to the speech application prompting the user for the piece of information, the confidence score being generated by the speech recognition engine, being associated with a speech recognition event generated from the speech recognition performed on the user utterance and indicating a confidence in an accuracy of the recognition result; and
  
  code for automatically computing a suggested value for the confidence threshold by applying a statistical processing technique to confidence scores associated with a plurality of the speech recognition events, wherein a majority of the confidence scores for the plurality of speech recognition events fall within a range having a low value and a high value, and wherein automatically computing the suggested value comprises computing a suggested value that is substantially equal to the low value for the range.
- Dependent Claims (13, 14, 15, 16, 17, 18)
- - 13. The tangible computer-readable medium of claim 12, wherein the code for applying the statistical processing technique further comprises code for determining an average of the confidence scores.
  - 14. The tangible computer-readable medium of claim 12, further comprising code for selecting a particular speech recognition event type to be detected, wherein only speech recognition events of the particular type are used to determine the suggested value.
  - 15. The tangible computer-readable medium of claim 12, further comprising:
    - code for storing the suggested value for the confidence threshold within a model; and
      
      code for, upon a further execution of the reusable dialog component, initializing the confidence threshold using the suggested values from the model.
  - 16. The tangible computer-readable medium of claim 12, further comprising:
    - code for storing the suggested value for the confidence threshold within a model; and
      
      code for providing an interface that facilitates real-time access to the model, wherein the interface facilitates observance of the confidence threshold as the confidence threshold is dynamically updated as operation of the interactive voice response system continues.
  - 17. The tangible computer-readable medium of claim 12, further comprising code for computing a plurality of suggested values for the confidence threshold as operation of the interactive voice response system continues and for storing the plurality of suggested values.
  - 18. The tangible computer-readable medium of claim 12, further comprising code for counting each instantiation of the reusable dialog component over the plurality of interactive voice response sessions.
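
The bookkeeping in claims 16 through 18 (reading the current threshold while the system runs, keeping a plurality of suggested values, and counting component instantiations) could be sketched as below. `TuningHistory` and its methods are hypothetical names, and an in-memory structure is assumed purely for illustration; the claims do not prescribe a storage format or interface.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TuningHistory:
    # Every suggested value computed so far, per field, oldest first.
    suggestions: dict[str, list[float]] = field(default_factory=dict)
    # Number of times each reusable dialog component has been instantiated.
    instantiations: Counter = field(default_factory=Counter)

    def record_instantiation(self, component_name: str) -> None:
        self.instantiations[component_name] += 1

    def record_suggestion(self, field_name: str, value: float) -> None:
        self.suggestions.setdefault(field_name, []).append(value)

    def current(self, field_name: str) -> Optional[float]:
        """Most recent suggestion for a field, or None if none was recorded."""
        values = self.suggestions.get(field_name)
        return values[-1] if values else None
```

A monitoring view could poll `current()` to observe the threshold as it is updated, while the full `suggestions` list shows how the suggested value has drifted over time.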

Current Assignee
Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee
Nuance Communications, Inc. (Microsoft Corporation)
Inventors
Dhanakshirur, Girish; Mandalia, Baiju D.; Silva, Aimee
Primary Examiner(s)
YEN, ERIC L

Application Number
US11/534,320
Publication Number
US 20080077402A1
Time in Patent Office
2,349 Days
Field of Search
704/243, 704/244, 704/270, 704/275
US Class Current
704/243
CPC Class Codes
G10L 15/063   Training
G10L 15/197   Probabilistic grammars, e.g...
G10L 15/22   Procedures used during a sp...
