Spoken utterance classification training for a speech recognition system

US 9,082,403 B2
Filed: 12/15/2011
Issued: 07/14/2015
Est. Priority Date: 12/15/2011
Status: Active Grant

Abstract

The subject disclosure is directed towards training a classifier for spoken utterances without relying on human-assistance. The spoken utterances may be related to a voice menu program for which a speech comprehension component interprets the spoken utterances into voice menu options. The speech comprehension component provides confirmations to some of the spoken utterances in order to accurately assign a semantic label. For each spoken utterance with a denied confirmation, the speech comprehension component automatically generates a pseudo-semantic label that is consistent with the denied confirmation and selected from a set of potential semantic labels and updates a classification model associated with the classifier using the pseudo-semantic label.
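The labeling step described in the abstract can be illustrated with a minimal Python sketch. It assumes the classifier exposes a score for every potential semantic label and that the pseudo-semantic label is simply the best-scoring label the caller has not denied. The function name `assign_pseudo_label`, the label names, and the scores are all hypothetical; the patent text here does not prescribe this particular selection rule.

```python
from typing import Dict, Set


def assign_pseudo_label(scores: Dict[str, float], rejected: Set[str]) -> str:
    """Return the best-scoring potential label that was not denied by the caller."""
    # Keep only labels that remain consistent with the denied confirmation.
    candidates = {label: s for label, s in scores.items() if label not in rejected}
    if not candidates:
        raise ValueError("every potential semantic label was rejected")
    return max(candidates, key=candidates.get)


# Hypothetical classifier scores for one utterance in a voice menu.
scores = {"billing": 0.55, "technical_support": 0.30, "sales": 0.15}

# The system asked "Did you want billing?" and the caller answered "no".
pseudo_label = assign_pseudo_label(scores, rejected={"billing"})
print(pseudo_label)  # technical_support
```

The rejected label is passed as a set so that an utterance with several denied confirmations can exclude all of them at once.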

20 Claims

1. In a computing environment, a method performed at least in part on at least one processor, comprising:
- receiving, by a speech recognition system, spoken utterances and associated confirmations;
- processing, by a classifier of the speech recognition system, the spoken utterances and associated confirmations from output data associated with the speech recognition system, including for each spoken utterance having a denied confirmation, assigning a pseudo-semantic label that is a representation of an association between the denied confirmation and a rejected semantic label selected from a set of potential semantic labels, and updating a classification model associated with the classifier using each assigned pseudo-semantic label, wherein the denied confirmation comprises determining a negative response to a confirmation prompt delivered by the speech recognition system.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10):
  - 2. The method of claim 1, wherein assigning the pseudo-semantic label further comprises partitioning the spoken utterances into groupings of spoken utterances having unsettled denied confirmations and a current training set comprising settled spoken utterances, wherein the settled spoken utterances comprise accepted confirmations.
  - 3. The method of claim 2, wherein updating the classification model further comprises for each grouping, determining pseudo-semantic labels for the spoken utterances having the unsettled denied confirmations and training the classification model with a combination of the current training set and the spoken utterances having the unsettled denied confirmations.
  - 4. The method of claim 2, wherein updating the classification model further comprises for each grouping, determining a number of discrepancies between a semantic labeling produced by the classifier and the combination of the current training set and the spoken utterances having the unsettled denied confirmations.
  - 5. The method of claim 4 further comprising:
    - selecting a grouping having a smallest number of discrepancies and incorporating the spoken utterances having the unsettled denied confirmations into the current training set.
  - 6. The method of claim 5 further comprising:
    - training the classification model with the current training set; and
    - producing classifications results for the spoken utterances having the denied confirmations.
  - 7. The method of claim 6 further comprising:
    - updating the pseudo-semantic labels based on the classification results; and
    - selecting another grouping having a smallest number of discrepancies to be incorporated into the current training set.
  - 8. The method of claim 1, wherein updating the classification model further comprises training the classification model with a current training set comprising spoken utterances having accepted confirmations and spoken utterances having settled denied confirmations.
  - 9. The method of claim 1 further comprising:
    - performing the step of assigning the pseudo-semantic label and the step of updating the classification model for another training iteration.
  - 10. The method of claim 1 further comprising:
    - repeating the method if a pseudo-semantic label assignment is modified.
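The iterative procedure in claims 2 through 7 can be sketched as a greedy loop: partition the data into a settled training set and groupings of unsettled denied utterances, pseudo-label each grouping, train on the combination, count discrepancies, and incorporate the grouping with the fewest. The sketch below is only an illustration under stated assumptions. `WordCountClassifier` is a toy stand-in for the classification model, and the function name, label names, and utterances are hypothetical rather than taken from the patent.

```python
from collections import Counter
from typing import List, Tuple

Example = Tuple[str, str]  # (utterance text, semantic label)
Denied = Tuple[str, str]   # (utterance text, rejected semantic label)


class WordCountClassifier:
    """Toy model: a label's score is how often it has seen the utterance's words."""

    def __init__(self, labels: List[str]):
        self.labels = list(labels)
        self.counts = {label: Counter() for label in self.labels}

    def fit(self, examples: List[Example]) -> "WordCountClassifier":
        self.counts = {label: Counter() for label in self.labels}
        for text, label in examples:
            self.counts[label].update(text.lower().split())
        return self

    def predict(self, text: str, rejected=frozenset()) -> str:
        words = text.lower().split()
        allowed = [label for label in self.labels if label not in rejected]
        # Ties go to the label listed first, which keeps the sketch deterministic.
        return max(allowed, key=lambda label: sum(self.counts[label][w] for w in words))


def incorporate_groupings(labels, settled: List[Example],
                          groupings: List[List[Denied]]) -> List[Example]:
    current = list(settled)      # current training set (settled utterances)
    remaining = list(groupings)  # groupings with unsettled denied confirmations
    while remaining:
        base = WordCountClassifier(labels).fit(current)
        best = None
        for idx, group in enumerate(remaining):
            # Pseudo-label each utterance with the best label other than the denied one.
            pseudo = [(text, base.predict(text, {rej})) for text, rej in group]
            combined = current + pseudo
            trial = WordCountClassifier(labels).fit(combined)
            # Discrepancies: utterances the retrained model labels differently.
            discrepancies = sum(trial.predict(t) != y for t, y in combined)
            if best is None or discrepancies < best[0]:
                best = (discrepancies, idx, pseudo)
        _, idx, pseudo = best
        current.extend(pseudo)   # the selected grouping becomes settled
        remaining.pop(idx)
    return current


labels = ["billing", "support", "sales"]
settled = [("pay my bill", "billing"),
           ("my internet is down", "support"),
           ("buy a new plan", "sales")]
groupings = [[("question about my bill payment", "support")],
             [("the internet plan is down", "sales")]]
training_set = incorporate_groupings(labels, settled, groupings)
```

Because the base model is refit at the start of every pass, the pseudo-semantic labels of the groupings still waiting are recomputed from the latest classification results before the next grouping is selected.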

11. In a computing environment, a system, comprising:
- at least one processor;
- a memory coupled to the at least one processor; and
- a speech comprehension component implemented on the at least one processor and configured to receive spoken utterances, process the spoken utterances using a classifier, generate output data comprising settled semantic labels for at least a portion of the spoken utterances and one or more confirmations associated with one or more of the spoken utterances, and assign pseudo-semantic labels for at least another portion of the spoken utterances having denied confirmations using the output data generated, the pseudo-semantic labels comprising representations of an association between the denied confirmations and one or more rejected semantic labels, the speech comprehension component further configured to train a classification model associated with the classifier with the pseudo-semantic labels and the settled semantic labels, and to use the classification model to navigate a speaking entity to desired information.
- Dependent claims (12, 13, 14, 15, 16, 17):
  - 12. The system of claim 11, wherein the speech comprehension component is further configured to automatically generate a pseudo-semantic label for each spoken utterance having a denied confirmation, the pseudo-semantic label being a concept consistent with the denied confirmation, satisfying the classification model, and selected from a set of potential semantic labels, the set of potential semantic labels comprising potential concepts.
  - 13. The system of claim 11, wherein the speech comprehension component is further configured to identify groupings of spoken utterances having unsettled denied confirmations and to generate current training data comprising spoken utterances having accepted confirmations and spoken utterances having settled denied confirmations.
  - 14. The system of claim 13, wherein the speech comprehension component is further configured to determine pseudo-semantic labels for each grouping of spoken utterances having unsettled denied confirmations and to train the classification model with a combination of the current training data and the each grouping.
  - 15. The system of claim 14, wherein the speech comprehension component is further configured to determine a number of discrepancies between updated semantic labels produced by the classifier and a combination of the current training data and at least one of the groupings of the spoken utterances having unsettled denied confirmations.
  - 16. The system of claim 13, wherein the speech comprehension component is further configured to incorporate a grouping having a smallest number of discrepancies between updated semantic labels produced by the classifier and a combination of the current training data and the each grouping into the current training data, to train the classifier with the current training data and to classify the output data used to produce another set of semantic labels for the spoken utterances.
  - 17. The system of claim 16, wherein the speech comprehension component is further configured to select a remaining one of the grouping to be incorporated into the current training data based on discrepancy indicia.

18. A method comprising:
- receiving, by at least one processing device, a log file comprising data associated with one or more spoken utterances, system prompt information, and denied confirmations associated with a sub-set of the one or more spoken utterances;
- calculating pseudo-semantic labels for each spoken utterance in the sub-set of the one or more spoken utterances, the pseudo-semantic labels comprising a classification result that is consistent with both the data and the denied confirmations, is selected from a set of potential concepts germane to the data and the system prompt information, and represents an association between the denied confirmation and one or more rejected semantic labels;
- training a classifier using the calculated pseudo-semantic labels; and
- using the trained classifier to process input speech data.
- Dependent claims (19, 20):
  - 19. The method of claim 18 further comprising:
    - generating training data comprising spoken utterances having accepted confirmations and spoken utterances having settled denied confirmations; and
    - training a classification model associated with the classifier with the training data.
  - 20. The method of claim 19 further comprising:
    - identifying a grouping of spoken utterances having unsettled denied confirmations from the log file;
    - calculating pseudo-semantic labels for the spoken utterances in the grouping; and
    - training the classification model with a combination of the training data and the calculated pseudo-semantic labels for the spoken utterances in the grouping.
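The log-file input of claims 18 through 20 can be illustrated with a short sketch that separates accepted confirmations from denied ones before any pseudo-semantic labels are calculated. The JSON-lines layout and the field names `utterance`, `prompt_label`, and `confirmation` are assumptions made for this example; the claims do not specify a log format.

```python
import json
from typing import Iterable, List, Tuple


def split_log(lines: Iterable[str]) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    """Split log records into accepted (settled) and denied confirmations."""
    settled = []  # (utterance, accepted semantic label)
    denied = []   # (utterance, rejected semantic label)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        answer = record.get("confirmation")
        if answer == "yes":
            settled.append((record["utterance"], record["prompt_label"]))
        elif answer == "no":
            denied.append((record["utterance"], record["prompt_label"]))
        # Records without a confirmation prompt are skipped in this sketch.
    return settled, denied


# Hypothetical log content: one JSON object per line.
sample_log = [
    '{"utterance": "I want to pay my bill", "prompt_label": "billing", "confirmation": "yes"}',
    '{"utterance": "my router is broken", "prompt_label": "sales", "confirmation": "no"}',
    '{"utterance": "hello", "prompt_label": null, "confirmation": null}',
]
settled, denied = split_log(sample_log)
```

The `settled` list can serve as the initial training data of claim 19, while each entry in `denied` carries the rejected label that a later pseudo-labeling step must avoid.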

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Inventors: Ju, Yun-Cheng; Droppo, James Garnet III
Primary Examiner(s): Godbold, Douglas
Application Number: US13/326,659
Publication Number: US 20130159000A1
Time in Patent Office: 1,307 Days
Field of Search: 704/243-245
US Class Current: 1/1
CPC Class Codes: G10L 15/1822 Parsing for meaning underst...
