Integrated speech recognition and semantic classification

US 7,856,351 B2
Filed: 01/19/2007
Issued: 12/21/2010
Est. Priority Date: 01/19/2007
Status: Active Grant



Abstract

A novel system integrates speech recognition and semantic classification, so that acoustic scores in a speech recognizer that accepts spoken utterances may be taken into account when training both language models and semantic classification models. For example, a joint association score may be defined that is indicative of a correspondence of a semantic class and a word sequence for an acoustic signal. The joint association score may incorporate parameters such as weighting parameters for signal-to-class modeling of the acoustic signal, language model parameters and scores, and acoustic model parameters and scores. The parameters may be revised to raise the joint association score of a target word sequence with a target semantic class relative to the joint association score of a competitor word sequence with the target semantic class. The parameters may be designed so that the semantic classification errors in the training data are minimized.
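
The joint association score described above can be illustrated with a minimal sketch. It assumes the score is a simple sum of log-scores from the acoustic model, the language model, and the semantic classifier; the function name, the `acoustic_weight` parameter, and the numeric values are hypothetical and are not taken from the patent.

```python
import math

def joint_association_score(acoustic_logp, lm_logp, class_logp, acoustic_weight=1.0):
    """Combine three log-scores for one (semantic class, word sequence) pair.

    acoustic_logp:   log P(signal | words), from the acoustic model
    lm_logp:         log P(words), from the language model
    class_logp:      log P(class | words), from the semantic classifier
    acoustic_weight: hypothetical scaling factor on the acoustic term
    """
    return class_logp + acoustic_weight * acoustic_logp + lm_logp

# Hypothetical scores for two word-sequence hypotheses of the same utterance,
# both evaluated against the same target semantic class.
target = joint_association_score(-12.0, -4.0, math.log(0.9))
competitor = joint_association_score(-11.5, -4.2, math.log(0.2))

# Training aims to make this margin larger.
margin = target - competitor
```

In this toy case the competitor has the better acoustic score, yet the target still wins once the classifier term is included, which is the kind of interaction the joint score is meant to capture.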


20 Claims

1. A computer-implemented method performed by a computer with a processor, of training model parameters, the method comprising:
- identifying a target word sequence from among a set of word sequences, such that the target word sequence has a highest joint association score with a target semantic class, wherein the joint association score is indicative of a correspondence of a semantic class and a word sequence for an acoustic signal, wherein the joint association score incorporates one or more parameters that are applied to one or more features of the word sequence for signal-to-class modeling of the acoustic signal, the one or more parameters including parameters applied to one or more features to match the acoustic signal to the word sequence and parameters applied to one or more features of the word sequence to match the word sequence to a semantic class;
  
  identifying, with the processor, a competitor word sequence from among the set of word sequences other than the target word sequence, such that the competitor word sequence has a highest remaining joint association score with any available semantic class other than the target semantic class; and
  
  revising, with the processor, one or more of the parameters to raise the joint association score of the target word sequence with the target semantic class relative to the joint association score of the competitor word sequence with the target semantic class.
  - 2. The method of claim 1, wherein revising the parameters comprises increasing one or more of the parameters that apply to signal-to-class modeling features of the target word sequence and not to signal-to-class modeling features of the competitor word sequence.
  - 3. The method of claim 1, wherein revising the parameters comprises decreasing one or more of the parameters that apply to signal-to-class modeling features of the competitor word sequence and not to signal-to-class modeling features of the target word sequence.
  - 4. The method of claim 1, wherein the signal-to-class modeling comprises an automatic speech recognition language model matching an acoustic signal with a word sequence, and the parameters are applied to lexical features of the word sequence.
  - 5. The method of claim 4, wherein the lexical features comprise n-grams in the word sequence.
  - 6. The method of claim 4, wherein the automatic speech recognition language model matches an acoustic signal with a word sequence using speech lattices.
  - 7. The method of claim 6, wherein the automatic speech recognition language model matches an acoustic signal with a word sequence by summing up acoustic scores of all paths of the speech lattices that correspond to possible word sequences for the acoustic signal.
  - 8. The method of claim 1, wherein the signal-to-class modeling comprises a semantic classification model matching a word sequence with a semantic class, and the parameters are applied to semantic features of a semantic class.
  - 9. The method of claim 8, wherein the semantic features correspond to lexical features of the word sequence.
  - 10. The method of claim 1, further comprising:
    - iteratively identifying additional competitor word sequences from among the remaining word sequences other than the target word sequence and previously identified competitor word sequences, such that each of the competitor word sequences has a highest remaining joint association score with any remaining semantic class that has not previously been used for a highest joint association score; and
      
      increasing one or more of the parameters that apply to signal-to-class modeling features of the target word sequence and not to signal-to-class modeling features of any of the competitor word sequences.
  - 11. The method of claim 1, further comprising:
    - identifying one or more additional target word sequences from among the set of word sequences, the additional target word sequences having highest joint association scores with one or more additional target semantic classes, for one or more additional acoustic signals; and
      
      for each of the target semantic classes, revising weighting factors to increase the joint association scores of the target word sequences with the target semantic classes.
  - 12. The method of claim 11, further comprising:
    - assigning a total loss function that parameterizes the joint association scores of the target word sequences and the competitor word sequences with the target semantic classes;
      
      using the total loss function as a weighting parameter applied to a component of the joint classification score that indicates a probability of a word sequence corresponding to an acoustic signal; and
      
      iteratively adjusting the weighting parameters applied to one or more features of the word sequence for signal-to-class modeling of the acoustic signal to reduce the total loss function.
  - 13. The method of claim 1, further comprising providing a system that automatically semantically classifies spoken language using an automatic speech recognition language model that incorporates one or more of the revised parameters.
  - 14. The method of claim 1, further comprising providing a system that automatically semantically classifies spoken language using a semantic classification model that incorporates one or more of the revised parameters.
  - 15. The method of claim 1, wherein the joint association scores of the word sequences with the target semantic class are modeled using a maximum entropy model.
  - 16. The method of claim 1, wherein updating one of the parameter values comprises using an update equation that is formed by using minimum classification error training.
  - 17. The method of claim 1, wherein updating one of the parameter values comprises using an update equation that is formed by using a steepest descent optimization technique.
  - 18. The method of claim 1, wherein the acoustic signal comprises a spoken utterance.
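
One way to picture the training step of claims 1 to 3 and 5 is the rough sketch below. It is an illustration under stated assumptions, not the patented implementation: the score is assumed to be a fixed acoustic log-score plus class-specific n-gram weights, and `ngram_features`, `joint_score`, `training_step`, the class names, and all numbers are hypothetical.

```python
from collections import Counter, defaultdict

def ngram_features(words, n=2):
    """Count the unigrams through n-grams of a word sequence (lexical features)."""
    feats = Counter()
    for k in range(1, n + 1):
        for i in range(len(words) - k + 1):
            feats[" ".join(words[i:i + k])] += 1
    return feats

def joint_score(weights, cls, words, acoustic_logp):
    """Assumed score: acoustic log-score plus class-specific n-gram weights."""
    feats = ngram_features(words)
    return acoustic_logp + sum(weights[(cls, f)] * c for f, c in feats.items())

def training_step(weights, hyps, target_cls, classes, step=0.1):
    # Target: the hypothesis scoring highest with the target class.
    target = max(hyps, key=lambda h: joint_score(weights, target_cls, *h))
    # Competitor: the best remaining hypothesis paired with any other class.
    rivals = [(h, c) for h in hyps if h is not target
              for c in classes if c != target_cls]
    competitor, _ = max(rivals, key=lambda hc: joint_score(weights, hc[1], *hc[0]))
    t_feats, c_feats = ngram_features(target[0]), ngram_features(competitor[0])
    for f in t_feats.keys() - c_feats.keys():   # features only in the target
        weights[(target_cls, f)] += step
    for f in c_feats.keys() - t_feats.keys():   # features only in the competitor
        weights[(target_cls, f)] -= step
    return target, competitor

# Hypothetical n-best list: (word sequence, acoustic log-score)
hyps = [(["book", "a", "flight"], -10.0),
        (["look", "a", "flight"], -9.5),
        (["book", "a", "light"], -10.5)]
weights = defaultdict(float)
weights[("BookFlight", "book")] = 1.0

def gap(t, c):
    return joint_score(weights, "BookFlight", *t) - joint_score(weights, "BookFlight", *c)

target, competitor = training_step(weights, hyps, "BookFlight", ["BookFlight", "Other"])
gap_after = gap(target, competitor)
```

Only the weights tied to features that distinguish the two word sequences are touched, so shared n-grams such as "a flight" leave the margin unchanged.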

19. A medium comprising instructions that are readable and executable by a computing system, with a processor, wherein the instructions cause the computing system to train an integrated automatic speech recognition and semantic classification system that recognizes and semantically classifies acoustic signals, comprising causing the computing system to:
- use a speech lattice to determine, with the processor, language model parameters for matching a set of acoustic signals to a set of word sequences;
  
  use a maximum entropy model to determine, with the processor, semantic classification model parameters for matching the set of word sequences to a set of semantic classes;
  
  evaluate a classification decision rule, with the processor, that matches acoustic signals to corresponding semantic classes, incorporating the language model parameters and the semantic classification model parameters;
  
  determine, with the processor, a total classification loss function indicative of a rate of error in matching the acoustic signals to word sequences to semantic classes;
  
  weight the language model parameters in the classification decision rule to account for the total classification loss function;
  
  iteratively update, with the processor, at least one of the language model parameters and the semantic classification model parameters to reduce the total classification loss function that incorporates errors in semantically classifying acoustic signals due to the language model and the semantic classification model; and
  
  provide the classification decision rule incorporating the at least one of the iteratively updated language model parameters and the semantic classification model parameters, for the integrated automatic speech recognition and semantic classification system.
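
The loss-driven update loop of claim 19 can be sketched as follows. This is a simplified stand-in that assumes a linear score over fixed feature vectors, a sigmoid-smoothed error per utterance (in the spirit of minimum classification error training), and plain steepest descent; it omits the speech lattice and maximum entropy model that the claim recites, and every name and number in it is hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(w, f):
    return sum(wi * fi for wi, fi in zip(w, f))

def total_loss(w, pairs):
    """Sum of smoothed errors; each pair is (target_feats, competitor_feats)."""
    return sum(sigmoid(dot(w, c) - dot(w, t)) for t, c in pairs)

def descent_step(w, pairs, lr=0.5):
    """One steepest-descent update of the weights on the total loss."""
    grad = [0.0] * len(w)
    for t, c in pairs:
        l = sigmoid(dot(w, c) - dot(w, t))
        for i in range(len(w)):
            # d/dw_i of sigmoid(w.(c - t)) = l * (1 - l) * (c_i - t_i)
            grad[i] += l * (1.0 - l) * (c[i] - t[i])
    return [wi - lr * gi for wi, gi in zip(w, grad)]

# Hypothetical feature vectors for two training utterances
pairs = [([1.0, 0.0, 1.0], [0.0, 1.0, 1.0]),
         ([1.0, 1.0, 0.0], [0.0, 1.0, 1.0])]
w = [0.0, 0.0, 0.0]
losses = [total_loss(w, pairs)]
for _ in range(20):
    w = descent_step(w, pairs)
    losses.append(total_loss(w, pairs))
```

The sigmoid turns each hard classification error into a differentiable quantity, which is what allows the loss to be reduced by gradient steps on both kinds of parameters at once.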

20. A medium comprising instructions that are readable and executable by a computing system with a processor, wherein the instructions cause the computing system to perform an integrated process of automatic speech recognition and automatic speech semantic classification on speech inputs, comprising causing the computing system to:
- receive a speech input;
  
  match, with the processor, the speech input to a semantic class and to a word sequence corresponding to the semantic class by applying a semantic classification rule to the speech input, wherein the semantic classification rule matches the speech input to the semantic class and the word sequence that has the highest joint association score between the semantic class and the word sequence for the speech input; and
  
  provide, with the processor, the semantic classes matched to the speech input to an application that provides user output that is dependent on the semantic classes matched to the speech input;
  
  wherein the joint association score comprises parameters of a language model and parameters of a semantic classification model that have been iteratively trained to reduce a total loss function indicative of a rate of error in semantically classifying speech inputs due to errors in both the language model and the semantic classification model.
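
The semantic classification rule of claim 20 amounts to a joint search over semantic classes and word sequences. The sketch below assumes the trained models are available as lookup tables of log-scores; `classify`, both tables, and their values are hypothetical.

```python
import math

def classify(recognizer_logp, classifier_logp):
    """Return the (class, word sequence) pair with the highest joint score.

    recognizer_logp: word sequence -> log P(signal | words) + log P(words)
    classifier_logp: word sequence -> {class: log P(class | words)}
    """
    return max(
        ((cls, words) for words in recognizer_logp for cls in classifier_logp[words]),
        key=lambda cw: recognizer_logp[cw[1]] + classifier_logp[cw[1]][cw[0]],
    )

# Hypothetical scores for one speech input
recognizer_logp = {"book a flight": -14.0, "look a flight": -13.8}
classifier_logp = {
    "book a flight": {"BookFlight": math.log(0.9), "Other": math.log(0.1)},
    "look a flight": {"BookFlight": math.log(0.4), "Other": math.log(0.6)},
}
semantic_class, word_sequence = classify(recognizer_logp, classifier_logp)
```

With these numbers, a pipeline that first committed to the recognizer's top hypothesis ("look a flight") would output the class Other, whereas the joint rule selects BookFlight together with "book a flight".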


Current Assignee
Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee
Microsoft Corporation
Inventors
Yaman, Sibel; Deng, Li; Yu, Dong; Acero, Alejandro; Wang, Ye-Yi
Primary Examiner(s)
Chawan; Vijay B

Application Number

US11/655,703
Publication Number

US 20080177547A1
Time in Patent Office

1,432 Days
Field of Search

704/9, 704/257, 704/260, 704/255, 704/256, 704/4, 704/2, 704/232, 704/251, 704/258, 704/240, 704/231
US Class Current

704/9
CPC Class Codes

G10L 15/1815 Semantic context, e.g. disa...
