Integrated speech recognition and semantic classification
First Claim
1. A computer-implemented method performed by a computer with a processor, of training model parameters, the method comprising:
- identifying a target word sequence from among a set of word sequences, such that the target word sequence has a highest joint association score with a target semantic class, wherein the joint association score is indicative of a correspondence of a semantic class and a word sequence for an acoustic signal, wherein the joint association score incorporates one or more parameters that are applied to one or more features of the word sequence for signal-to-class modeling of the acoustic signal, the one or more parameters including parameters applied to one or more features to match the acoustic signal to the word sequence and parameters applied to one or more features of the word sequence to match the word sequence to a semantic class;
- identifying, with the processor, a competitor word sequence from among the set of word sequences other than the target word sequence, such that the competitor word sequence has a highest remaining joint association score with any available semantic class other than the target semantic class; and
- revising, with the processor, one or more of the parameters to raise the joint association score of the target word sequence with the target semantic class relative to the joint association score of the competitor word sequence with the target semantic class.
Abstract
A novel system integrates speech recognition and semantic classification, so that acoustic scores in a speech recognizer that accepts spoken utterances may be taken into account when training both language models and semantic classification models. For example, a joint association score may be defined that is indicative of a correspondence of a semantic class and a word sequence for an acoustic signal. The joint association score may incorporate parameters such as weighting parameters for signal-to-class modeling of the acoustic signal, language model parameters and scores, and acoustic model parameters and scores. The parameters may be revised to raise the joint association score of a target word sequence with a target semantic class relative to the joint association score of a competitor word sequence with the target semantic class. The parameters may be designed so that the semantic classification errors in the training data are minimized.
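As a rough illustration of the joint association score described in the abstract, the sketch below combines an acoustic score, a language model score, and a semantic class score as a weighted sum of log scores. The weighted-sum form, the function name `joint_association_score`, and the parameters `acoustic_weight` and `lm_weight` are assumptions made for this example; the abstract says only that the score may incorporate such parameters and scores, not how they are combined.

```python
def joint_association_score(acoustic_logprob: float,
                            lm_logprob: float,
                            class_logprob: float,
                            acoustic_weight: float = 1.0,
                            lm_weight: float = 1.0) -> float:
    """Hypothetical joint score for one (semantic class, word sequence) pair.

    acoustic_logprob: log score of the acoustic signal given the word sequence
    lm_logprob:       log score of the word sequence under the language model
    class_logprob:    log score of the semantic class given the word sequence
    The two weights are illustrative tunable parameters (an assumption).
    """
    return (acoustic_weight * acoustic_logprob
            + lm_weight * lm_logprob
            + class_logprob)


# Example: one hypothesis scored with default and with reduced LM weight.
default_score = joint_association_score(-10.0, -2.0, -0.5)
downweighted_lm = joint_association_score(-10.0, -2.0, -0.5, lm_weight=0.5)
```

Under this assumed form, training would adjust the weights and the underlying model parameters rather than the raw scores themselves.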
20 Claims
1. A computer-implemented method performed by a computer with a processor, of training model parameters, the method comprising:
- identifying a target word sequence from among a set of word sequences, such that the target word sequence has a highest joint association score with a target semantic class, wherein the joint association score is indicative of a correspondence of a semantic class and a word sequence for an acoustic signal, wherein the joint association score incorporates one or more parameters that are applied to one or more features of the word sequence for signal-to-class modeling of the acoustic signal, the one or more parameters including parameters applied to one or more features to match the acoustic signal to the word sequence and parameters applied to one or more features of the word sequence to match the word sequence to a semantic class;
- identifying, with the processor, a competitor word sequence from among the set of word sequences other than the target word sequence, such that the competitor word sequence has a highest remaining joint association score with any available semantic class other than the target semantic class; and
- revising, with the processor, one or more of the parameters to raise the joint association score of the target word sequence with the target semantic class relative to the joint association score of the competitor word sequence with the target semantic class.
Dependent claims: 2 through 18.
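The three steps of claim 1 can be sketched as follows. This is a minimal illustration that assumes the joint association score is linear in a set of named features; the perceptron-style update in `revise` is one simple way to raise the target score relative to the competitor score and is not taken from the claim, which does not specify an update rule. All names (`score`, `pick_target_and_competitor`, `revise`, `lr`) are hypothetical.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]
FeatureTable = Dict[Tuple[str, str], Features]  # keyed by (word sequence, class)


def score(theta: Dict[str, float], feats: Features) -> float:
    """Assumed linear joint association score: sum of parameter * feature."""
    return sum(theta.get(name, 0.0) * value for name, value in feats.items())


def pick_target_and_competitor(theta: Dict[str, float], table: FeatureTable,
                               word_seqs: List[str], classes: List[str],
                               target_class: str) -> Tuple[str, str]:
    # Target: the word sequence scoring highest with the target class.
    target = max(word_seqs, key=lambda w: score(theta, table[(w, target_class)]))
    # Competitor: among the remaining word sequences, the one with the highest
    # score against any class other than the target class.
    others = [c for c in classes if c != target_class]
    remaining = [w for w in word_seqs if w != target]
    competitor = max(
        remaining,
        key=lambda w: max(score(theta, table[(w, c)]) for c in others))
    return target, competitor


def revise(theta: Dict[str, float], table: FeatureTable, target: str,
           competitor: str, target_class: str, lr: float = 0.1) -> Dict[str, float]:
    """Move parameters toward the target's features and away from the
    competitor's features, both taken with the target class (assumed rule)."""
    new_theta = dict(theta)
    for name, value in table[(target, target_class)].items():
        new_theta[name] = new_theta.get(name, 0.0) + lr * value
    for name, value in table[(competitor, target_class)].items():
        new_theta[name] = new_theta.get(name, 0.0) - lr * value
    return new_theta
```

With a linear score, this update changes the gap between the two scores by `lr` times the squared norm of the feature difference, so the gap cannot shrink.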
19. A medium comprising instructions that are readable and executable by a computing system, with a processor, wherein the instructions cause the computing system to train an integrated automatic speech recognition and semantic classification system that recognizes and semantically classifies acoustic signals, comprising causing the computing system to:
- use a speech lattice to determine, with the processor, language model parameters for matching a set of acoustic signals to a set of word sequences;
- use a maximum entropy model to determine, with the processor, semantic classification model parameters for matching the set of word sequences to a set of semantic classes;
- evaluate a classification decision rule, with the processor, that matches acoustic signals to corresponding semantic classes, incorporating the language model parameters and the semantic classification model parameters;
- determine, with the processor, a total classification loss function indicative of a rate of error in matching the acoustic signals to word sequences to semantic classes;
- weight the language model parameters in the classification decision rule to account for the total classification loss function;
- iteratively update, with the processor, at least one of the language model parameters and the semantic classification model parameters to reduce the total classification loss function that incorporates errors in semantically classifying acoustic signals due to the language model and the semantic classification model; and
- provide the classification decision rule incorporating the at least one of the iteratively updated language model parameters and the semantic classification model parameters, for the integrated automatic speech recognition and semantic classification system.
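The loss and update steps of claim 19 can be illustrated with a small sketch. It assumes a sigmoid-smoothed error count, in the style of minimum classification error training, as the total classification loss, and plain batch gradient descent as the iterative update; the claim names neither. The speech lattice and maximum entropy model are not modeled here: each hypothesis is reduced to a class label and a feature vector, and every name below is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Hypothesis = Tuple[str, Sequence[float]]     # (class label, feature vector)
Utterance = Tuple[List[Hypothesis], str]     # (hypotheses, correct class)


def dot(theta: Sequence[float], feats: Sequence[float]) -> float:
    return sum(t * x for t, x in zip(theta, feats))


def loss_and_grad(theta, hyps, correct, gamma=1.0):
    """Smoothed 0/1 loss for one utterance and its gradient w.r.t. theta."""
    best_ok = max((h for h in hyps if h[0] == correct), key=lambda h: dot(theta, h[1]))
    best_bad = max((h for h in hyps if h[0] != correct), key=lambda h: dot(theta, h[1]))
    # Misclassification measure: positive when the wrong class outscores the right one.
    d = dot(theta, best_bad[1]) - dot(theta, best_ok[1])
    loss = 1.0 / (1.0 + math.exp(-gamma * d))
    scale = gamma * loss * (1.0 - loss)
    grad = [scale * (b - o) for b, o in zip(best_bad[1], best_ok[1])]
    return loss, grad


def total_loss(theta, data: List[Utterance], gamma=1.0) -> float:
    return sum(loss_and_grad(theta, hyps, y, gamma)[0] for hyps, y in data)


def train(theta, data: List[Utterance], lr=0.5, iterations=20, gamma=1.0):
    """Batch gradient descent on the total smoothed loss (assumed optimizer)."""
    theta = list(theta)
    for _ in range(iterations):
        grad_sum = [0.0] * len(theta)
        for hyps, y in data:
            _, g = loss_and_grad(theta, hyps, y, gamma)
            grad_sum = [a + b for a, b in zip(grad_sum, g)]
        theta = [t - lr * g for t, g in zip(theta, grad_sum)]
    return theta
```

In this toy setup the first feature could stand for a combined recognizer score and the remaining features for class-specific evidence, so a single parameter vector covers both the recognition and classification sides of the decision rule.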
20. A medium comprising instructions that are readable and executable by a computing system with a processor, wherein the instructions cause the computing system to perform an integrated process of automatic speech recognition and automatic speech semantic classification on speech inputs, comprising causing the computing system to:
- receive a speech input;
- match, with the processor, the speech input to a semantic class and to a word sequence corresponding to the semantic class by applying a semantic classification rule to the speech input, wherein the semantic classification rule matches the speech input to the semantic class and the word sequence that has the highest joint association score between the semantic class and the word sequence for the speech input; and
- provide, with the processor, the semantic classes matched to the speech input to an application that provides user output that is dependent on the semantic classes matched to the speech input;
wherein the joint association score comprises parameters of a language model and parameters of a semantic classification model that have been iteratively trained to reduce a total loss function indicative of a rate of error in semantically classifying speech inputs due to errors in both the language model and the semantic classification model.
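The semantic classification rule in claim 20 selects the semantic class and word sequence with the highest joint association score. A minimal sketch, assuming the recognizer supplies an N-best list with log scores and the classifier supplies per-class log probabilities for each word sequence, might look like this. The additive score and the names `classify`, `nbest`, `class_logprobs`, and `asr_weight` are assumptions for illustration only.

```python
from typing import Dict, List, Optional, Tuple


def classify(nbest: List[Tuple[str, float]],
             class_logprobs: Dict[str, Dict[str, float]],
             asr_weight: float = 1.0) -> Optional[Tuple[str, str]]:
    """Return the (semantic class, word sequence) pair with the highest
    assumed joint score, or None if there are no hypotheses.

    nbest:          (word sequence, recognizer log score) pairs
    class_logprobs: word sequence -> {semantic class: log probability}
    asr_weight:     hypothetical weight on the recognizer score
    """
    best_pair: Optional[Tuple[str, str]] = None
    best_score = float("-inf")
    for words, asr_logscore in nbest:
        for label, logp in class_logprobs[words].items():
            joint = asr_weight * asr_logscore + logp
            if joint > best_score:
                best_pair, best_score = (label, words), joint
    return best_pair


nbest = [("book a flight", -1.0), ("look at light", -0.8)]
class_logprobs = {
    "book a flight": {"travel": -0.1, "weather": -2.5},
    "look at light": {"travel": -1.5, "weather": -0.9},
}
result = classify(nbest, class_logprobs)
```

In this example the recognizer's top hypothesis is "look at light", but the joint rule selects "book a flight" with class "travel" because the classifier evidence outweighs the small recognizer score difference. Raising `asr_weight` shifts the decision back toward the recognizer's ranking.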
Specification