DISCRIMINATIVE TRAINING OF LANGUAGE MODELS FOR TEXT AND SPEECH CLASSIFICATION
First Claim
1. A computer-implemented method, comprising:
estimating a set of parameters for a plurality of n-gram language models that each correspond to a different class, wherein each class corresponds to a different category of subject matter, and wherein estimating comprises:
setting initial values for each n-gram language model's sets of parameters; and
adjusting each n-gram language model's sets of parameters jointly in relation to one another to increase a conditional likelihood of a class corresponding to a category of subject matter given a word string; and
utilizing the n-gram language models' sets of parameters as a basis for supporting a determination as to which of the plurality of classes represents the category of subject matter that is best correlated to a given natural language input.
Abstract
Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.
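The objective described above can be illustrated with a small sketch. The corpus, vocabulary, and class names below are hypothetical, and plain gradient ascent on the conditional log-likelihood stands in for the patent's rational function growth transform; this is an illustration of the joint objective, not the patented update rule.

```python
import numpy as np

# Hypothetical two-class toy corpus (all names here are illustrative).
VOCAB = ["ball", "team", "score", "stock", "market", "price"]
W2I = {w: i for i, w in enumerate(VOCAB)}
CLASSES = ["sports", "finance"]
TRAIN = [("sports", "team score ball"), ("sports", "ball team"),
         ("finance", "stock market price"), ("finance", "price stock")]

def counts(sentence):
    c = np.zeros(len(VOCAB))
    for w in sentence.split():
        c[W2I[w]] += 1.0
    return c

# Set initial values: uniform unigram logits for every class LM.
theta = np.zeros((len(CLASSES), len(VOCAB)))

def posterior(c):
    # P(class | word string) from class-conditional unigram scores
    # (uniform class prior, which cancels in the softmax).
    logp = theta - np.log(np.exp(theta).sum(axis=1, keepdims=True))
    s = logp @ c
    e = np.exp(s - s.max())
    return e / e.sum()

# Adjust all class LMs *jointly* so the classifier discriminates the
# correct class from the incorrect ones on each training sentence.
for _ in range(200):
    grad = np.zeros_like(theta)
    for label, sent in TRAIN:
        c = counts(sent)
        n = c.sum()
        p = np.exp(theta)
        p /= p.sum(axis=1, keepdims=True)  # per-class unigram probabilities
        # Gradient of log P(label | sentence) w.r.t. every class's logits.
        err = (np.array(CLASSES) == label).astype(float) - posterior(c)
        grad += err[:, None] * (c[None, :] - n * p)
    theta += 0.5 * grad / len(TRAIN)

def classify(sentence):
    return CLASSES[int(np.argmax(posterior(counts(sentence))))]
```

Because the parameters of all class models appear together in the posterior, raising the correct class's score necessarily lowers the competitors', which is the discriminative coupling the abstract describes.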
29 Citations
17 Claims
11. A computer-implemented classification method that includes a method for estimating a set of parameters for each of a plurality of n-gram language models, each n-gram language model being associated with a different category of subject matter, the computer-implemented classification method comprising:
producing the sets of parameters for at least two of the n-gram language models jointly in relation to one another;
utilizing the sets of parameters of the plurality of n-gram language models that are produced at least partially jointly in relation to one another as a basis for supporting a determination as to which of the categories of subject matter is best correlated to a given natural language input, wherein the category of subject matter that is best correlated to the given natural language input is information other than a textual representation of the given natural language input itself.
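The determination step of claim 11 reduces, in a maximum a posteriori setup, to scoring the word string under each class's language model and selecting the best-scoring class. A minimal sketch, assuming hypothetical pre-estimated unigram log-probabilities and an illustrative unseen-word floor:

```python
import math

# Hypothetical pre-estimated unigram log-probabilities per class
# (in practice these come from the joint estimation step).
LOGPROB = {
    "sports":  {"ball": -0.7, "team": -1.0, "stock": -4.0, "price": -4.0},
    "finance": {"ball": -4.0, "team": -4.0, "stock": -0.8, "price": -1.1},
}

def classify(sentence, log_prior=None):
    # Score the input under each class LM; the best-correlated class wins.
    best, best_score = None, -math.inf
    for cls, lm in LOGPROB.items():
        score = (log_prior or {}).get(cls, 0.0)
        # -10.0 is an illustrative floor for words unseen in a class's model.
        score += sum(lm.get(w, -10.0) for w in sentence.split())
        if score > best_score:
            best, best_score = cls, score
    return best
```

For example, `classify("team ball")` scores -1.7 under the sports model and -8.0 under the finance model, so the sports category is returned; the output is the category label itself, not a textual representation of the input, as the claim requires.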