Discriminative training of language models for text and speech classification

  • US 7,379,867 B2
  • Filed: 06/03/2003
  • Issued: 05/27/2008
  • Est. Priority Date: 06/03/2003
  • Status: Expired due to Fees
First Claim

1. A computer implemented method of classifying a natural language input, comprising:

  • training a plurality of statistical classification components jointly in relation to one another to maximize a conditional likelihood of a class given a word string using an application of the rational growth transform, the plurality of statistical classification components being n-gram language model classifiers that each correspond to a different class, wherein each class corresponds to a different category of subject matter, and wherein training the plurality of statistical classification components comprises;

    identifying an optimal number of rational function growth transform iterations and an optimal conditional maximum likelihood (CML) weight β_max to facilitate application of the rational function growth transform, wherein identifying comprises;

    splitting a collection of training data into a collection of main data and a collection of held-out data;

    using the main data to estimate a series of relative frequencies for the statistical classification components; and

    using the held-out data to tune the optimal number of rational function growth transform iterations and the optimal CML weight β_max;

    receiving a natural language input;

    applying the plurality of statistical classification components to the natural language input so as to classify the natural language input into a particular one or more of the plurality of classes that represent the category or categories of subject matter that is best correlated to the natural language input;

    wherein using the held-out data to tune comprises;

    fixing a preset number N of rational function growth transform iterations to be run;

    fixing a range of values to be explored for determining the optimal CML weight β_max;

    for each value β_max, running as many rational function growth transform functions as possible up to N such that the conditional likelihood of the main data increases at each iteration; and

    identifying as optimal the number of rational function growth transform iterations and the β_max value that maximizes the conditional likelihood of the held-out data; and

    wherein training the plurality of statistical classification components further comprises;

    pooling the main and held-out data to form a combined collection of training data; and

    training the plurality of statistical classification components on the combined collection of training data using the optimal number of rational function growth transform iterations and the optimal CML weight β_max.
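
The tuning procedure recited in the claim (split the data, search over β_max values and iteration counts, then retrain on the pooled data) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the function names `train`, `tune`, and `run_iterations` are hypothetical, and the model-specific parts are passed in as hypothetical callbacks. `init(data)` stands in for estimating relative frequencies, `step(model, data, beta)` for one rational function growth transform iteration, and `cond_loglik(model, data)` for the conditional log-likelihood of the classes given the word strings. Taking the held-out data from the tail of the collection, and also considering zero iterations as a candidate, are further assumptions.

```python
from typing import Any, Callable, List, Sequence, Tuple

Model = Any  # hypothetical: whatever object holds the n-gram classifier parameters


def run_iterations(model: Model, data: Sequence, beta: float, max_iters: int,
                   step: Callable, cond_loglik: Callable) -> List[Model]:
    """Run up to max_iters updates, stopping as soon as the conditional
    log-likelihood of `data` fails to increase. Returns the model after
    each accepted iteration."""
    accepted = []
    current = cond_loglik(model, data)
    for _ in range(max_iters):
        candidate = step(model, data, beta)
        score = cond_loglik(candidate, data)
        if score <= current:  # likelihood of the main data must increase
            break
        model, current = candidate, score
        accepted.append(model)
    return accepted


def tune(main: Sequence, held_out: Sequence, init: Callable, step: Callable,
         cond_loglik: Callable, betas: Sequence[float],
         max_iters: int) -> Tuple[int, float]:
    """Pick the (iteration count, beta) pair that maximizes the conditional
    log-likelihood of the held-out data."""
    base = init(main)
    # Assumption: zero iterations (the initial estimate) is a valid candidate.
    best_score, best_iters, best_beta = cond_loglik(base, held_out), 0, betas[0]
    for beta in betas:
        models = run_iterations(base, main, beta, max_iters, step, cond_loglik)
        for k, model in enumerate(models, start=1):
            score = cond_loglik(model, held_out)
            if score > best_score:  # ties keep the earlier candidate
                best_score, best_iters, best_beta = score, k, beta
    return best_iters, best_beta


def train(data: Sequence, n_held_out: int, init: Callable, step: Callable,
          cond_loglik: Callable, betas: Sequence[float],
          max_iters: int) -> Tuple[Model, int, float]:
    """Split, tune on held-out data, then retrain on the pooled data."""
    # Assumption: the last n_held_out items form the held-out set.
    main, held_out = data[:-n_held_out], data[-n_held_out:]
    iters, beta = tune(main, held_out, init, step, cond_loglik, betas, max_iters)
    # Pooled main + held-out data is the original collection again.
    model = init(data)
    for _ in range(iters):
        model = step(model, data, beta)
    return model, iters, beta
```

Passing the update and scoring functions as callbacks keeps the sketch independent of any particular n-gram model or growth transform formula; only the control flow of the tuning and retraining steps is shown.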
