
Language modeling of complete language sequences

  • US 9,786,269 B2
  • Filed: 05/02/2013
  • Issued: 10/10/2017
  • Est. Priority Date: 03/14/2013
  • Status: Active Grant
First Claim

1. A method comprising:

    accessing, by a data processing apparatus, training data indicating queries submitted by one or more users;

    determining, by the data processing apparatus and for at least some of the queries, a count of a number of times the training data indicates the query was submitted;

    selecting, by the data processing apparatus, a proper subset of the queries based on the counts;

    training, by the data processing apparatus, a first component of a language model based on the counts, the first component including first probability data indicating relative frequencies of the selected queries among the training data;

    training, by the data processing apparatus, a second component of the language model based on the training data, the second component including second probability data for assigning scores to queries that are not included in the selected queries;

    determining, by the data processing apparatus, adjustment data that includes one or more weighting values for normalizing the second probability data with respect to the first probability data; and

    storing, by the data processing apparatus, the first component, the second component, and the adjustment data.
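
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the count threshold `min_count`, the add-one smoothed unigram model used as the second component, the `</s>` end marker, and the names `train`, `backoff_prob`, and `score` are all hypothetical choices made for this example. The single weighting value is chosen so that the second component only distributes the probability mass left over by the first component.

```python
from collections import Counter

END = "</s>"  # hypothetical end-of-query marker (an assumption of this sketch)


def backoff_prob(model, query):
    """Second component: add-one smoothed unigram probability of a query."""
    p = 1.0
    for word in query.split() + [END]:
        p *= (model["word_counts"].get(word, 0) + 1) / model["denom"]
    return p


def train(queries, min_count=2):
    """Build both components and the adjustment weight from raw query logs."""
    # Count how many times each query appears in the training data.
    counts = Counter(queries)
    total = sum(counts.values())

    # Select a proper subset of the queries based on the counts.
    # The threshold rule is an assumption; the claim only says "based on the counts".
    selected = {q for q, c in counts.items() if c >= min_count}
    if len(selected) == len(counts):
        raise ValueError("selection must leave out at least one query")

    # First component: relative frequency of each selected query.
    first = {q: counts[q] / total for q in selected}

    # Second component: word statistics over all training queries.
    word_counts = Counter()
    for q, c in counts.items():
        for word in q.split() + [END]:
            word_counts[word] += c
    # Denominator reserves one extra slot for unseen words.
    denom = sum(word_counts.values()) + len(word_counts) + 1

    model = {"first": first, "word_counts": word_counts, "denom": denom}

    # Adjustment data: scale the second component so that the mass it assigns
    # to non-selected queries matches the mass the first component leaves over.
    overlap = sum(backoff_prob(model, q) for q in selected)
    model["weight"] = (1.0 - sum(first.values())) / (1.0 - overlap)
    return model  # the stored first component, second component, and weight


def score(model, query):
    """Use the first component for selected queries, else the weighted second."""
    if query in model["first"]:
        return model["first"][query]
    return model["weight"] * backoff_prob(model, query)


if __name__ == "__main__":
    logs = ["weather today"] * 3 + ["news"] * 2 + ["weather tomorrow"]
    lm = train(logs)
    for q in ("weather today", "weather tomorrow", "news today"):
        print(q, score(lm, q))
```

In this sketch a frequently submitted query is scored as a whole sequence by its relative frequency, while a rare or unseen query falls back to the word-level model and is scaled by the stored weight. A real system would likely use a stronger second component, such as a higher-order n-gram model, which the claim leaves open.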
