LANGUAGE MODELING OF COMPLETE LANGUAGE SEQUENCES

  • US 20140278407A1
  • Filed: 05/02/2013
  • Published: 09/18/2014
  • Est. Priority Date: 03/14/2013
  • Status: Active Grant
First Claim

1. A method performed by data processing apparatus, the method comprising:

    accessing training data indicating queries submitted by one or more users;

    determining, for each of the queries, a count of a number of times the training data indicates the query was submitted;

    selecting a proper subset of the queries based on the counts;

    training a first component of a language model based on the counts, the first component including first probability data indicating relative frequencies of the selected queries among the training data;

    training a second component of the language model based on the training data, the second component including second probability data for assigning scores to queries that are not included in the selected queries;

    determining adjustment data that normalizes the second probability data with respect to the first probability data; and

    storing the first component, the second component, and the adjustment data.
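
The steps of claim 1 can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the patented implementation. The claim does not specify the selection rule, the form of the second component, or how normalization is computed. Here the selection rule is a hypothetical `min_count` threshold, the second component is an add-one smoothed bigram model, and the adjustment is a single scale factor that gives the unselected queries exactly the probability mass left over by the first component. The names `train`, `score`, `bigram_prob`, `BOS`, and `EOS` are invented for this example.

```python
import json
from collections import Counter, defaultdict

BOS, EOS = "<s>", "</s>"  # hypothetical sequence-boundary markers


def bigram_prob(second, query):
    """Add-one smoothed bigram probability of a complete query (end marker included)."""
    vocab = second["vocab"]
    tokens = query.split() + [EOS]
    if any(t not in vocab for t in tokens):
        return 0.0  # sketch-level choice: no mass for out-of-vocabulary words
    prob, prev = 1.0, BOS
    for cur in tokens:
        row = second["bigrams"].get(prev, {})
        prob *= (row.get(cur, 0) + 1) / (sum(row.values()) + len(vocab))
        prev = cur
    return prob


def train(queries, min_count=2):
    # Count how often each query appears in the training data.
    counts = Counter(queries)
    total = sum(counts.values())

    # Select a proper subset based on the counts (assumed rule: a count threshold).
    selected = {q: c for q, c in counts.items() if c >= min_count}
    if len(selected) == len(counts):
        raise ValueError("threshold must leave at least one query unselected")

    # First component: relative frequency of each selected query in the training data.
    first = {q: c / total for q, c in selected.items()}

    # Second component: bigram counts over all training queries (assumed model form).
    bigrams = defaultdict(Counter)
    vocab = {EOS}
    for q, c in counts.items():
        prev = BOS
        for cur in q.split() + [EOS]:
            bigrams[prev][cur] += c
            vocab.add(cur)
            prev = cur
    second = {
        "vocab": sorted(vocab),
        "bigrams": {p: dict(row) for p, row in bigrams.items()},
    }

    # Adjustment data: rescale the second component so that the queries it
    # handles share the mass not already assigned by the first component.
    leftover = 1.0 - sum(first.values())
    covered = sum(bigram_prob(second, q) for q in selected)
    adjustment = leftover / (1.0 - covered)

    return {"first": first, "second": second, "adjustment": adjustment}


def score(model, query):
    """Probability of a complete query under the combined model."""
    if query in model["first"]:
        return model["first"][query]
    return model["adjustment"] * bigram_prob(model["second"], query)


if __name__ == "__main__":
    training = (["weather today"] * 3 + ["weather tomorrow"] * 2
                + ["news today", "cheap flights"])
    model = train(training)
    stored = json.dumps(model)  # storing: all three parts serialize together
    print(score(model, "weather today"), score(model, "news today"))
```

In this sketch the stored model is a plain JSON-serializable dictionary, so the first component, the second component, and the adjustment factor can be written out and reloaded together. A real system would likely use a stronger back-off model and a more compact storage format.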
