Language modeling of complete language sequences
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language modeling of complete language sequences. Training data indicating language sequences is accessed, and counts for a number of times each language sequence occurs in the training data are determined. A proper subset of the language sequences is selected, and a first component of a language model is trained. The first component includes first probability data for assigning scores to the selected language sequences. A second component of the language model is trained based on the training data, where the second component includes second probability data for assigning scores to language sequences that are not included in the selected language sequences. Adjustment data that normalizes the second probability data with respect to the first probability data is generated, and the first component, the second component, and the adjustment data are stored.
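The abstract describes a pipeline: count occurrences of each language sequence, select a proper subset by count, train a first component holding relative frequencies for the selected sequences, train a second component to score everything else, and compute adjustment data that normalizes the two. The patent gives no implementation, but the flow can be sketched as follows; the unigram back-off model, the top-k selection rule, and all names are illustrative assumptions, not the claimed method itself.

```python
from collections import Counter

def train_two_component_model(queries, top_k=2):
    """Sketch of the two-component language model described in the abstract.

    - First component: relative frequencies for a proper subset of
      high-count queries.
    - Second component: a stand-in unigram word model that scores
      queries outside that subset.
    - Adjustment data: a weight rescaling second-component scores so the
      leftover probability mass is shared with unseen queries.
    """
    counts = Counter(queries)
    total = sum(counts.values())

    # Select a proper subset of the queries based on the counts
    # (here, simply the top_k most frequent -- an assumption).
    selected = dict(counts.most_common(top_k))

    # First component: relative frequencies of the selected queries.
    first = {q: c / total for q, c in selected.items()}
    first_mass = sum(first.values())

    # Second component: add-one-smoothed unigram model over query words
    # (a stand-in for whatever model the patent's second component uses).
    word_counts = Counter(w for q in queries for w in q.split())
    word_total = sum(word_counts.values())
    vocab = len(word_counts)

    def second(query):
        p = 1.0
        for w in query.split():
            p *= (word_counts.get(w, 0) + 1) / (word_total + vocab)
        return p

    # Adjustment data: weight the second component by the probability
    # mass the first component leaves uncovered.
    adjustment = 1.0 - first_mass

    def score(query):
        if query in first:
            return first[query]
        return adjustment * second(query)

    return score

# Usage: frequent queries get first-component scores; the rest fall
# through to the weighted second component.
score = train_two_component_model(
    ["weather today", "weather today", "weather today",
     "news", "news", "sports scores"]
)
```

With these toy counts, "weather today" and "news" form the selected subset and are scored by relative frequency, while "sports scores" falls through to the down-weighted unigram model.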
20 Claims
1. A method comprising:
accessing, by a data processing apparatus, training data indicating queries submitted by one or more users;
determining, by the data processing apparatus and for at least some of the queries, a count of a number of times the training data indicates the query was submitted;
selecting, by the data processing apparatus, a proper subset of the queries based on the counts;
training, by the data processing apparatus, a first component of a language model based on the counts, the first component including first probability data indicating relative frequencies of the selected queries among the training data;
training, by the data processing apparatus, a second component of the language model based on the training data, the second component including second probability data for assigning scores to queries that are not included in the selected queries;
determining, by the data processing apparatus, adjustment data that includes one or more weighting values for normalizing the second probability data with respect to the first probability data; and
storing, by the data processing apparatus, the first component, the second component, and the adjustment data.
View Dependent Claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
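The "adjustment data" step above calls for weighting values that normalize the second component's probabilities against the first. The claim does not say how the weights are computed; one standard possibility (an assumption, in the style of Katz back-off normalization) is to choose a weight so that the mass the second component assigns outside the selected subset exactly fills the probability left over by the first component:

```python
def backoff_weight(first_probs_selected, second_probs_selected):
    """Hypothetical computation of one weighting value.

    first_probs_selected: first-component probabilities of the selected
        queries (their total is the mass the first component covers).
    second_probs_selected: the probabilities the second component would
        assign to those same selected queries.

    The returned weight scales second-component scores for non-selected
    queries so the combined model sums (approximately) to one.
    """
    first_mass = sum(first_probs_selected.values())
    second_mass_on_selected = sum(second_probs_selected.values())
    return (1.0 - first_mass) / (1.0 - second_mass_on_selected)
```

For example, if the first component covers 0.8 of the mass and the second component puts 0.2 of its own mass on the selected queries, the weight is 0.2 / 0.8 = 0.25.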
19. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
accessing, by the one or more computers, training data indicating queries submitted by one or more users;
determining, by the one or more computers and for at least some of the queries, a count of a number of times the training data indicates the query was submitted;
selecting, by the one or more computers, a proper subset of the queries based on the counts;
training, by the one or more computers, a first component of a language model based on the counts, the first component including first probability data indicating relative frequencies of the selected queries among the training data;
training, by the one or more computers, a second component of the language model based on the training data, the second component including second probability data for assigning scores to queries that are not included in the selected queries;
determining, by the one or more computers, adjustment data that includes one or more weighting values for normalizing the second probability data with respect to the first probability data; and
storing, by the one or more computers, the first component, the second component, and the adjustment data.
20. A non-transitory computer storage medium storing a computer program, the program comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
accessing, by the one or more computers, training data indicating queries submitted by one or more users;
determining, by the one or more computers and for at least some of the queries, a count of a number of times the training data indicates the query was submitted;
selecting, by the one or more computers, a proper subset of the queries based on the counts;
training, by the one or more computers, a first component of a language model based on the counts, the first component including first probability data indicating relative frequencies of the selected queries among the training data;
training, by the one or more computers, a second component of the language model based on the training data, the second component including second probability data for assigning scores to queries that are not included in the selected queries;
determining, by the one or more computers, adjustment data that includes one or more weighting values for normalizing the second probability data with respect to the first probability data; and
storing, by the one or more computers, the first component, the second component, and the adjustment data.
Specification