Adapting a language model to accommodate inputs not found in a directory assistance listing
First Claim
1. A method of generating a language model from a corpus of directory assistance listings, the language model used to estimate a desired directory assistance listing based on a listing request input, the method comprising:
- for each word in each listing in the corpus, generating, using a processing unit, language model counts attributable to the word by:
  - determining, based on one or more characteristics of the word in the listing, whether the word is likely to be omitted from the listing request input when the desired directory assistance listing is a listing that contains the word;
  - generating the language model counts attributable to the word based on whether the word is likely to be omitted from the listing request input when the desired directory assistance listing is a listing that contains the word; and
  - generating the language model counts attributable to the word based on a characteristic indicative of how well the word operates to distinguish the listing that contains the word from other listings in the corpus; and
- storing the language model with the language model counts.
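The per-word count generation recited in the claim can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the `GENERIC_WORDS` set, the position-based omission heuristic in `omission_probability`, the smoothed inverse-document-frequency score in `distinctiveness`, and the choice to multiply the two factors into a single fractional count.

```python
import math
from collections import Counter, defaultdict

# Hypothetical list of generic words that callers often drop (illustrative only).
GENERIC_WORDS = {"inc", "co", "corp", "llc", "the", "restaurant", "store"}


def omission_probability(word, position, listing_length):
    """Assumed heuristic: generic words and trailing words are more likely to be omitted."""
    p = 0.1
    if word in GENERIC_WORDS:
        p += 0.5
    if listing_length > 1 and position == listing_length - 1:
        p += 0.2
    return min(p, 0.9)


def distinctiveness(word, doc_freq, num_listings):
    """Smoothed inverse document frequency: words found in fewer listings score higher."""
    return math.log((1 + num_listings) / (1 + doc_freq[word])) + 1.0


def generate_counts(corpus):
    """Accumulate a fractional count for each word over every listing in the corpus."""
    listings = [listing.lower().split() for listing in corpus]
    # Number of listings that contain each word at least once.
    doc_freq = Counter(w for words in listings for w in set(words))
    counts = defaultdict(float)
    for words in listings:
        for position, word in enumerate(words):
            keep = 1.0 - omission_probability(word, position, len(words))
            counts[word] += keep * distinctiveness(word, doc_freq, len(listings))
    return dict(counts)


corpus = ["Kung Ho Restaurant", "Joe's Pizza Restaurant", "Kung Fu Academy"]
counts = generate_counts(corpus)
```

Under these assumptions, a rare mid-listing word such as "ho" accumulates more count mass than the frequent, generic, trailing word "restaurant", which is the qualitative behavior the claim describes.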
Abstract
A statistical language model is trained for use in a directory assistance system using the data in a directory assistance listing corpus. Calculations are made to determine how important words in the corpus are in distinguishing a listing from other listings, and how likely words are to be omitted or added by a user. The language model is trained using these calculations.
40 Citations
18 Claims
1. A method of generating a language model from a corpus of directory assistance listings, the language model used to estimate a desired directory assistance listing based on a listing request input, the method comprising:
- for each word in each listing in the corpus, generating, using a processing unit, language model counts attributable to the word by:
  - determining, based on one or more characteristics of the word in the listing, whether the word is likely to be omitted from the listing request input when the desired directory assistance listing is a listing that contains the word;
  - generating the language model counts attributable to the word based on whether the word is likely to be omitted from the listing request input when the desired directory assistance listing is a listing that contains the word; and
  - generating the language model counts attributable to the word based on a characteristic indicative of how well the word operates to distinguish the listing that contains the word from other listings in the corpus; and
- storing the language model with the language model counts.

Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system for generating a language model used to estimate a desired directory assistance listing based on a listing request input, the system comprising:
- a processor;
- a corpus of directory assistance listings; and
- a language model generation component configured to:
  - access the corpus of directory assistance listings;
  - for each word in each listing in the corpus, determine, based on one or more characteristics of the word in the listing, whether the word is likely to be omitted from the listing request input when the desired directory assistance listing is a listing that contains the word;
  - generate language model counts attributable to the word using the processor, wherein the language model counts are generated based on whether the word is likely to be omitted from the listing request input when the desired directory assistance listing is a listing that contains the word, and based on a characteristic indicative of how well the word operates to distinguish the listing that contains the word from other listings in the corpus; and
  - store the language model with the language model counts.

Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
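As a rough illustration of how the system claim's generation component might be organized, the sketch below wraps the access, generate, and store steps in one class. The class name `LanguageModelGenerator`, the pluggable `omission_fn` and `distinctiveness_fn` callables, and the JSON file format are all hypothetical choices made for this example; the claim does not specify any of them.

```python
import json
import math
from collections import Counter


class LanguageModelGenerator:
    """Hypothetical component: accesses a corpus, generates weighted counts, stores them."""

    def __init__(self, corpus, omission_fn, distinctiveness_fn):
        # "Access the corpus": tokenize each listing into lowercase words.
        self.listings = [listing.lower().split() for listing in corpus]
        # omission_fn(word, position, words) -> assumed probability the caller omits the word.
        self.omission_fn = omission_fn
        # distinctiveness_fn(doc_freq, num_listings) -> assumed distinguishing score.
        self.distinctiveness_fn = distinctiveness_fn

    def generate(self):
        """Return a dict mapping each word to its accumulated fractional count."""
        doc_freq = Counter(w for words in self.listings for w in set(words))
        num_listings = len(self.listings)
        counts = {}
        for words in self.listings:
            for position, word in enumerate(words):
                keep = 1.0 - self.omission_fn(word, position, words)
                score = self.distinctiveness_fn(doc_freq[word], num_listings)
                counts[word] = counts.get(word, 0.0) + keep * score
        return counts

    @staticmethod
    def store(counts, path):
        """Persist the counts as JSON (an assumed storage format)."""
        with open(path, "w", encoding="utf-8") as f:
            json.dump(counts, f, indent=2, sort_keys=True)


# Example wiring with simple stand-in estimators (both are assumptions).
generator = LanguageModelGenerator(
    ["Acme Hardware Store", "Acme Bakery"],
    omission_fn=lambda word, position, words: 0.5 if position == len(words) - 1 else 0.0,
    distinctiveness_fn=lambda df, n: math.log((1 + n) / (1 + df)) + 1.0,
)
model_counts = generator.generate()
```

Passing the two estimators in as callables keeps the component independent of any particular omission or distinctiveness model, so either factor can be swapped without touching the counting loop.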
Specification