CUSTOM LANGUAGE MODELS
Abstract
Systems, methods, and apparatuses including computer program products for generating a custom language model. In one implementation, a method is provided. The method includes receiving a collection of documents; clustering the documents into one or more clusters; generating a cluster vector for each cluster of the one or more clusters; generating a target vector associated with a target profile; comparing the target vector with each of the cluster vectors; selecting one or more of the one or more clusters based on the comparison; and generating a language model using documents from the one or more selected clusters.
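The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how documents are clustered, how vectors are built, or what kind of language model is produced. The sketch assumes bag-of-words term counts as vectors, a greedy single-pass clustering, cosine similarity for the comparison, and a unigram model as the output. All function names and both thresholds are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-count vector for one text (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_documents(docs, join_threshold=0.3):
    """Greedy single-pass clustering (a stand-in for any clustering method).

    Each document joins the first cluster whose vector is similar enough,
    otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"docs": [...], "vector": Counter}
    for doc in docs:
        vec = tf_vector(doc)
        for cluster in clusters:
            if cosine(vec, cluster["vector"]) >= join_threshold:
                cluster["docs"].append(doc)
                cluster["vector"] += vec  # cluster vector = summed term counts
                break
        else:
            clusters.append({"docs": [doc], "vector": Counter(vec)})
    return clusters


def select_clusters(clusters, target_vector, select_threshold=0.2):
    """Keep the clusters whose vector is close enough to the target vector."""
    return [
        c for c in clusters
        if cosine(target_vector, c["vector"]) >= select_threshold
    ]


def unigram_model(docs):
    """Relative-frequency unigram model (the simplest possible language model)."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def build_custom_model(docs, profile_text):
    """Cluster, compare against the target profile, select, then train."""
    clusters = cluster_documents(docs)
    target = tf_vector(profile_text)  # target vector from the target profile
    selected = select_clusters(clusters, target)
    return unigram_model([d for c in selected for d in c["docs"]])
```

Given a small mixed collection of sports and finance sentences and a sports-oriented profile text, `build_custom_model` would train only on the sports cluster, so finance vocabulary would not appear in the resulting model.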
19 Claims
1. A method comprising:
receiving a collection of documents;
clustering the documents into one or more clusters;
generating a cluster vector for each cluster of the one or more clusters;
generating a target vector associated with a target profile;
comparing the target vector with each of the cluster vectors;
selecting one or more of the one or more clusters based on the comparison; and
generating a language model using documents from the one or more selected clusters.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method comprising:
receiving a collection of documents;
clustering the documents into one or more generic clusters;
generating a cluster vector for each cluster of the one or more generic clusters;
generating a target vector associated with a target profile;
comparing the target vector with each of the cluster vectors; and
selecting one or more of the one or more generic clusters based on the comparison.
Dependent claims: 11
12. A method comprising:
receiving a user input identifying the user;
identifying a user profile corresponding to the user;
using the identified profile to generate a user specific language model; and
sending the user specific language model to a first client.
Dependent claims: 13, 14, 15
16. A method comprising:
receiving a first collection of one or more documents;
generating a profile based on the first collection of one or more documents;
receiving a second collection of one or more documents;
generating a custom language model based on the second collection of one or more documents and the profile; and
sending the custom language model to a client.
Dependent claims: 17
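
The two-collection flow of claim 16 can be sketched in the same spirit. This is an illustration under stated assumptions, not the claimed implementation: the profile is assumed to be the summed term counts of the first collection, the custom model is assumed to be a unigram model trained on those second-collection documents that resemble the profile, and "sending" is reduced to serializing the model as JSON. Function names and the threshold are hypothetical.

```python
import json
import math
from collections import Counter


def term_counts(docs):
    """Summed bag-of-words counts over a list of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def generate_profile(first_collection):
    """Profile built from the first collection (assumed: summed term counts)."""
    return term_counts(first_collection)


def generate_custom_model(second_collection, profile, threshold=0.2):
    """Unigram model over second-collection documents that match the profile."""
    relevant = [
        doc for doc in second_collection
        if cosine(term_counts([doc]), profile) >= threshold
    ]
    counts = term_counts(relevant)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def serialize_for_client(model):
    """Stand-in for 'sending': encode the model as a JSON payload."""
    return json.dumps(model, sort_keys=True)
```

A caller would pass documents associated with the profile as the first collection, a broader corpus as the second, and hand the returned JSON string to whatever transport reaches the client.
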
18. A computer program product, encoded on a tangible program carrier, operable to cause data processing apparatus to perform operations comprising:
receiving a collection of documents;
clustering the documents into one or more clusters;
generating a cluster vector for each cluster of the one or more clusters;
generating a target vector associated with a target profile;
comparing the target vector with each of the cluster vectors;
selecting one or more of the one or more clusters based on the comparison; and
generating a language model using documents from the one or more selected clusters.
19. A system, comprising:
a machine-readable storage device including a program product; and
one or more computers operable to execute the program product and perform operations comprising:
receiving a collection of documents;
clustering the documents into one or more clusters;
generating a cluster vector for each cluster of the one or more clusters;
generating a target vector associated with a target profile;
comparing the target vector with each of the cluster vectors;
selecting one or more of the one or more clusters based on the comparison; and
generating a language model using documents from the one or more selected clusters.
Specification