Phrase-based personalization of searches in an information retrieval system
Abstract
An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are then indexed according to their included phrases. Related phrases and phrase extensions are also identified. Phrases in a query are identified and used to retrieve and rank documents. Phrases are also used to cluster documents in the search results, create document descriptions, and eliminate duplicate documents from the search results and from the index.
28 Claims
1. A method of personalizing a search of a document collection to a user, the method comprising:
monitoring a plurality of documents accessed by a user;
identifying a plurality of first phrases present in one or more of the accessed documents;
for each of the identified first phrases, identifying one or more corresponding first related phrases, wherein the one or more first related phrases are related to the corresponding identified first phrase;
storing a user model associated with the user, and comprising a plurality of the first related phrases;
receiving a query from the user, the query including one or more second phrases;
selecting search results comprising a plurality of documents responsive to the query;
identifying, by operation of a processor configured to manipulate data within a computer system, one or more second related phrases that are related to the second phrase(s) of the query and that are present in the user model;
weighting a plurality of scores of a corresponding plurality of the search results according to the identified one or more second related phrases;
ranking the plurality of the search results for presentation to the user according to their weighted scores, to provide personalized search results; and
presenting the personalized search results to the user.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
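The steps of claim 1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the RELATED table, the naive substring phrase matching, the boost factor and every function name are hypothetical, since the claim does not specify how related phrases are determined or how the weighting is computed.

```python
# Hypothetical table: phrase -> phrases related to it (assumed to exist already).
RELATED = {
    "australian shepherd": {"blue merle", "agility training"},
    "border collie": {"sheep herding", "agility training"},
}

def find_phrases(text):
    """Return the known phrases occurring in the text (naive substring match)."""
    lowered = text.lower()
    return {p for p in RELATED if p in lowered}

def build_user_model(accessed_docs):
    """Collect the related phrases of every phrase found in accessed documents."""
    model = set()
    for doc in accessed_docs:
        for phrase in find_phrases(doc):
            model |= RELATED[phrase]
    return model

def personalize(query, results, user_model, boost=0.5):
    """Re-rank (doc_text, score) pairs using related phrases found in the user model."""
    related_in_model = set()
    for phrase in find_phrases(query):
        related_in_model |= RELATED[phrase] & user_model
    weighted = []
    for doc, score in results:
        lowered = doc.lower()
        hits = sum(1 for r in related_in_model if r in lowered)
        # Assumed weighting: multiply the base score by (1 + boost * hits).
        weighted.append((doc, score * (1 + boost * hits)))
    return sorted(weighted, key=lambda pair: pair[1], reverse=True)

model = build_user_model(["Notes on my Australian Shepherd puppy"])
results = [
    ("Australian Shepherd breed history", 1.0),
    ("Australian Shepherd agility training tips", 0.9),
]
ranked = personalize("australian shepherd", results, model)
```

In this sketch the second result overtakes the first because it contains a related phrase that is also in the user model; with an empty user model the original order is unchanged.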
11. A method of personalizing a search of a document collection to a user, the method comprising:
monitoring a plurality of documents accessed by a user;
identifying a plurality of first phrases present in one or more of the accessed documents;
for each of the identified first phrases, identifying one or more corresponding first related phrases, wherein the one or more first related phrases are related to the corresponding identified first phrase;
storing a user model associated with the user, and comprising a plurality of the first related phrases;
receiving a query from the user, the query including one or more second phrases;
selecting search results comprising a plurality of documents responsive to the query;
identifying, by operation of a processor configured to manipulate data within a computer system, one or more second related phrases that are related to the second phrase(s) of the query and that are present in the user model, comprising:
    for each phrase of the query, accessing a related phrase bit vector for the phrase of the query, wherein each bit of the related phrase bit vector indicates the presence or absence of a second related phrase of the phrase of the query;
    determining from the related phrase bit vector which of the second related phrases are present in the user model; and
    forming a related phrase bit mask corresponding to the second related phrases that are present in the user model;
weighting a plurality of scores of a corresponding plurality of the search results according to the identified one or more second related phrases;
ranking the plurality of the search results for presentation to the user according to their weighted scores, to provide personalized search results; and
presenting the personalized search results to the user.
Dependent claims: 12
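The bit vector and bit mask of claim 11 might be represented with plain integers, as in the sketch below. Several things here are assumptions made for illustration: the fixed ordering in RELATED_ORDER, the reading that a bit vector is kept per document, and the final AND-and-count step, which the claim text above does not spell out. All names are hypothetical.

```python
# Hypothetical fixed ordering of related phrases for one query phrase;
# bit i of a vector or mask refers to RELATED_ORDER[phrase][i].
RELATED_ORDER = {
    "australian shepherd": ["blue merle", "red merle", "agility training", "tricolor"],
}

def related_bit_vector(query_phrase, doc_text):
    """Bit i is set when the i-th related phrase occurs in the document."""
    lowered = doc_text.lower()
    vec = 0
    for i, rel in enumerate(RELATED_ORDER[query_phrase]):
        if rel in lowered:
            vec |= 1 << i
    return vec

def user_model_mask(query_phrase, user_model):
    """Bit i is set when the i-th related phrase is present in the user model."""
    mask = 0
    for i, rel in enumerate(RELATED_ORDER[query_phrase]):
        if rel in user_model:
            mask |= 1 << i
    return mask

def masked_hits(vec, mask):
    """Count related phrases that are both in the document and in the user model."""
    return bin(vec & mask).count("1")

user_model = {"blue merle", "agility training"}
mask = user_model_mask("australian shepherd", user_model)
vec = related_bit_vector(
    "australian shepherd", "Red merle Australian Shepherd in agility training"
)
hits = masked_hits(vec, mask)
```

Here the mask is 0b0101 and the document vector is 0b0110, so only "agility training" survives the AND. The count could then feed whatever weighting function the system uses.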
13. A method of personalizing a search of a document collection to a user, the method comprising:
monitoring a plurality of documents accessed by a user;
identifying a plurality of first phrases present in one or more of the accessed documents;
for each of the identified first phrases, identifying one or more corresponding first related phrases, wherein the one or more first related phrases are related to the corresponding identified first phrase;
storing a user model associated with the user, and comprising a plurality of cluster counts, each cluster count associated with a predetermined cluster that includes a plurality of first related phrases, and storing a count of a number of instances of first related phrases of the cluster appearing in a document accessed by the user;
receiving a query from the user, the query including one or more second phrases;
selecting search results comprising a plurality of documents responsive to the query;
identifying, by operation of a processor configured to manipulate data within a computer system, one or more second related phrases that are related to the second phrase(s) of the query and that are present in the user model;
weighting a plurality of scores of a corresponding plurality of the search results according to the cluster counts of the identified one or more second related phrases;
ranking the plurality of the search results for presentation to the user according to their weighted scores, to provide personalized search results; and
presenting the personalized search results to the user.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
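A user model built from cluster counts, as recited in claim 13, could look something like the sketch below. The cluster definitions, the normalization by the total count and the multiplicative weighting are assumptions chosen for illustration; the claim only requires that scores be weighted according to the cluster counts. All names are hypothetical.

```python
from collections import Counter

# Hypothetical predetermined clusters of related phrases.
CLUSTERS = {
    "coat colors": {"blue merle", "red merle"},
    "dog sports": {"agility training", "flyball"},
}
PHRASE_TO_CLUSTER = {p: c for c, phrases in CLUSTERS.items() for p in phrases}

def update_cluster_counts(counts, doc_text):
    """Add, per cluster, the number of related-phrase instances in an accessed document."""
    lowered = doc_text.lower()
    for phrase, cluster in PHRASE_TO_CLUSTER.items():
        counts[cluster] += lowered.count(phrase)
    return counts

def weight_score(base_score, doc_text, related_phrases, counts):
    """Boost a result by the share of the user's counts held by matching clusters."""
    lowered = doc_text.lower()
    total = sum(counts.values()) or 1  # avoid division by zero for a new user
    bonus = sum(
        counts[PHRASE_TO_CLUSTER[r]] / total
        for r in related_phrases
        if r in PHRASE_TO_CLUSTER and r in lowered
    )
    return base_score * (1 + bonus)

counts = Counter()
update_cluster_counts(counts, "Agility training, more agility training, and flyball")
update_cluster_counts(counts, "A blue merle puppy")
related = ["flyball", "blue merle"]
sports_score = weight_score(1.0, "Flyball clubs near you", related, counts)
coat_score = weight_score(1.0, "Blue merle coat genetics", related, counts)
```

Because this user has read more about dog sports than about coat colors, the flyball result receives the larger boost even though both results start with the same base score.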
21. A computer readable storage medium storing a computer program executable by a processor for personalizing a search of a document collection to a user, the operations of the computer program comprising:
monitoring a plurality of documents accessed by a user;
identifying a plurality of first phrases present in one or more of the accessed documents;
for each of the identified first phrases, identifying one or more corresponding first related phrases, wherein the one or more first related phrases are related to the corresponding identified first phrase;
storing a user model associated with the user, and comprising a plurality of the first related phrases;
receiving a query from the user, the query including one or more second phrases;
selecting search results comprising a plurality of documents responsive to the query;
identifying, by operation of a processor configured to manipulate data within a computer system, one or more second related phrases that are related to the second phrase(s) of the query and that are present in the user model;
weighting a plurality of scores of a corresponding plurality of the search results according to the cluster counts of the identified one or more second related phrases;
ranking the plurality of the search results for presentation to the user according to their weighted scores, to provide personalized search results; and
presenting the personalized search results to the user.
Dependent claims: 22
23. A computer readable storage medium storing a computer program executable by a processor for personalizing a search of a document collection to a user, the operations of the computer program comprising:
monitoring a plurality of documents accessed by a user;
identifying a plurality of first phrases present in one or more of the accessed documents;
for each of the identified first phrases, identifying one or more corresponding first related phrases, wherein the one or more first related phrases are related to the corresponding identified first phrase;
storing a user model associated with the user, and comprising a plurality of cluster counts, each cluster count associated with a predetermined cluster that includes a plurality of the first related phrases, and storing a count of a number of instances of the first related phrases of the cluster appearing in a document accessed by the user;
receiving a query from the user, the query including one or more second phrases;
selecting search results comprising a plurality of documents responsive to the query;
identifying, by operation of a processor configured to manipulate data within a computer system, one or more second related phrases that are related to the second phrase(s) of the query and that are present in the user model;
weighting a plurality of scores of a corresponding plurality of the search results according to the cluster counts of the identified one or more second related phrases;
ranking the plurality of the search results for presentation to the user according to their weighted scores, to provide personalized search results; and
presenting the personalized search results to the user.
Dependent claims: 24
25. A computer implemented system for personalizing a search of a document collection to a user, comprising:
a user model associated with the user, stored in a storage medium and comprising a plurality of first related phrases contained in documents accessed by the user, wherein the first related phrases are identified as related to one or more first phrases in documents that have been accessed by a user; and
a query processing system executed by a computer and configured to:
    receive a query from the user, wherein the query includes one or more second phrases;
    select search results comprising a plurality of documents responsive to the query,
    identify one or more second related phrases that are related to the second phrase(s) of the query and that are present in the user model,
    weight a plurality of scores of a corresponding plurality of the search results according to the identified one or more second related phrases,
    rank the plurality of the search results for presentation to the user according to their weighted scores, to provide personalized search results, and
    present the personalized search results to the user.
Dependent claims: 26
27. A computer implemented system for personalizing a search of a document collection to a user, comprising:
a user model associated with the user, stored in storage medium and comprising a plurality of cluster counts, each cluster count associated with a predetermined cluster that includes a plurality of related first phrases, and storing a count of a number of instances of related first phrases of the cluster appearing in a document accessed by the user; and
a query processing system executed by a computer and configured to:
    receive a query from the user, wherein the query includes one or more second phrases;
    select search results comprising a plurality of documents responsive to the query,
    identify one or more second related phrases that are related to the second phrase(s) of the query and associated with cluster counts in the user model,
    weight a plurality of scores of a corresponding plurality of the search results according to the cluster counts of the identified one or more second related phrases,
    rank the plurality of the search results for presentation to the user according to their weighted scores, to provide personalized search results, and
    present the personalized search results to the user.
Dependent claims: 28
Specification