System and method for encoding document ranking vectors
Abstract
A method of processing information related to documents in a collection of linked documents. For each respective document in all or a portion of said collection, one or more auxiliary page ranking vectors associated with the respective document are quantized. A search query comprising one or more search terms is received. Using a document index that represents said collection of linked documents, a plurality of documents is identified. Each document in the identified plurality of documents includes at least one term that matches a search term in the search query. For one or more respective documents in the plurality of documents, one or more of the auxiliary page ranking vectors associated with the respective document are decoded. The plurality of documents are then ranked using the decoded auxiliary page vectors.
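The quantize-then-decode step summarized above can be illustrated with a small sketch. Claim 1 recites values that follow a power-law distribution being transformed by a function F1(x) so that they become uniformly distributed, then partitioned into uniformly spaced cells. The code below is only one way to realize that idea, under stated assumptions: the density is taken to be proportional to x^(-m) on [x_min, infinity) with m > 1, F1 is taken to be the cumulative distribution function of that density, and decoding returns the value at the midpoint of a cell. The names `make_power_law_quantizer`, `x_min`, and `n_cells` are hypothetical and do not come from the patent.

```python
def make_power_law_quantizer(m, x_min, n_cells):
    """Build (encode, decode) for values with density ~ x**(-m) on [x_min, inf).

    Assumption: F1(x) is the CDF of that density, so F1(x) is uniform on [0, 1).
    """
    if m <= 1.0 or x_min <= 0.0 or n_cells < 1:
        raise ValueError("need m > 1, x_min > 0 and n_cells >= 1")
    k = m - 1.0  # tail exponent of the CDF

    def transform(x):
        # F1(x) = 1 - (x / x_min)**(-(m - 1)); values below x_min are clamped.
        return 1.0 - (max(x, x_min) / x_min) ** (-k)

    def encode(x):
        # Uniformly spaced cells over the transformed range [0, 1).
        return min(int(transform(x) * n_cells), n_cells - 1)

    def decode(cell):
        # Invert F1 at the midpoint of the cell.
        u = (cell + 0.5) / n_cells
        return x_min * (1.0 - u) ** (-1.0 / k)

    return encode, decode


# Hypothetical usage: 256 cells, i.e. one byte per attribute.
encode, decode = make_power_law_quantizer(m=2.0, x_min=1.0, n_cells=256)
cell = encode(2.0)        # cell index stored in place of the raw value
approx = decode(cell)     # approximate value recovered at query time
```

Because `encode` is monotone non-decreasing in x, sorting by cell index preserves the order of the original values up to ties within a cell, which is why a ranking step can operate on the quantized form.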
9 Claims
1. A method of processing information related to documents in a collection of linked documents, the method comprising:
- quantizing by a data-processing system, for each respective document in all or a portion of said collection, an auxiliary page ranking vector associated with the respective document, wherein each said auxiliary page ranking vector comprises a plurality of attributes, and wherein each attribute is quantized in an independent manner, and wherein said quantizing uses a first quantizer to quantize values for a first attribute in the auxiliary page ranking vectors associated with said collection, and wherein the values x for said first attribute in the auxiliary page ranking vectors associated with said collection are distributed in a power-law distribution m, and wherein the values x are transformed with a first function F1(x) such that the transformed values become uniformly distributed, and wherein said first quantizer partitions the plurality of transformed first values into a plurality of uniformly spaced cells;
- receiving by the data-processing system a search query comprising one or more search terms;
- identifying by the data-processing system, using a document index that represents said collection of linked documents, a plurality of documents, wherein each document in said identified plurality of documents includes at least one term that matches a search term in said search query; and
- ranking by the data-processing system said plurality of documents using said auxiliary page vectors, wherein said ranking comprises:
(i) ranking by the data-processing system, for each respective attribute in said plurality of attributes, said plurality of documents to form an intermediate rank order, and
(ii) aggregating by the data-processing system each said intermediate rank order to generate a final rank order for said plurality of documents;
wherein
Dependent claims: 2, 3, 4, 5
6. A method of processing information related to documents in a collection of linked documents, the method comprising:
(a) quantizing by a data-processing system, for each respective document in all or a portion of said collection, an auxiliary page ranking vector associated with the respective document, wherein each said auxiliary page ranking vector comprises a plurality of attributes, and wherein each attribute is quantized in an independent manner, and wherein said quantizing uses a first quantizer to quantize values for a first attribute in the auxiliary page ranking vectors associated with said collection;
(b) estimating by the data-processing system a distortion measure for the quantized values of the first attribute, wherein said distortion measure is given by
7. A method of processing information related to documents in a collection of linked documents, the method comprising:
(a) quantizing by a data-processing system, for each respective document in all or a portion of said collection, an auxiliary page ranking vector associated with the respective document, wherein each said auxiliary page ranking vector comprises a plurality of attributes, and wherein each attribute is quantized in an independent manner, and wherein said quantizing uses a first quantizer to quantize values for a first attribute in the auxiliary page ranking vectors associated with said collection;
(b) estimating by the data-processing system a distortion measure for the quantized values of the first attribute, wherein said distortion measure is given by Distortion(qj, R) = F(Z), and wherein
Dependent claims: 8
9. A method of processing information related to documents in a collection of linked documents, the method comprising:
(a) quantizing by a data-processing system, for each respective document in all or a portion of said collection, an auxiliary page ranking vector associated with the respective document, wherein each said auxiliary page ranking vector comprises a plurality of attributes, and wherein each attribute is quantized in an independent manner, and wherein said quantizing uses a first quantizer to quantize values for a first attribute in the auxiliary page ranking vectors associated with said collection;
(b) estimating by the data-processing system a distortion measure for the quantized values of the first attribute, wherein said distortion measure is given by Distortion(qj, R) = Fi(Xi),
qj is the identity of first quantizer;
R is said plurality of documents that include at least one matching term;
Xi is the number of documents mapped to cell i, and
Fi(Xi) for i, . . . , n are different functions;
(c) receiving by the data-processing system a search query comprising one or more search terms;
(d) identifying by the data-processing system, using a document index that represents said collection of linked documents, a plurality of documents, wherein each document in said identified plurality of documents includes at least one term that matches a search term in said search query; and
(e) ranking by the data-processing system said plurality of documents using said auxiliary page vectors, wherein said ranking comprises:
(i) ranking by the data-processing system, for each respective attribute in said plurality of attributes, said plurality of documents to form an intermediate rank order, and
(ii) aggregating by the data-processing system each said intermediate rank order to generate a final rank order for said plurality of documents.
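The ranking step recited in the independent claims, sub-steps (i) and (ii), can be sketched as follows. The claims do not say how the intermediate rank orders are aggregated, so this sketch assumes a Borda count purely for illustration; other aggregation rules would fit the claim language equally well. The function names and the dictionary layout (`vectors` mapping a document id to its list of quantized attribute values) are hypothetical.

```python
def intermediate_rank(doc_ids, vectors, attribute):
    """Step (i): order documents by one attribute, highest value first.

    Ties are broken by document id so the result is deterministic.
    """
    return sorted(doc_ids, key=lambda d: (-vectors[d][attribute], d))


def aggregate_borda(rank_orders):
    """Step (ii), assuming a Borda count: a document in position p of an
    order over n documents earns n - p points; points are summed over orders.
    """
    scores = {}
    for order in rank_orders:
        n = len(order)
        for position, doc in enumerate(order):
            scores[doc] = scores.get(doc, 0) + (n - position)
    return sorted(scores, key=lambda d: (-scores[d], d))


def rank_documents(doc_ids, vectors, n_attributes):
    """Form one intermediate rank order per attribute, then aggregate them."""
    orders = [intermediate_rank(doc_ids, vectors, a) for a in range(n_attributes)]
    return aggregate_borda(orders)


# Hypothetical data: three matching documents, two quantized attributes each.
vectors = {"a": [3, 1], "b": [2, 5], "c": [1, 4]}
final_order = rank_documents(["a", "b", "c"], vectors, n_attributes=2)
```

Here attribute 0 orders the documents a, b, c and attribute 1 orders them b, c, a; under the assumed Borda count the scores are a = 4, b = 5, c = 3, so the final order is b, a, c.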
Specification