Modifying ranking data based on document changes
First Claim
1. A computer-implemented method, comprising:
receiving a query and a current version of a document;
receiving quality of result data for a plurality of versions of the document and the query, the quality of result data specifying a respective version-specific quality of result statistic for each of the versions of the document with respect to the query;
calculating a weight for the version-specific quality of result statistics corresponding to each version of the document, wherein the weight for a particular version of the document is determined at least in part on an estimate of a difference between the particular version and the current version of the document, and wherein calculating the weight for a particular version of the document comprises:
obtaining a representation of the particular version of the document, wherein the representation is a first time distribution of shingles,
calculating a difference score by comparing the first time distribution of shingles representing the particular version of the document to a second time distribution of shingles representing the current version of the document, wherein each shingle is a contiguous subsequence of one or more tokens in the document, and wherein each shingle is associated with a particular time that the shingle is first observed in a version of the document such that a distribution of the times associated with the shingles in a version of the document corresponds to the representation of the version of the document, and
using the difference score to calculate a corresponding weight for the particular version of the document;
determining a weighted overall quality of result statistic for the document with respect to the query, wherein determining the weighted overall quality of result statistic comprises weighting each version-specific quality of result statistic with the calculated weight and combining the weighted version-specific quality of result statistics; and
associating the weighted overall quality of result statistic with the document.
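The weight-calculating sub-steps recited in this claim can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not something the claim specifies: whitespace tokenization, the shingle length `k`, the use of total variation distance as the difference score, the mapping `weight = 1 - score`, and all function names.

```python
from collections import Counter


def shingles(text, k=2):
    """Return the set of contiguous k-token subsequences of text.

    Tokenization by whitespace and the default k are assumptions.
    """
    tokens = text.split()
    if len(tokens) < k:
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def first_seen_times(versions, k=2):
    """Map each shingle to the earliest timestamp at which it is observed.

    versions: iterable of (timestamp, text) pairs, one per document version.
    """
    first_seen = {}
    for timestamp, text in sorted(versions):
        for s in shingles(text, k):
            # Keep only the first (earliest) observation of each shingle.
            first_seen.setdefault(s, timestamp)
    return first_seen


def time_distribution(text, first_seen, k=2):
    """Normalized histogram of first-observed times over the shingles in text.

    text must be one of the versions used to build first_seen.
    """
    counts = Counter(first_seen[s] for s in shingles(text, k))
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def difference_score(dist_a, dist_b):
    """Total variation distance between two time distributions, in [0, 1].

    This particular distance is an assumed choice of comparison.
    """
    times = set(dist_a) | set(dist_b)
    return 0.5 * sum(abs(dist_a.get(t, 0.0) - dist_b.get(t, 0.0)) for t in times)


def weight_from_difference(score):
    """Assumed mapping: the more a version differs, the lower its weight."""
    return 1.0 - score
```

For example, with versions `[(1, "a b c d"), (2, "a b c e")]` and the second version treated as current, two of the three shingles in the current version were first observed at time 1 and one at time 2, so the older version's distribution `{1: 1.0}` differs from the current distribution `{1: 2/3, 2: 1/3}` by a score of 1/3 under this sketch.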
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining a weighted overall quality of result statistic for a document. One method includes receiving quality of result data for a query and a plurality of versions of a document; determining a weighted overall quality of result statistic for the document with respect to the query, including weighting each version-specific quality of result statistic and combining the weighted version-specific quality of result statistics, wherein each quality of result statistic is weighted by a weight determined from at least a difference between content of a reference version of the document and content of the version of the document corresponding to the version-specific quality of result statistic; and storing the weighted overall quality of result statistic and data associating the query and the document with the weighted overall quality of result statistic.
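The weighting, combining, and storing steps described in the abstract might look like the following sketch. The abstract does not say how the weighted statistics are combined, so the weighted average used here is an assumption, as are the function names, the version identifiers, and the in-memory dictionary standing in for whatever storage a real system would use.

```python
def weighted_overall_statistic(stats, weights):
    """Combine version-specific statistics into one overall statistic.

    stats:   {version_id: version-specific quality of result statistic}
    weights: {version_id: weight derived from the content difference}
    Combination by weighted average is an assumed choice.
    """
    total_weight = sum(weights[v] for v in stats)
    if total_weight == 0:
        return 0.0  # assumed fallback when no version carries any weight
    return sum(stats[v] * weights[v] for v in stats) / total_weight


def store_statistic(store, query, doc_id, value):
    """Associate the overall statistic with the (query, document) pair."""
    store[(query, doc_id)] = value


# Hypothetical data: three versions of one document for one query.
stats = {"v1": 0.2, "v2": 0.5, "v3": 0.8}
weights = {"v1": 0.0, "v2": 0.5, "v3": 1.0}  # v3 is the reference version

ranking_data = {}
overall = weighted_overall_statistic(stats, weights)
store_statistic(ranking_data, "example query", "doc-1", overall)
```

Under these assumed numbers, the oldest version contributes nothing and the overall statistic is (0.5 × 0.5 + 0.8 × 1.0) / 1.5 = 0.7, which sits closer to the reference version's statistic than an unweighted mean of 0.5 would.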
39 Claims
1. A computer-implemented method, as set out in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A system comprising:
one or more computers configured to perform operations, the operations comprising:
receiving a query and a current version of a document;
receiving quality of result data for a plurality of versions of the document and the query, the quality of result data specifying a respective version-specific quality of result statistic for each of the versions of the document with respect to the query;
calculating a weight for the version-specific quality of result statistics corresponding to each version of the document, wherein the weight for a particular version of the document is determined at least in part on an estimate of a difference between the particular version and the current version of the document, and wherein calculating the weight for a particular version of the document comprises:
obtaining a representation of the particular version of the document, wherein the representation is a first time distribution of shingles,
calculating a difference score by comparing the first time distribution of shingles representing the particular version of the document to a second time distribution of shingles representing the current version of the document, wherein each shingle is a contiguous subsequence of one or more tokens in the document, and wherein each shingle is associated with a particular time that the shingle is first observed in a version of the document such that a distribution of the times associated with the shingles in a version of the document corresponds to the representation of the version of the document, and
using the difference score to calculate a corresponding weight for the particular version of the document;
determining a weighted overall quality of result statistic for the document with respect to the query, wherein determining the weighted overall quality of result statistic comprises weighting each version-specific quality of result statistic with the calculated weight and combining the weighted version-specific quality of result statistics; and
associating the weighted overall quality of result statistic with the document.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26
27. A computer-readable medium having instructions stored thereon, which, when executed by a processor, cause the processor to perform operations comprising:
receiving a query and a current version of a document;
receiving quality of result data for a plurality of versions of the document and the query, the quality of result data specifying a respective version-specific quality of result statistic for each of the versions of the document with respect to the query;
calculating a weight for the version-specific quality of result statistics corresponding to each version of the document, wherein the weight for a particular version of the document is determined at least in part on an estimate of a difference between the particular version and the current version of the document, and wherein calculating the weight for a particular version of the document comprises:
obtaining a representation of the particular version of the document, wherein the representation is a first time distribution of shingles,
calculating a difference score by comparing the first time distribution of shingles representing the particular version of the document to a second time distribution of shingles representing the current version of the document, wherein each shingle is a contiguous subsequence of one or more tokens in the document, and wherein each shingle is associated with a particular time that the shingle is first observed in a version of the document such that a distribution of the times associated with the shingles in a version of the document corresponds to the representation of the version of the document, and
using the difference score to calculate a corresponding weight for the particular version of the document;
determining a weighted overall quality of result statistic for the document with respect to the query, wherein determining the weighted overall quality of result statistic comprises weighting each version-specific quality of result statistic with the calculated weight and combining the weighted version-specific quality of result statistics; and
associating the weighted overall quality of result statistic with the document.
Dependent claims: 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39
Specification