Method and apparatus for constructing a compact similarity structure and for using the same in analyzing document relevance
Abstract
A computer-readable medium comprises a data structure for providing information about levels of similarity between pairs of N documents. The data structure comprises a plurality of entries of similarity values representing levels of similarity for a plurality of pairs of the documents. Each of the similarity values represents a level of similarity of one document of a given pair relative to the other document of the given pair. The similarity value of each entry is greater than a threshold similarity value that is greater than zero. The plurality of similarity-value entries are fewer than N² − N in number if the similarity values are asymmetric with regard to document pairing, and the plurality of similarity-value entries are fewer than (N² − N)/2 in number if the similarity values are symmetric with regard to document pairing.
22 Claims
1. A method for constructing a data structure containing information about levels of similarity between pairs of documents of a set of documents, the method comprising:
obtaining similarity values for pairs of documents of the set of documents;
determining whether each of the similarity values is greater than or equal to a threshold similarity value; and
for each similarity value that is greater than the threshold similarity value, storing the similarity value in the data structure.
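The construction steps of claim 1 can be illustrated with a minimal sketch. It is not the patent's implementation: the function names, the dictionary keyed by index pairs, and the word-overlap `jaccard` measure are all assumptions chosen for illustration, and the sketch assumes symmetric similarity values so that each unordered pair is evaluated once. It stores a value only when it is strictly greater than the threshold, following the wording of the storing step.

```python
from itertools import combinations


def build_similarity_structure(documents, similarity, threshold):
    """Keep only pairwise similarity values strictly above `threshold`.

    Assumption: `similarity` is symmetric, so each unordered pair (i, j)
    with i < j is evaluated once.
    """
    structure = {}
    for i, j in combinations(range(len(documents)), 2):
        value = similarity(documents[i], documents[j])
        if value > threshold:
            structure[(i, j)] = value
    return structure


def jaccard(a, b):
    """Hypothetical similarity measure: overlap of the two word sets."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


docs = ["red apple pie", "green apple pie", "blue ocean wave"]
structure = build_similarity_structure(docs, jaccard, threshold=0.3)
```

Of the three possible pairs in this example, only the first two documents share any words, so the resulting structure holds a single entry rather than one entry per pair.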
9. A method for retrieving similarity values from a data structure for a set of documents, comprising:
accessing the data structure to determine whether the data structure contains an explicit entry for a similarity value for a given pair of documents;
if the data structure contains an explicit entry of the similarity value for the given pair of documents, retrieving the similarity value; and
if the data structure does not contain an explicit entry of the similarity value for the given pair of documents, retrieving a default similarity value from the data structure or from another memory location.
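The retrieval steps of claim 9 can be sketched in the same hedged way. The function name `lookup_similarity`, the dictionary representation, and the default of 0.0 are assumptions made for illustration; the claim itself does not say what the default value is or where it is kept. The sketch also assumes symmetric values stored under the key (i, j) with i < j.

```python
def lookup_similarity(structure, i, j, default=0.0):
    """Return the stored value for a document pair, else a default.

    Assumption: values are symmetric and stored under (i, j) with i < j,
    so the pair is normalized before the lookup.
    """
    key = (i, j) if i < j else (j, i)
    if key in structure:
        # An explicit entry exists for this pair.
        return structure[key]
    # No explicit entry: fall back to the default similarity value.
    return default


sparse = {(0, 1): 0.5}  # hypothetical structure with one explicit entry
stored = lookup_similarity(sparse, 1, 0)
missing = lookup_similarity(sparse, 0, 2)
```

Here `stored` comes from the explicit entry, while `missing` falls back to the default because the pair (0, 2) was never stored.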
14. An apparatus for constructing a data structure containing information about levels of similarity between pairs of documents of a set of documents, comprising:
a memory; and
a processing unit coupled to the memory, wherein the processing unit is configured to execute the steps of:
obtaining similarity values for pairs of documents of the set of documents;
determining whether each of the similarity values is greater than or equal to a threshold similarity value; and
for each similarity value that is greater than the threshold similarity value, storing the similarity value in the data structure.
18. An apparatus for retrieving similarity values from a data structure for a set of documents, comprising:
a memory; and
a processing unit coupled to the memory, wherein the processing unit is configured to execute the steps of:
accessing the data structure to determine whether the data structure contains an explicit entry for a similarity value for a given pair of documents;
if the data structure contains an explicit entry of the similarity value for the given pair of documents, retrieving the similarity value; and
if the data structure does not contain an explicit entry of the similarity value for the given pair of documents, retrieving a default similarity value from the data structure or from another memory location.
19. The apparatus of claim 18, wherein the data structure comprises a plurality of entries of the similarity values, and wherein the plurality of entries of the similarity values are fewer than N² − N in number if the similarity values are asymmetric with regard to document pairing and wherein the plurality of entries of the similarity values are fewer than (N² − N)/2 in number if the similarity values are symmetric with regard to document pairing.
19-1. A computer-readable medium having stored thereon a data structure for providing information about levels of similarity between pairs of documents of a set of documents, the documents being N in number, the data structure comprising:
a plurality of entries of similarity values representing levels of similarity for a plurality of pairs of said documents, each of said similarity values representing a level of similarity of one document of a given pair relative to the other document of the given pair, wherein the similarity value of each entry is greater than a threshold similarity value that is greater than zero, and wherein the plurality of entries of similarity values are fewer than N² − N in number if the similarity values are asymmetric with regard to document pairing and wherein the plurality of entries of similarity values are fewer than (N² − N)/2 in number if the similarity values are symmetric with regard to document pairing.
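The entry-count limits recited above can be related to simple pair counting. For N documents there are N² − N ordered pairs of distinct documents, and half as many unordered pairs. The sketch below computes those counts for comparison with a thresholded structure; the function name `full_pair_count` is hypothetical and is not taken from the patent.

```python
def full_pair_count(n, symmetric):
    """Number of distinct document pairs among n documents.

    Ordered pairs (asymmetric similarity): n * n - n.
    Unordered pairs (symmetric similarity): half of that.
    """
    ordered = n * n - n
    return ordered // 2 if symmetric else ordered


asymmetric_pairs = full_pair_count(4, symmetric=False)
symmetric_pairs = full_pair_count(4, symmetric=True)
```

A structure that stores only above-threshold values holds fewer entries than these counts whenever at least one pair falls at or below the threshold.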
Specification