SYSTEM AND METHOD FOR IDENTIFYING SIMILARITIES AMONG OBJECTS IN A COLLECTION
Abstract
A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.
27 Claims
1. A method for calculating the similarity between two objects in a collection of objects, wherein each object is associated with at least one multi-dimensional vector representative of a feature of the object, comprising the steps of:
identifying a first vector corresponding to a first feature of a first object and a second vector corresponding to a first feature of a second object; and
computing a first distance metric between the first vector and the second vector.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
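As an illustration of the two steps recited in claim 1, the following is a minimal sketch only. Claim 1 does not name a particular distance metric; Euclidean distance is assumed here purely for illustration, and the function name `euclidean_distance` and the sample vectors are hypothetical.

```python
import math
from typing import Sequence


def euclidean_distance(first: Sequence[float], second: Sequence[float]) -> float:
    """Distance between two feature vectors (Euclidean is an assumed choice)."""
    if len(first) != len(second):
        raise ValueError("feature vectors must have the same dimensionality")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(first, second)))


# Step 1: identify the vector for the same feature of each object
# (hypothetical values, not taken from the patent).
first_object_feature = [1.0, 2.0, 2.0]
second_object_feature = [1.0, 0.0, 2.0]

# Step 2: compute a distance metric between the two vectors.
distance = euclidean_distance(first_object_feature, second_object_feature)
print(distance)
```

A smaller distance indicates that the two objects are more similar with respect to the chosen feature.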
13. A method for calculating the similarity between two documents in a collection of documents, wherein each document is associated with at least two multi-dimensional vectors representative of a color complexity feature of the object, comprising the steps of:
identifying a first horizontal complexity vector corresponding to a first document, a first vertical complexity vector corresponding to the first document, a second horizontal complexity vector corresponding to a second document, and a second vertical complexity vector corresponding to the second document; and
computing a distance metric between the first document and the second document, wherein the distance metric comprises a normalized sum of a cosine similarity measure between the first horizontal complexity vector and the second horizontal complexity vector, and between the first vertical complexity vector and the second vertical complexity vector.
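The computation recited in claim 13 might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the claim says only that the sum is "normalized", so dividing by the number of terms (two) is an assumption, as is returning 0.0 for a zero-length vector. The names `cosine_similarity` and `complexity_similarity` are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal dimensionality."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: treat an all-zero vector as dissimilar
    return dot / (norm_u * norm_v)


def complexity_similarity(
    first_horizontal: Sequence[float],
    first_vertical: Sequence[float],
    second_horizontal: Sequence[float],
    second_vertical: Sequence[float],
) -> float:
    """Sum of the horizontal and vertical cosine similarities, divided by 2.

    Dividing by the number of terms is an assumed normalization.
    """
    horizontal = cosine_similarity(first_horizontal, second_horizontal)
    vertical = cosine_similarity(first_vertical, second_vertical)
    return (horizontal + vertical) / 2.0


# Hypothetical complexity vectors for two documents.
score = complexity_similarity([1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
print(score)
```

Because the measure is built from cosine similarities, larger values indicate more similar documents under this sketch; how the value is mapped onto a distance scale is not specified in the claim.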
14. A method for calculating the similarity between two objects in a collection of objects, wherein each object is associated with a plurality of multi-dimensional vectors representative of a plurality of corresponding features of the object, comprising the steps of:
for each feature, identifying a first vector corresponding to a first object and a second vector corresponding to a second object;
for each feature, computing a distance metric between the first vector and the second vector; and
summing the distance metrics for each feature into an aggregate distance metric.
Dependent claims: 15
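The per-feature aggregation recited in claim 14 can be sketched as below. The claim does not fix the per-feature metric or any weighting, so an unweighted sum of Euclidean distances is assumed; the feature names, the `FeatureVectors` alias, and `aggregate_distance` are hypothetical.

```python
import math
from typing import Dict, Sequence

# Maps a feature name to that feature's vector for one object (hypothetical layout).
FeatureVectors = Dict[str, Sequence[float]]


def aggregate_distance(first: FeatureVectors, second: FeatureVectors) -> float:
    """Sum the per-feature distances between two objects.

    Euclidean distance per feature and an unweighted sum are assumptions.
    """
    if first.keys() != second.keys():
        raise ValueError("both objects must provide the same set of features")
    return sum(math.dist(first[name], second[name]) for name in first)


# Two objects described by two hypothetical features each.
first_object = {"text": [0.0, 0.0], "color": [1.0, 1.0, 1.0]}
second_object = {"text": [3.0, 4.0], "color": [1.0, 1.0, 1.0]}

total = aggregate_distance(first_object, second_object)
print(total)
```

Each feature may have its own dimensionality, since distances are computed feature by feature before being summed.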
16. A method for calculating the similarity between two objects in a collection of objects, wherein each object is associated with a plurality of multi-dimensional vectors representative of a feature of the object, comprising the steps of:
identifying a first set of vectors corresponding to a first object and a second set of vectors corresponding to a second object, wherein the number of vectors in the first set is equal to the number of vectors in the second set;
computing a distance metric between each vector in the first set and a corresponding vector in the second set; and
summing the distance metrics into a composite distance metric.
Dependent claims: 17
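For claim 16, where a single feature is represented by several vectors per object, a minimal sketch follows. It assumes that "corresponding" vectors are paired by position in each set and that Euclidean distance is used; neither is stated in the claim, and `composite_distance` is a hypothetical name.

```python
import math
from typing import Sequence


def composite_distance(
    first_set: Sequence[Sequence[float]],
    second_set: Sequence[Sequence[float]],
) -> float:
    """Sum the distances between positionally paired vectors of two sets."""
    # The claim requires both sets to contain the same number of vectors.
    if len(first_set) != len(second_set):
        raise ValueError("both sets must contain the same number of vectors")
    # Assumption: the i-th vector of one set corresponds to the i-th of the other.
    return sum(math.dist(a, b) for a, b in zip(first_set, second_set))


# Hypothetical sets of two vectors each.
composite = composite_distance(
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 1.0]],
)
print(composite)
```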
18. A method for calculating the similarity between two users in a user population, wherein each user is associated with a multi-dimensional vector representative of a user feature, comprising the steps of:
identifying a first vector corresponding to a first user and a second vector corresponding to a second user; and
computing a first distance metric between the first vector and the second vector.
Dependent claims: 19, 20, 21, 22, 23, 24, 25, 26, 27
Specification