Object similarity search in high-dimensional vector spaces
Abstract
An object search system generates a hierarchical clustering of objects of a collection based on similarity of the objects. The object search system generates a separate hierarchical clustering of objects for multiple features of the objects. To identify objects similar to a target object, the object search system first generates a feature vector for the target object. For each feature of the feature vector, the object search system uses the hierarchical clustering of objects to identify the cluster of objects that is most “feature similar” to that feature of the target object. The object search system indicates the similarity of each candidate object based on the features for which the candidate object is similar.
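The per-feature lookup and scoring described above can be illustrated with a minimal sketch. It is not the patented implementation: it assumes each feature is a single scalar value, and it substitutes fixed-width bucketing for the hierarchical clustering, purely to keep the example short. The names `build_feature_indexes`, `rank_similar`, and `bucket_width` are hypothetical.

```python
from collections import Counter, defaultdict


def build_feature_indexes(objects, bucket_width=1.0):
    """Build one cluster index per feature.

    objects: dict mapping object id -> list of scalar feature values.
    Assumption: a "cluster" is a fixed-width bucket of values, a stand-in
    for the hierarchical clustering described in the abstract.
    """
    num_features = len(next(iter(objects.values())))
    indexes = [defaultdict(set) for _ in range(num_features)]
    for obj_id, values in objects.items():
        for f, value in enumerate(values):
            indexes[f][int(value // bucket_width)].add(obj_id)
    return indexes


def rank_similar(indexes, target_values, bucket_width=1.0):
    """Score candidates by the number of features whose cluster they share
    with the target object."""
    votes = Counter()
    for f, value in enumerate(target_values):
        cluster = indexes[f].get(int(value // bucket_width), set())
        votes.update(cluster)  # one vote per feature-similar cluster
    return votes.most_common()


# Toy collection: four objects, three scalar features each.
objects = {
    "a": [0.2, 5.1, 9.7],
    "b": [0.4, 5.3, 2.0],
    "c": [7.5, 5.9, 9.1],
    "d": [3.3, 1.0, 4.4],
}
indexes = build_feature_indexes(objects)
ranking = rank_similar(indexes, [0.1, 5.5, 9.9])
```

In this toy data, object `a` shares a bucket with the target on all three features, so it ranks first; object `d` shares none and does not appear as a candidate at all.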
19 Claims
1. A method in a computing device with a processor and a memory for identifying an object similar to a target object, the method comprising:
providing a collection of objects, the objects being images, each image in the collection represented by a plurality of values, each value corresponding to a feature of a plurality of features, each feature representing a characteristic of the image;
for each of the plurality of features of objects in the collection, generating by the computer system for the feature a cluster index data structure for the collection of objects, the cluster index data structure defining clusters of objects that are feature similar based on the values of feature, such that for each feature, the objects in the collection are clustered differently based on the values for that feature;
for each of the plurality of features of the target object, identifying by the computer system, from the cluster index data structure for that feature, clusters of candidate objects that are feature similar to the target object based on the values of that feature; and
for candidate objects, indicating by the computer system similarity of the candidate object to the target object based on the number of identified clusters containing the candidate object.
11. A computer-readable storage medium encoded with instructions for controlling a computing device to generate a cluster index data structure for each of a plurality of features for a collection of images for use in identifying similar images, by a method comprising:
for each of the plurality of features of the images, generating the cluster index data structure for the feature defining clusters of images that are feature similar based on the feature by:
for each image of the collection, generating a hash code representation of the feature of the image;
generating high-level clusters of the images based on similarity between the hash codes of the images; and
for each high-level cluster,
generating initial low-level clusters of images within the high-level cluster based on similarity of the hash codes of the images; and
merging the generated low-level clusters based on similarity of hash codes between images of the low-level clusters to be merged until a merging termination criterion is satisfied.
17. A computing system for identifying images of a collection that are similar to a target image, comprising:
a data structure storing, for each of a plurality of features of images, a cluster index data structure defining clusters of images that are feature similar based on the feature;
a memory storing computer-executable instructions of
a component that, for each feature, identifies, from the cluster index data structure for that feature, candidate images that are feature similar to the target image based on that feature; and
a component that, for candidate images, indicates similarity of the candidate image to the target image based on the features for which the candidate image is feature similar to the target image; and
a processor for executing the computer-executable instructions stored in the memory.
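The index construction recited in claim 11 (hash codes, high-level clusters, initial low-level clusters, and merging until a termination criterion is met) can be sketched for a single feature as follows. Every concrete choice here is an assumption made for illustration and is not taken from the claims: the hash code is a simple per-component threshold, high-level clusters group images by the leading bits of the code, low-level clusters start as one image each, cluster similarity is single-link Hamming distance, and merging stops once the closest pair of clusters is more than `max_merge_distance` bits apart. All function and parameter names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def hash_code(vector, threshold=0.5):
    """Binarize a feature vector into a tuple of bits (illustrative hash)."""
    return tuple(1 if x > threshold else 0 for x in vector)


def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_distance(c1, c2, codes):
    """Single-link distance: smallest Hamming distance between members."""
    return min(hamming(codes[i], codes[j]) for i in c1 for j in c2)


def build_cluster_index(features, prefix_bits=2, max_merge_distance=1):
    """Build a two-level cluster index for one feature.

    features: dict mapping image id -> feature vector for this feature.
    Returns a dict mapping high-level key -> list of low-level clusters.
    """
    codes = {img: hash_code(vec) for img, vec in features.items()}

    # High-level clusters: images whose codes share the same leading bits.
    high = defaultdict(list)
    for img, code in codes.items():
        high[code[:prefix_bits]].append(img)

    index = {}
    for prefix, members in high.items():
        # Initial low-level clusters: one cluster per image.
        low = [[img] for img in sorted(members)]
        # Merge the closest pair until the termination criterion is met.
        while len(low) > 1:
            i, j = min(
                combinations(range(len(low)), 2),
                key=lambda p: cluster_distance(low[p[0]], low[p[1]], codes),
            )
            if cluster_distance(low[i], low[j], codes) > max_merge_distance:
                break
            low[i] = low[i] + low[j]
            del low[j]  # safe because i < j
        index[prefix] = low
    return index


# Toy feature vectors for four images.
features = {
    "img1": [0.9, 0.8, 0.1, 0.2, 0.3],
    "img2": [0.7, 0.9, 0.2, 0.1, 0.8],
    "img3": [0.8, 0.6, 0.9, 0.7, 0.1],
    "img4": [0.1, 0.2, 0.9, 0.8, 0.6],
}
index = build_cluster_index(features)
```

With these toy vectors, `img1` and `img2` differ in one bit and merge into a single low-level cluster, while `img3` stays separate within the same high-level cluster and `img4` falls into a different high-level cluster.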
Specification