Identification of duplicates within an image space
Abstract
Implementations for identifying duplicate images in an image space are described. An image space is partitioned into a plurality of coarse clusters based on signatures of the images within the image space. The signatures are determined from compact descriptors of the images. Refined clusters that include one or more images of an individual coarse cluster are created based on pair-wise comparisons of the compact descriptors of images in the coarse cluster, and the refined clusters are identified as sets of duplicate images. The refined clusters are grown by searching in similar coarse clusters for images to add to the refined clusters.
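The coarse partitioning step described above can be illustrated with a minimal sketch. It assumes each image already has a hashable signature (here a tuple of bits) and that images are grouped by exact signature match; the function name `partition_by_signature` and the sample file names are hypothetical and do not come from the patent.

```python
from collections import defaultdict


def partition_by_signature(signatures):
    """Group image ids that share the same signature.

    signatures: dict mapping image id -> hashable signature (assumed to be
    a tuple of bits in this sketch).
    Returns a dict mapping each signature to one coarse cluster (a list of ids).
    """
    coarse = defaultdict(list)
    for image_id, sig in signatures.items():
        coarse[sig].append(image_id)
    return dict(coarse)


# Hypothetical signatures for three images.
signatures = {
    "a.jpg": (1, 0, 1, 1),
    "b.jpg": (1, 0, 1, 1),
    "c.jpg": (0, 0, 1, 0),
}
coarse_clusters = partition_by_signature(signatures)
```

Grouping by dictionary key avoids comparing every image against every other at this stage; the more expensive pair-wise comparison is left to the refinement step inside each coarse cluster.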
20 Claims
1. A method comprising:
partitioning a plurality of images of an image space into a plurality of coarse clusters, the partitioning based at least in part on signatures of the plurality of images determined from compact descriptors of the plurality of images;
creating a refined cluster that includes one or more images of an individual coarse cluster based at least in part on pair-wise comparisons of the compact descriptors, from which the signatures are determined, of images of the individual coarse cluster;
identifying the refined cluster as a set of duplicate images; and
searching another coarse cluster of images for ones of the other coarse cluster to add to the refined cluster based at least in part on an average of the compact descriptors of the images of the refined cluster.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
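The final step of claim 1, searching another coarse cluster using an average of the refined cluster's compact descriptors, might look like the sketch below. Euclidean distance, the `threshold` parameter, and the names `mean_descriptor` and `grow_refined_cluster` are assumptions made for illustration; the claim only requires that the search be based at least in part on the average.

```python
import math


def mean_descriptor(vectors):
    """Element-wise average of equal-length descriptor vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def grow_refined_cluster(refined, other_coarse, descriptors, threshold):
    """Return the refined cluster plus images from another coarse cluster
    whose descriptors lie within `threshold` of the cluster average.

    Euclidean distance is an assumption of this sketch.
    """
    centroid = mean_descriptor([descriptors[i] for i in refined])
    members = set(refined)
    grown = list(refined)
    for image_id in other_coarse:
        if image_id in members:
            continue
        if math.dist(descriptors[image_id], centroid) <= threshold:
            grown.append(image_id)
    return grown


# Hypothetical two-dimensional descriptors.
descriptors = {
    "a": [0.0, 0.0],
    "b": [2.0, 0.0],
    "c": [1.0, 0.5],
    "d": [9.0, 9.0],
}
grown = grow_refined_cluster(["a", "b"], ["c", "d"], descriptors, threshold=1.0)
```

Comparing each candidate against a single averaged descriptor keeps the search linear in the size of the other coarse cluster, rather than comparing against every member of the refined cluster.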
9. A system, comprising:
memory;
one or more processors;
a partition module stored on the memory and executable by the one or more processors to partition an image space into a plurality of coarse clusters based at least in part on signatures of images within the image space, the signatures determined from compact descriptors of the plurality of images;
a cluster module stored on the memory and executable by the one or more processors to create refined clusters within the plurality of coarse clusters based at least in part on pair-wise comparisons of the compact descriptors, from which the signatures are determined, of images within individual ones of the coarse clusters;
a growth module stored on the memory and executable by the one or more processors to search similar coarse clusters for images to add to the refined clusters based at least in part on an average of the compact descriptors of the images of the refined clusters; and
an output module stored on the memory and executable by the one or more processors to output the refined clusters as sets of duplicate clusters.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
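As a rough illustration of what the cluster module does, the sketch below forms refined clusters inside one coarse cluster by comparing every pair of compact descriptors. Linking any two images that fall within a distance threshold (single-link grouping via union-find), the use of Euclidean distance, the choice to drop singleton groups, and the name `refine_coarse_cluster` are all assumptions of this sketch rather than details taken from the claim.

```python
import math
from itertools import combinations


def refine_coarse_cluster(image_ids, descriptors, threshold):
    """Group images of one coarse cluster by pair-wise descriptor distance.

    Two images are linked when their descriptors are within `threshold`
    (Euclidean distance, an assumption). Linked images end up in the same
    refined cluster; groups with a single image are dropped.
    """
    parent = {i: i for i in image_ids}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(image_ids, 2):
        if math.dist(descriptors[a], descriptors[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for i in image_ids:
        groups.setdefault(find(i), []).append(i)
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical descriptors for one coarse cluster.
descriptors = {"a": [0.0, 0.0], "b": [0.1, 0.0], "c": [5.0, 5.0]}
refined_clusters = refine_coarse_cluster(["a", "b", "c"], descriptors, threshold=0.5)
```

The pair-wise loop is quadratic in the size of the coarse cluster, which is why it is applied only after the cheaper signature-based partition has narrowed the candidates.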
17. One or more computer-readable storage media comprising a plurality of instructions executable by one or more processors of a computing system to cause the computing system to:
extract raw global features from a plurality of images of an image space, the raw global features including gray block features, edge directional histograms, and non-edge ratios;
compress the raw global features into corresponding compact descriptors of the plurality of images;
quantize the compact descriptors using mean values of dimensions of the compact descriptors to generate signatures for the plurality of images;
partition the image space into a plurality of coarse clusters such that groups of images with matching signatures are placed together into coarse clusters;
create one or more refined clusters within individual ones of the plurality of coarse clusters such that at least one of the one or more refined clusters include two or more images whose compact descriptors are within a threshold distance from one another;
grow the refined clusters by searching similar coarse clusters for images whose compact descriptors are within another threshold distance from averages of the compact descriptors of the refined clusters; and
output the refined clusters as sets of duplicate images.
Dependent claims: 18, 19, 20
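The quantization step in claim 17 can be read in more than one way. The sketch below takes one plausible reading: compute the mean of each descriptor dimension across all images, then emit one bit per dimension indicating whether the image's value exceeds that mean. This interpretation, the strict `>` comparison, and the name `quantize_to_signatures` are assumptions for illustration.

```python
def quantize_to_signatures(descriptors):
    """Turn compact descriptors into bit-tuple signatures.

    descriptors: dict mapping image id -> list of floats (equal lengths).
    Each bit is 1 when the image's value in that dimension is above the
    mean of that dimension over all images (an assumed reading of the claim).
    """
    ids = list(descriptors)
    n = len(ids)
    means = [sum(column) / n for column in zip(*(descriptors[i] for i in ids))]
    return {
        i: tuple(int(value > mean) for value, mean in zip(descriptors[i], means))
        for i in ids
    }


# Hypothetical three-dimensional compact descriptors.
descriptors = {
    "a": [0.9, 0.1, 0.5],
    "b": [0.8, 0.2, 0.6],
    "c": [0.1, 0.9, 0.1],
}
signatures = quantize_to_signatures(descriptors)
```

Under this reading, images with similar descriptors tend to fall on the same side of each mean and therefore receive matching signatures, which is what lets the partition step place them in the same coarse cluster.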
Specification