Near duplicate images
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining image search results. One of the methods includes generating a plurality of feature vectors for each image in a collection of images, wherein each feature vector is associated with an image tile of an image, wherein each feature vector corresponds to one of a plurality of predetermined visual words. All images in the collection of images that share at least a threshold number of matching visual words associated with matching image tiles are classified as near-duplicate images.
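The classification step described in the abstract can be illustrated with a minimal sketch. It assumes each image has already been reduced to a set of (tile index, visual word id) pairs, and it reads "share at least a threshold number of matching visual words associated with matching image tiles" as a pairwise comparison between images. The function name `near_duplicate_pairs`, the file names, and the sample values are hypothetical and do not come from the patent.

```python
from itertools import combinations


def near_duplicate_pairs(signatures, threshold):
    """Return pairs of image ids whose signatures share >= threshold entries.

    signatures: dict mapping image id -> set of (tile_index, visual_word_id).
    A shared entry means the same visual word occurs in the same tile of
    both images (an assumed reading of "matching visual words associated
    with matching image tiles").
    """
    pairs = []
    for id_a, id_b in combinations(sorted(signatures), 2):
        shared = signatures[id_a] & signatures[id_b]
        if len(shared) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


# Hypothetical signatures for three images on a 2 x 2 tile grid.
signatures = {
    "a.jpg": {(0, 17), (1, 42), (2, 8), (3, 99)},
    "b.jpg": {(0, 17), (1, 42), (2, 8), (3, 5)},
    "c.jpg": {(0, 3), (1, 42), (2, 77), (3, 61)},
}

print(near_duplicate_pairs(signatures, threshold=3))
```

With a threshold of 3, only `a.jpg` and `b.jpg` are paired, since they agree on three tile and word combinations while `c.jpg` agrees with each of them on one.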
18 Claims
1. A computer-implemented method comprising:
- generating a plurality of feature vectors for each image in a collection of images, wherein each feature vector is associated with an image tile of an image, wherein each feature vector corresponds to one of a plurality of predetermined visual words and wherein generating a feature vector for a particular image in the collection of images comprises:
  - determining a feature region in the particular image;
  - computing the feature vector from the feature region in the particular image;
  - quantizing the feature vector to one of the plurality of visual words;
  - determining an image tile to which the feature region is located;
  - associating the visual word with the image tile for the feature region; and
- classifying as near-duplicate images all images in the collection of images that share at least a threshold number of matching visual words associated with matching image tiles.
Dependent claims: 2, 3, 4, 5, 6.
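The per-image steps recited in claim 1 (quantizing a feature vector to a visual word, finding the tile that contains the feature region, and associating the two) might be sketched as follows. Several points are assumptions rather than claim language: feature detection and descriptor computation are taken as already done, a feature region is represented by its centre point, quantization picks the nearest vocabulary entry by Euclidean distance, and tiles form a uniform grid numbered row by row. All names (`nearest_visual_word`, `tile_index`, `image_signature`) are hypothetical.

```python
import math


def nearest_visual_word(vector, vocabulary):
    """Quantize a vector to the index of the closest vocabulary entry."""
    return min(
        range(len(vocabulary)),
        key=lambda i: math.dist(vector, vocabulary[i]),
    )


def tile_index(x, y, width, height, grid=2):
    """Row-major index of the grid x grid tile containing point (x, y)."""
    col = min(int(x * grid / width), grid - 1)   # clamp points on the right edge
    row = min(int(y * grid / height), grid - 1)  # clamp points on the bottom edge
    return row * grid + col


def image_signature(features, vocabulary, width, height, grid=2):
    """Build the set of (tile_index, visual_word_id) pairs for one image.

    features: iterable of (x, y, vector), where (x, y) is the assumed centre
    of a feature region and vector is its descriptor.
    """
    return {
        (tile_index(x, y, width, height, grid),
         nearest_visual_word(vector, vocabulary))
        for x, y, vector in features
    }


# Hypothetical three-word vocabulary and three features in a 100 x 100 image.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
features = [
    (10, 10, (0.1, 0.0)),
    (90, 20, (0.9, 1.1)),
    (30, 80, (0.1, 0.9)),
]

print(sorted(image_signature(features, vocabulary, width=100, height=100)))
```

Storing the result as a set of pairs keeps the later comparison between images to a plain set intersection.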
7. A computer-implemented method comprising:
- receiving a query image;
- obtaining a set of image search results for the query image;
- generating a plurality of feature vectors for the query image, wherein each feature vector is associated with an image tile of the query image, wherein each feature vector corresponds to one of a plurality of predetermined visual words;
- generating a plurality of feature vectors for each image identified by the image search results, wherein each feature vector is associated with an image tile of an image, and wherein generating a feature vector for a particular image identified by the image search results comprises:
  - determining a feature region in the particular image;
  - computing the feature vector from the feature region in the particular image;
  - quantizing the feature vector to one of the plurality of visual words;
  - determining an image tile to which the feature region is located; and
  - associating the visual word with the image tile for the feature region;
- determining that one or more images in the image search results that share at least a threshold number of matching visual words associated with matching image tiles with the query image are near-duplicate images of the query image; and
- removing one or more near-duplicate images of the query image from the set of image search results.
Dependent claims: 8, 9.
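The filtering step of claim 7 can be sketched in the same style. The sketch assumes the query image and every search result have already been reduced to sets of (tile index, visual word id) pairs, and it removes every result that meets the threshold, although the claim only requires removing "one or more" of them. The function name `remove_near_duplicates` and the sample data are hypothetical.

```python
def remove_near_duplicates(query_signature, results, threshold):
    """Drop search results that are near-duplicates of the query image.

    query_signature: set of (tile_index, visual_word_id) for the query.
    results: list of (image_id, signature) in ranked order.
    A result counts as a near-duplicate when it shares at least `threshold`
    entries with the query; ranking order of the remaining results is kept.
    """
    return [
        image_id
        for image_id, signature in results
        if len(query_signature & signature) < threshold
    ]


# Hypothetical query and three ranked results.
query = {(0, 17), (1, 42), (2, 8)}
results = [
    ("r1", {(0, 17), (1, 42), (2, 8)}),  # shares 3 entries with the query
    ("r2", {(0, 17), (1, 3), (2, 9)}),   # shares 1 entry
    ("r3", {(0, 1), (1, 42), (2, 8)}),   # shares 2 entries
]

print(remove_near_duplicates(query, results, threshold=2))
```

With a threshold of 2, `r1` and `r3` are dropped and only `r2` remains in the result list.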
10. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  - generating a plurality of feature vectors for each image in a collection of images, wherein each feature vector is associated with an image tile of an image, wherein each feature vector corresponds to one of a plurality of predetermined visual words, and wherein generating feature vector for a particular image in the collection of images comprises:
    - determining a feature region in the particular image;
    - computing the feature vector from the feature region in the particular image;
    - quantizing the feature vector to one of the plurality of visual words;
    - determining an image tile to which the feature region is located; and
    - associating the visual word with the image tile for the feature region; and
  - classifying as near-duplicate images all images in the collection of images that share at least a threshold number of matching visual words associated with matching image tiles.
Dependent claims: 11, 12, 13, 14, 15.
16. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  - receiving a query image;
  - obtaining a set of image search results for the query image;
  - generating a plurality of feature vectors for the query image, wherein each feature vector is associated with an image tile of the query image, wherein each feature vector corresponds to one of a plurality of predetermined visual words;
  - generating a plurality of feature vectors for each image identified by the image search results, wherein each feature vector is associated with an image tile of an image, and wherein generating a feature vector for a particular image identified by the image search results comprises:
    - determining a feature region in the particular image;
    - computing the feature vector from the feature region in the particular image;
    - quantizing the feature vector to one of the plurality of visual words;
    - determining an image tile to which the feature region is located; and
    - associating the visual word with the image tile for the feature region;
  - determining that one or more images in the image search results that share at least a threshold number of matching visual words associated with matching image tiles with the query image are near-duplicate images of the query image; and
  - removing one or more near-duplicate images of the query image from the set of image search results.
Dependent claims: 17, 18.
Specification