Systems and methods for clustering of near-duplicate images in very large image collections

US 10,504,002 B2
Filed: 07/30/2017
Issued: 12/10/2019
Est. Priority Date: 07/30/2017
Status: Active Grant

Abstract

Detection of near-duplicate images is important for detecting the reuse of copyrighted material. Some applications require the clustering of near-duplicates instead of the comparison to an original. Representing images as bags of visual words is the first step for our clustering approach. An inverted index points from visual words to all the images containing that visual word. In the next step, matches are geometrically verified in pairs of images that share a large fraction of their visual words. Geometric verification may use affine, perspective, or other transformations. The verification step provides a similarity measure based on the fraction of the matching image points and on their distributions in the compared images. The resulting distance matrix is very sparse because most images in the collection are not compared to each other. This distance matrix is used as input for a modified agglomerative hierarchical clustering approach that can handle a sparse distance matrix.
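
The inverted index described in the abstract, and the candidate collection it enables, can be sketched as follows. This is a minimal illustration and not the patented implementation: the function names `build_inverted_index` and `shared_word_counts`, the `top_k` parameter, and the representation of each image as a list of integer visual word IDs are assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_inverted_index(image_words):
    """Map each visual word ID to the set of image IDs that contain it.

    image_words: dict of image ID -> iterable of visual word IDs
    (hypothetical input format chosen for this sketch).
    """
    index = defaultdict(set)
    for image_id, words in image_words.items():
        for word in set(words):
            index[word].add(image_id)
    return index


def shared_word_counts(query_id, image_words, index, top_k=None):
    """Collect all other images sharing at least one visual word with the
    query image, together with the number of shared words, sorted from
    most to fewest shared words. top_k=None returns every candidate.
    """
    counts = Counter()
    for word in set(image_words[query_id]):
        for other_id in index[word]:
            if other_id != query_id:
                counts[other_id] += 1
    return counts.most_common(top_k)


if __name__ == "__main__":
    # Toy collection: image IDs mapped to made-up visual word IDs.
    collection = {
        "a": [1, 2, 3, 4],
        "b": [1, 2, 3, 9],
        "c": [4, 7],
        "d": [8],
    }
    idx = build_inverted_index(collection)
    print(shared_word_counts("a", collection, idx))
```

Only images reachable through the index are ever compared with the query image, which is why the resulting distance matrix stays sparse.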

24 Claims

1. A computer-implemented method for clustering a plurality of images, the computer-implemented method being performed in connection with a computerized system comprising a central processing unit and a memory, the computer-implemented method comprising:
- a. generating a vocabulary of visual words in the plurality of images;
  
  b. extracting image features for image key points for each of the plurality of images;
  
  c. based on the extracted image features, creating an index pointing from the visual words in the vocabulary to images from the plurality of images, which contain these visual words;
  
  d. using the created index to collect all other images of the plurality of images that share at least one visual word with a selected image and determining a number of shared visual words;
  
  e. performing a geometric verification to verify whether the shared visual words are located at same locations in the selected image and the other images of the plurality of images and taking a fraction of verified shared visual words to all shared visual words as a similarity measure; and
  
  f. clustering the plurality of images hierarchically based on the similarity measure.
- Dependent claims (2 to 14):
  - 2. The computer-implemented method of claim 1, wherein the vocabulary of visual words is generated from a set of image features extracted from a collection of representative images.
  - 3. The computer-implemented method of claim 2, wherein the collection of representative images comprises at least one million images.
  - 4. The computer-implemented method of claim 1, wherein generating the vocabulary of visual words comprises clustering similar feature vectors and representing all feature vectors similar to each cluster centroid with a visual word and adding that visual word to the vocabulary of visual words for each cluster.
  - 5. The computer-implemented method of claim 1, wherein the image features for image key points are extracted for the each of the plurality of images using a scale-invariant feature transform (SIFT).
  - 6. The computer-implemented method of claim 1, wherein the index pointing from the visual words in the vocabulary to images from the plurality of images is an inverted index.
  - 7. The computer-implemented method of claim 1, further comprising sorting the collected other images of the plurality of images that share at least one visual word with a selected image based on the number of the shared visual words.
  - 8. The computer-implemented method of claim 7, further comprising selecting a predetermined number of the collected other images with top numbers of the shared visual words.
  - 9. The computer-implemented method of claim 1, wherein performing the geometric verification comprises determining an affine transformation that maps at least a portion of the selected image to another image of the plurality of images.
  - 10. The computer-implemented method of claim 1, wherein performing the geometric verification comprises determining a perspective transformation that maps at least a portion of the selected image to another image of the plurality of images.
  - 11. The computer-implemented method of claim 1, further comprising verifying uniformity of distribution of the shared visual words over the other images and rejecting the other images with uniformity of distribution of the shared visual words below a predetermined threshold.
  - 12. The computer-implemented method of claim 11, wherein the verifying uniformity of distribution of the shared visual words over the other images comprises dividing the other images into a coarse two-dimensional grid comprising a plurality of cells, and for each grid cell, determining a fraction of matching visual words to the total visual words in that cell, and performing a statistical test for sufficient uniformity of the distribution of the shared visual words among the grid cells.
  - 13. The computer-implemented method of claim 1, wherein the clustering the plurality of images is performed using a modified complete-linkage agglomerative hierarchical clustering algorithm.
  - 14. The computer-implemented method of claim 1, wherein the clustering the plurality of images is performed using a sparse distance matrix calculated based on the similarity measure.
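
The uniformity check of claims 11 and 12 can be illustrated with a short sketch. The claims do not name the statistical test, so the chi-square statistic used here is an assumption, as are the function names, the 3 x 3 default grid, and the input format (lists of `(x, y)` key point positions in one image).

```python
def grid_match_fractions(all_points, matched_points, width, height, rows=3, cols=3):
    """Count matched and total key points per cell of a coarse rows x cols grid.

    Returns (matches, totals), two rows x cols lists of integer counts.
    """
    totals = [[0] * cols for _ in range(rows)]
    matches = [[0] * cols for _ in range(rows)]

    def cell(x, y):
        # Clamp so points on the right or bottom border fall into the last cell.
        col = min(int(x * cols / width), cols - 1)
        row = min(int(y * rows / height), rows - 1)
        return row, col

    for x, y in all_points:
        r, c = cell(x, y)
        totals[r][c] += 1
    for x, y in matched_points:
        r, c = cell(x, y)
        matches[r][c] += 1
    return matches, totals


def uniformity_statistic(matches, totals):
    """Chi-square statistic for the hypothesis that every non-empty cell has
    the same fraction of matching words. Lower values mean a more uniform
    distribution. (Assumed test; the claims only require "a statistical test".)
    """
    cells = [
        (m, n)
        for match_row, total_row in zip(matches, totals)
        for m, n in zip(match_row, total_row)
        if n > 0
    ]
    total_n = sum(n for _, n in cells)
    total_m = sum(m for m, _ in cells)
    if total_n == 0:
        return 0.0
    p = total_m / total_n  # pooled match rate over the whole image
    if p in (0.0, 1.0):
        return 0.0  # every cell trivially has the same rate
    return sum((m - n * p) ** 2 / (n * p * (1 - p)) for m, n in cells)
```

An image would then be rejected when the statistic exceeds a chosen critical value, which plays the role of the predetermined threshold in claim 11. The value itself is not given in the claims.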

15. A computerized system for clustering a plurality of images, the computerized system comprising a central processing unit and a memory storing a set of computer-executable instructions for:
- a. generating a vocabulary of visual words in the plurality of images;
  
  b. extracting image features for image key points for each of the plurality of images;
  
  c. based on the extracted image features, creating an index pointing from the visual words in the vocabulary to images from the plurality of images, which contain these visual words;
  
  d. using the created index to collect all other images of the plurality of images that share at least one visual word with a selected image and determining a number of shared visual words;
  
  e. performing a geometric verification to verify whether the shared visual words are located at same locations in the selected image and the other images of the plurality of images and taking a fraction of verified shared visual words to all shared visual words as a similarity measure; and
  
  f. clustering the plurality of images hierarchically based on the similarity measure.
- Dependent claims (16 to 19):
  - 16. The computerized system of claim 15, wherein the vocabulary of visual words is generated from a set of image features extracted from a collection of representative images.
  - 17. The computerized system of claim 15, wherein the collection of representative images comprises at least one million images.
  - 18. The computerized system of claim 15, wherein generating the vocabulary of visual words comprises clustering similar feature vectors and representing all feature vectors similar to each cluster centroid with a visual word and adding that visual word to the vocabulary of visual words for each cluster.
  - 19. The computerized system of claim 15, wherein the image features for image key points are extracted for the each of the plurality of images using a scale-invariant feature transform (SIFT).
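
Vocabulary generation by clustering feature vectors, as recited in claims 4 and 18, can be sketched with plain k-means. The claims do not specify the clustering algorithm, so Lloyd's algorithm is an assumption here, as are the function names and parameters. The toy vectors are two-dimensional for readability; real descriptors such as SIFT are much longer, and a collection of the size named in claims 3 and 17 would likely need a more scalable variant.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(vector, centroids):
    """Index of the closest centroid; this index serves as the visual word ID."""
    return min(range(len(centroids)), key=lambda i: squared_distance(vector, centroids[i]))


def build_vocabulary(features, k, iterations=20, seed=0):
    """Cluster feature vectors with k-means; each centroid becomes one visual word.

    features: list of equal-length numeric tuples or lists (assumed format).
    """
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(features, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for vector in features:
            clusters[nearest_centroid(vector, centroids)].append(vector)
        for i, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster went empty
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


def quantize(features, centroids):
    """Represent each feature vector by the visual word of its nearest centroid."""
    return [nearest_centroid(v, centroids) for v in features]
```

After quantization, each image is a bag of visual word IDs, the form an index from visual words to images operates on.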

20. A non-transitory computer-readable medium embodying a set of computer-executable instructions, which, when executed in a computerized system comprising a central processing unit and a memory, cause the computerized system to perform a method for clustering a plurality of images, the method comprising:
- a. generating a vocabulary of visual words in the plurality of images;
  
  b. extracting image features for image key points for each of the plurality of images;
  
  c. based on the extracted image features, creating an index pointing from the visual words in the vocabulary to images from the plurality of images, which contain these visual words;
  
  d. using the created index to collect all other images of the plurality of images that share at least one visual word with a selected image and determining a number of shared visual words;
  
  e. performing a geometric verification to verify whether the shared visual words are located at same locations in the selected image and the other images of the plurality of images and taking a fraction of verified shared visual words to all shared visual words as a similarity measure; and
  
  f. clustering the plurality of images hierarchically based on the similarity measure.
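
Step e, common to claims 1, 15, and 20, can be sketched for the affine case of claim 9. The sketch fits an affine transformation to every triple of matched key points, keeps the model that explains the most matches, and returns the fraction of verified shared visual words. The exhaustive search, the pixel tolerance, and all names are assumptions for illustration; a practical system would more likely sample triples at random (RANSAC style) than enumerate them.

```python
from itertools import combinations


def solve_affine(src, dst):
    """Affine map (a, b, c, d, e, f) with u = a*x + b*y + c and v = d*x + e*y + f
    sending three src points to three dst points, or None if src is collinear.
    """
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-9:
        return None

    def coefficients(t1, t2, t3):
        # Cramer's rule for a*x + b*y + c = t at the three source points.
        a = (t1 * (y2 - y3) - y1 * (t2 - t3) + (t2 * y3 - t3 * y2)) / det
        b = (x1 * (t2 - t3) - t1 * (x2 - x3) + (x2 * t3 - x3 * t2)) / det
        c = (x1 * (y2 * t3 - y3 * t2) - y1 * (x2 * t3 - x3 * t2)
             + t1 * (x2 * y3 - x3 * y2)) / det
        return a, b, c

    us = [p[0] for p in dst]
    vs = [p[1] for p in dst]
    return coefficients(*us) + coefficients(*vs)


def verified_fraction(matches, tolerance=3.0):
    """Fraction of shared visual words whose positions agree with the best
    affine transformation found.

    matches: list of ((x, y), (u, v)) positions of the same visual word in the
    selected image and in the other image (assumed input format).
    """
    best = 0
    for triple in combinations(matches, 3):
        model = solve_affine([m[0] for m in triple], [m[1] for m in triple])
        if model is None:
            continue
        a, b, c, d, e, f = model
        inliers = sum(
            1
            for (x, y), (u, v) in matches
            if (a * x + b * y + c - u) ** 2 + (d * x + e * y + f - v) ** 2 <= tolerance ** 2
        )
        best = max(best, inliers)
    return best / len(matches) if matches else 0.0
```

A pair of images whose shared words all follow one transformation scores 1.0, while a pair that merely shares words at unrelated positions scores low. One minus this fraction is a natural distance for the clustering step, although the claims do not state that conversion.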

21. A computer-implemented method for clustering a plurality of content items, the computer-implemented method being performed in connection with a computerized system comprising a central processing unit and a memory, the computer-implemented method comprising:
- a. generating a vocabulary of words in the plurality of content items;
  
  b. extracting features from the plurality of content items;
  
  c. based on the extracted features, creating an index pointing from the words in the vocabulary to content items from the plurality of content items, which contain these words;
  
  d. using the created index to collect all other content items of the plurality of content items that share at least one word with a selected content item and determining a number of shared words;
  
  e. performing a content verification to verify whether the shared words are located at same locations in the selected content item and the other content items of the plurality of content items and taking a fraction of verified shared words to all shared words as a similarity measure; and
  
  f. clustering the plurality of content items hierarchically based on the similarity measure.
- Dependent claims (22 to 24):
  - 22. The computer-implemented method of claim 21, wherein the content items in the plurality of content items are texts.
  - 23. The computer-implemented method of claim 21, wherein the content items in the plurality of content items are audio recordings.
  - 24. The computer-implemented method of claim 21, wherein the content items in the plurality of content items are videos.
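
Step f depends only on pairwise similarities, so the same clustering sketch applies to images and to the content items of claims 21 to 24. The code below is a naive complete-linkage agglomerative clustering over a sparse distance dictionary. How the patented method modifies complete linkage is not stated in the claims; treating every uncompared pair as infinitely distant is an assumption of this sketch, as are the names and the `cutoff` parameter.

```python
def cluster_sparse_complete_linkage(items, distances, cutoff):
    """Agglomerative clustering with complete linkage on a sparse distance map.

    items: list of hashable item IDs.
    distances: dict mapping frozenset({a, b}) -> distance, present only for
        pairs that were actually compared (assumed format).
    cutoff: clusters are merged only while their linkage is <= cutoff.

    Returns (clusters, merges): the final clusters as frozensets, and the merge
    history as (cluster_1, cluster_2, linkage) tuples, which records the hierarchy.
    """
    clusters = [frozenset([item]) for item in items]
    merges = []

    def linkage(c1, c2):
        # Complete linkage: the largest cross-pair distance.
        # Pairs missing from the sparse map count as infinitely far apart.
        return max(
            distances.get(frozenset((a, b)), float("inf")) for a in c1 for b in c2
        )

    while True:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if d <= cutoff and (best is None or d < best[0]):
                    best = (d, i, j)
        if best is None:
            return clusters, merges
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
```

Under this assumption, two clusters merge only when every cross pair was compared and found close, so items that never shared an index entry cannot end up together. The pairwise search is cubic in the number of items and is meant only to show the merge rule, not to scale.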

Current Assignee: Fujifilm Business Innovation Corp. (Fujifilm Holdings Corporation)
Original Assignee: Fuji Xerox Company Limited (Fujifilm Holdings Corporation)
Inventors: Girgensohn, Andreas
Primary Examiner(s): Osifade, Idowu O
Application Number: US15/663,815
Publication Number: US 20190034758A1
Time in Patent Office: 863 Days
Field of Search: None
CPC Class Codes
G06F 16/55   Clustering; Classification
G06F 18/22   Matching criteria, e.g. pro...
G06F 18/231   Hierarchical techniques, i....
G06F 18/24   Classification techniques
G06V 10/462   Salient features, e.g. scal...
G06V 10/464   using a plurality of salien...
G06V 10/50   by performing operations wi...
G06V 10/761   Proximity, similarity or di...
G06V 10/7625   Hierarchical techniques, i....
