Scalable image matching

US 9,280,560 B1
Filed: 12/18/2013
Issued: 03/08/2016
Est. Priority Date: 12/18/2013
Status: Active Grant

1 Assignment

0 Petitions

Abstract

Various embodiments may increase scalability of image representations stored in a database for use in image matching and retrieval. For example, a system providing image matching can obtain images of a number of inventory items, extract features from each image using a feature extraction algorithm, and transform the same into their feature descriptor representations. These feature descriptor representations can be subsequently stored and used to compare against query images submitted by users. Though the size of each feature descriptor representation isn't particularly large, the total number of these descriptors requires a substantial amount of storage space. Accordingly, feature descriptor representations are compressed to minimize storage and, in one example, machine learning can be used to compensate for information lost as a result of the compression.
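
As a rough illustration of the storage trade-off the abstract describes, the sketch below compresses one descriptor by scalar quantization: each 32-bit float becomes a single byte, so the number of features stays constant while the data per feature shrinks. The quantization scheme, the value range of [-1, 1], the function names, and the sample values are all assumptions made for illustration; the record above does not say which compression method the patent uses.

```python
import struct

def quantize(descriptor, lo=-1.0, hi=1.0):
    """Map each float in [lo, hi] to one byte (0..255). Lossy."""
    scale = 255.0 / (hi - lo)
    return bytes(
        min(255, max(0, round((v - lo) * scale))) for v in descriptor
    )

def dequantize(code, lo=-1.0, hi=1.0):
    """Approximate inverse of quantize()."""
    step = (hi - lo) / 255.0
    return [lo + b * step for b in code]

# A made-up 8-dimensional descriptor (real descriptors are much longer).
descriptor = [0.12, -0.50, 0.33, 0.90, -0.75, 0.0, 0.41, -0.08]

# "First size": the descriptor stored as 32-bit floats.
first_size = len(struct.pack(f"{len(descriptor)}f", *descriptor))

# "Second size": the same number of features, one byte each.
code = quantize(descriptor)
second_size = len(code)

# The information lost by compression shows up as reconstruction error.
restored = dequantize(code)
max_error = max(abs(a - b) for a, b in zip(descriptor, restored))

print(first_size, second_size, max_error)
```

The reconstruction error is the kind of inaccuracy that, according to the abstract, a machine-learned scoring step can partially compensate for.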

39 Citations

19 Claims

1. A computer-implemented method, comprising:
- under the control of one or more computer systems configured with executable instructions, obtaining images of a plurality of inventory items, each of the inventory items including one or more features;
  
  extracting, from each image, a plurality of feature descriptors of each respective inventory item represented in the images;
  
  clustering the feature descriptors into clusters and assigning a cluster center to represent each cluster of feature descriptors, each cluster center being of a first file size;
  
  compressing each cluster center from the first file size to a second file size, the first file size being larger than the second file size;
  
  storing the compressed cluster centers in a database for retrieval and use in image matching;
  
  assigning a visual word to each cluster center to generate a vocabulary of visual words describing the features of each respective inventory item represented in the images;
  
  indexing the visual words into an index storing information for each visual word and respective corresponding images;
  
  receiving a query image from a client computing device;
  
  extracting query feature descriptors from the query image;
  
  assigning a query visual word to each of the extracted feature descriptors;
  
  comparing one or more query visual words from the query image to at least a subset of the visual words in the index to identify a set of closest matching inventory images that at least partially match the query image based at least in part on a respective number of query visual words matching a respective number of visual words in the index, the set of closest matching inventory images being ranked by a matching score;
  
  retrieving a set of compressed cluster centers for each of the set of closest matching inventory images from the database;
  
  performing geometric verification of the set of closest matching inventory images by comparing at least a subset of the query feature descriptors to the set of cluster centers for each of the set of closest matching inventory images;
  
  ranking the set of closest matching inventory images based on the matching score, the matching score determined using parameters from a machine learned process, the parameters produced by the machine learning process using one or more training images to at least partially compensate for inaccuracy caused by compressing the cluster centers; and
  
  suggesting a highest ranking image as matching the query image.
- Dependent claims (2, 3, 4, 5):
  - 2. The computer-implemented method of claim 1, wherein ranking the set of closest matching inventory images includes predicting whether a respective compressed cluster center is an inlier or an outlier using a classifier.
  - 3. The computer-implemented method of claim 2, wherein the classifier uses at least one of a scale difference or consistent orientation as an additional input for reranking the set of closest matching inventory images.
  - 4. The computer-implemented method of claim 1, further comprising:
    - determining the matching score between the query image and the set of closest matching inventory images, the scoring being based at least in part on a similarity between feature descriptors of the query image and corresponding cluster centers of the set of closest matching inventory images.
  - 5. The computer-implemented method of claim 1, wherein compressing the feature descriptors includes keeping a number of features constant while compressing data of the number of features.
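
The indexing and lookup steps of claim 1 can be sketched as follows: each descriptor is assigned the visual word of its nearest cluster center, an inverted index maps each visual word to the images containing it, and a query is shortlisted by counting shared visual words. This is a minimal sketch under stated assumptions, not the patented implementation: the cluster centers are taken as given (the clustering step itself is omitted), and the 2-D descriptors, image names, and function names are hypothetical.

```python
from collections import Counter, defaultdict

def nearest_word(descriptor, centers):
    """Return the index of the closest cluster center (squared L2)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(descriptor, c))
    return min(range(len(centers)), key=lambda i: dist2(centers[i]))

def build_index(image_descriptors, centers):
    """Map each visual word to the set of images that contain it."""
    index = defaultdict(set)
    for image_id, descriptors in image_descriptors.items():
        for d in descriptors:
            index[nearest_word(d, centers)].add(image_id)
    return index

def shortlist(query_descriptors, centers, index, top_k=2):
    """Rank images by how many distinct query visual words they share."""
    query_words = {nearest_word(d, centers) for d in query_descriptors}
    votes = Counter()
    for w in query_words:
        for image_id in index.get(w, ()):
            votes[image_id] += 1
    return votes.most_common(top_k)

# Hypothetical 2-D cluster centers; their list positions act as visual words.
centers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Hypothetical inventory images with a few descriptors each.
inventory = {
    "shoe": [(0.1, 0.0), (0.9, 0.1), (1.0, 0.9)],
    "mug": [(0.0, 0.9), (0.1, 1.0)],
}

index = build_index(inventory, centers)
result = shortlist([(0.05, 0.1), (0.95, 0.95), (0.1, 0.9)], centers, index)
print(result)
```

In the claimed method this shortlist is only the first stage; the candidates are then geometrically verified against the stored compressed cluster centers and re-ranked by the matching score.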

6. A computer-implemented method, comprising:
- under the control of one or more computer systems configured with executable instructions, assigning one or more visual words to each of one or more compressed cluster centers, the one or more visual words corresponding to one or more feature descriptors associated with a respective compressed cluster center, each compressed cluster center being determined by clustering database feature descriptors for each of a plurality of database images into clusters, assigning a cluster center to represent each respective cluster, and compressing data of each cluster center;
  
  receiving a query image from a client computing device;
  
  extracting feature descriptors from the query image;
  
  comparing at least a portion of the extracted feature descriptors to a set of compressed cluster centers corresponding to a set of closest matching database images to determine a matching score, the set of closest matching database images determined based at least on a number of the one or more visual words matching at least the portion of the extracted feature descriptors;
  
  ranking the set of closest matching database images based on the matching score, the matching score determined using parameters from a machine learned process, the parameters produced by the machine learning process using one or more training images to at least partially compensate for inaccuracy caused by compressing the cluster centers; and
  
  selecting a highest ranking database image of the set of closest matching database images as a match for the query image.
- Dependent claims (7, 8, 9, 10, 11, 12, 13):
  - 7. The computer-implemented method of claim 6, wherein adjusting the ranking of the set of closest matching database images using parameters from a machine learned process, wherein adjusting the ranking of the set of closest matching database images includes predicting whether a respective compressed cluster center is an inlier or an outlier using a classifier rule.
  - 8. The computer-implemented method of claim 6, further comprising:
    - assigning visual words to each feature descriptor of the query image; and
      
      searching, using the assigned visual words of the query image, a plurality of database visual words to identify the set of closest matching database images that at least partially match the query image, wherein each visual word corresponds to a feature of a database image of a plurality of database images.
  - 9. The computer-implemented method of claim 8, wherein the plurality of visual words are organized in an index comprising each visual word and corresponding database images.
  - 10. The computer-implemented method of claim 9, wherein the index is a Lucene index and the visual words of the plurality of database images are searched using a term frequency-inverse document frequency (tf-idf) weighting.
  - 11. The computer-implemented method of claim 6, further comprising:
    - compressing identifying information data for each feature descriptor, the identifying information including x location information, y location information, orientation information, and scale information for each feature descriptor.
  - 12. The computer-implemented method of claim 6, wherein the ranking is determined based at least in part on a respective number of query visual words matching a respective number of visual words.
  - 13. The computer-implemented method of claim 6, further comprising:
    - extracting feature points from each of the database images using an Accumulated Signed Gradient (ASG) algorithm.
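
Claim 10 refers to searching visual words with term frequency-inverse document frequency (tf-idf) weighting. The sketch below shows the textbook form of that weighting applied to visual words: a word shared by few images counts for more than one shared by many. It is an assumption-laden illustration, not Lucene's actual scoring formula, and the image identifiers, word IDs, and function name are hypothetical.

```python
import math
from collections import Counter

def tfidf_scores(query_words, image_words):
    """Score each image by the summed tf-idf weight of shared visual words."""
    n_images = len(image_words)
    # Document frequency: number of images containing each visual word.
    df = Counter()
    for words in image_words.values():
        df.update(set(words))
    scores = {}
    for image_id, words in image_words.items():
        tf = Counter(words)
        score = 0.0
        for w in set(query_words):
            if tf[w]:
                idf = math.log(n_images / df[w])
                score += (tf[w] / len(words)) * idf
        scores[image_id] = score
    return scores

# Hypothetical database: image id -> visual words found in that image.
image_words = {
    "a": [1, 1, 2],
    "b": [2, 3],
    "c": [3, 3, 4],
}

scores = tfidf_scores([1, 2], image_words)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Image "a" ranks first because it contains visual word 1, which no other image has, while word 2 is shared with "b" and therefore carries less weight.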

14. A computing system, comprising:
- a processor; and
  
  memory including instructions that, when executed by the processor, cause the computing system to:
  
  obtain images of a plurality of items;
  
  calculate a set of feature descriptors for each image, each of the set of feature descriptors being of a first data size;
  
  compress each feature descriptor from the first data size to a second data size by clustering the set of feature descriptors into cluster centers;
  
  assign one or more visual words to the cluster centers, the one or more visual words corresponding to one or more feature descriptors associated with a respective compressed cluster center;
  
  store each compressed feature descriptor at the second size in a database, the first data size being larger than the second data size;
  
  index the visual words in an index comprising the visual words and corresponding images;
  
  upon receiving a query image from a client computing device, cause feature descriptors to be extracted from the query image;
  
  compare at least a portion of the extracted feature descriptors from the query image to a set of the cluster centers corresponding to a set of closest matching images to determine a matching score, the set of closest matching images determined based at least on a number of the one or more visual words matching at least the portion of the extracted feature descriptors;
  
  rank the set of closest matching images based on the matching score, the matching score determined using parameters from a machine learned process, the parameters produced by the machine learning process using one or more training images to at least partially compensate for inaccuracy caused by compressing the cluster centers; and
  
  select a highest ranking image of the set of closest matching database images as a match for the query image.
- Dependent claims (15, 16, 17, 18, 19):
  - 15. The computing system of claim 14, wherein the instructions, when executed by the processor, further enable the computing system to:
    - assign visual words to each extracted feature descriptor from the query image; and
      
      search, using the assigned visual words, a plurality of database visual words to identify the set of closest matching images that at least partially match the query image, each visual word corresponding to a feature of an image of a plurality of images.
  - 16. The computing system of claim 14, wherein compressing each feature descriptor includes removing bits from each feature descriptor using one of lossy compression or lossless compression.
  - 17. The computing system of claim 14, wherein the visual words are organized in an index comprising each visual word and corresponding database images.
  - 18. The computing system of claim 17, wherein the index is a Lucene index and the visual words of the images are searched using a term frequency-inverse document frequency (tf-idf) weighting.
  - 19. The computing system of claim 14, wherein the instructions, when executed by the processor, further enable the computing system to:
    - compress identifying information data for each feature descriptor, the identifying information data including x location information, y location information, orientation information, and scale information for each feature descriptor.
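
Claims 11 and 19 cover compressing the identifying information of each feature descriptor: x location, y location, orientation, and scale. One way this could look, sketched here purely as an assumption, is to store x and y as 16-bit integers and quantize orientation and scale to one byte each, giving 6 bytes instead of 16 for four 32-bit floats. The field widths, the `MAX_SCALE` bound, the function names, and the sample keypoint are hypothetical; the sketch also assumes non-negative pixel coordinates below 65536.

```python
import math
import struct

MAX_SCALE = 32.0  # assumed upper bound on keypoint scale

def pack_keypoint(x, y, orientation, scale):
    """Pack x, y (0..65535), an angle in radians, and a scale into 6 bytes."""
    turn = 2 * math.pi
    angle_bin = int((orientation % turn) / turn * 256) % 256
    scale_bin = min(255, int(scale / MAX_SCALE * 256))
    return struct.pack("<HHBB", int(round(x)), int(round(y)), angle_bin, scale_bin)

def unpack_keypoint(blob):
    """Recover approximate values, using the center of each quantization bin."""
    x, y, angle_bin, scale_bin = struct.unpack("<HHBB", blob)
    orientation = (angle_bin + 0.5) * 2 * math.pi / 256
    scale = (scale_bin + 0.5) * MAX_SCALE / 256
    return x, y, orientation, scale

# Made-up keypoint: x, y in pixels, orientation in radians, scale.
raw = struct.pack("<4f", 123.4, 56.7, 1.0, 2.5)  # uncompressed: four float32
packed = pack_keypoint(123.4, 56.7, 1.0, 2.5)    # compressed form

x, y, orientation, scale = unpack_keypoint(packed)
print(len(raw), len(packed), x, y, orientation, scale)
```

The recovered orientation and scale differ slightly from the inputs, which matters because claim 3 names scale difference and consistent orientation as classifier inputs during reranking.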

Current Assignee: A9.com Incorporated (Amazon.com, Inc.)
Original Assignee: A9.com Incorporated (Amazon.com, Inc.)
Inventors: Ramesh, Sunil; Lin, Xiaofan; Dhua, Arnab Sanat Kumar; Taylor, Colin Jon; Dube, Simant; Pillai, Jaishanker K.
Primary Examiner(s): Couso, Yon
Application Number: US14/133,252
Time in Patent Office: 811 Days
Field of Search: 382/243
US Class Current: 1/1

CPC Class Codes

G06F 16/583   using metadata automaticall...

G06F 18/213   Feature extraction, e.g. by...

G06F 18/214   Generating training pattern...

G06F 18/22   Matching criteria, e.g. pro...

G06F 18/23   Clustering techniques

G06F 18/24   Classification techniques

G06F 18/2411   based on the proximity to a...

G06F 18/24147   Distances to closest patter...

G06F 18/2415   based on parametric or prob...

G06F 18/2433   Single-class perspective, e...

G06F 2218/08   Feature extraction

G06V 10/464   using a plurality of salien...

G06V 10/757   Matching configurations of ...

G06V 10/764   using classification, e.g. ...

G06V 2201/06   Recognition of objects for ...

H04N 19/426   using memory downsizing met...

H04N 19/90   using coding techniques not...
