Systems and methods for creating a visual vocabulary

US 8,977,041 B2
Filed: 08/22/2012
Issued: 03/10/2015
Est. Priority Date: 08/22/2012
Status: Active Grant

Abstract

Systems and methods for generating a visual vocabulary build a plurality of visual words via unsupervised learning on a set of features of a given type; decompose one or more visual words to a collection of lower-dimensional buckets; generate labeled image representations based on the collection of lower-dimensional buckets and labeled images, wherein labels associated with an image are associated with a respective representation of the image; and iteratively select a sub-collection of buckets from the collection of lower-dimensional buckets based on the labeled image representations, wherein bucket selection during any iteration after an initial iteration is based at least in part on feedback from previously selected buckets.

18 Claims

1. A method for building a visual vocabulary, the method comprising:
- generating visual words based on a set of features, wherein the visual words are defined in a higher-dimensional space;
  
  projecting the visual words from the higher-dimensional space to a first lower-dimensional space, thereby producing projections of the visual words in the first lower-dimensional space;
  
  generating a first collection of buckets in the first lower-dimensional space based on the projections of the visual words in the first lower-dimensional space;
  
  projecting the visual words from the higher-dimensional space to a second lower-dimensional space, thereby producing projections of the visual words in the second lower-dimensional space;
  
  generating a second collection of buckets in the second lower-dimensional space based on the projections of the visual words in the second lower-dimensional space; and
  
  iteratively selecting a sub-collection of buckets from the first collection of buckets and from the second collection of buckets, wherein bucket selection during any iteration after an initial iteration is based at least in part on feedback from previously selected buckets.
- Dependent claims (2, 3, 4, 5, 6, 7, 8)
  - 2. The method of claim 1, further comprising combining the buckets of the sub-collection of buckets to build a visual vocabulary.
  - 3. The method of claim 1, wherein prior probabilities about types of features in the set of features are used to guide the selecting of the sub-collection of buckets.
  - 4. The method of claim 1, wherein the visual words are generated using K-means clustering, and each cluster represents a visual word.
  - 5. The method of claim 1, wherein the generating of the first collection of buckets in the first lower-dimensional space is further based on a purity measure.
  - 6. The method of claim 1, further comprising generating labeled image representations based on the first collection of buckets, on the second collection of buckets, and on labeled images, wherein labels that are associated with an image are associated with a respective labeled image representation of the image.
  - 7. The method of claim 6, wherein iteratively selecting the sub-collection of buckets from the first collection of buckets and from the second collection of buckets further includes iteratively selecting buckets that are most discriminative of the labels based on the labeled image representations, and wherein the method further comprises training respective classifiers for the labels based on the selected buckets that are most discriminative of the labels.
  - 8. The method of claim 7, wherein the sub-collection of buckets is selected with AdaBoost learning.
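
Claims 1, 7, and 8 describe iterative bucket selection with feedback from earlier selections, optionally using AdaBoost learning. The following is a minimal sketch of one way such a loop could look, not the patented implementation. It assumes discrete AdaBoost with binary bucket indicators as weak rules; the function name `select_buckets`, the indicator matrix `X`, and the ±1 label vector `y` are hypothetical names introduced for illustration.

```python
import math

def select_buckets(X, y, rounds):
    """Discrete AdaBoost over bucket indicators (illustrative assumption).

    X[i][j] is 1 if image i has a feature in bucket j, else 0.
    y[i] is +1 or -1. Returns a list of (bucket_index, polarity, alpha).
    """
    n, m = len(X), len(X[0])
    w = [1.0 / n] * n  # uniform image weights in the initial iteration
    chosen = []
    for _ in range(rounds):
        best = None
        for j in range(m):
            for polarity in (1, -1):
                # Weak rule: predict +polarity when the bucket fires, -polarity otherwise.
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if (polarity if xi[j] else -polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, polarity)
        err, j, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        chosen.append((j, polarity, alpha))
        # Feedback step: images the chosen bucket misclassified gain weight,
        # so the next iteration favors buckets that handle them correctly.
        w = [wi * math.exp(-alpha * yi * (polarity if xi[j] else -polarity))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return chosen

# Hypothetical data: 4 images, 3 buckets, binary labels.
X = [[1, 1, 0], [1, 0, 0], [1, 0, 1], [0, 0, 0]]
y = [1, 1, -1, -1]
selected = select_buckets(X, y, rounds=2)
```

In this sketch the "feedback from previously selected buckets" is carried entirely by the image weights `w`: no bucket is removed from consideration, so the same bucket may be selected again in a later round.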

9. One or more non-transitory computer-readable media storing instructions that, when executed by one or more computing devices, cause the computing devices to perform operations comprising:
- clustering features from one or more images to form feature clusters, wherein the features and the feature clusters are defined in a higher-dimensional space;
  
  projecting the feature clusters from the higher-dimensional space to a first lower-dimensional space to form projections of the feature clusters in the first lower-dimensional space;
  
  projecting the feature clusters from the higher-dimensional space to a second lower-dimensional space to form projections of the feature clusters in the second lower-dimensional space; and
  
  generating buckets in the first lower-dimensional space and in the second lower-dimensional space based on the projections of the feature clusters in the first lower-dimensional space and on the projections of the feature clusters in the second lower-dimensional space.
- Dependent claims (10, 11, 12, 13, 14, 15)
  - 10. The one or more non-transitory computer readable media of claim 9, wherein the operations further comprise generating respective classifiers for one or more labels based on the buckets in the first lower-dimensional space and in the second lower-dimensional space, wherein a classifier maps a feature to a label.
  - 11. The one or more non-transitory computer readable media of claim 10, wherein the operations further comprise:
    - selecting optimal buckets that are associated with a first label; and
      
      training classifiers for the first label based on the selected optimal buckets that are associated with the first label.
  - 12. The one or more non-transitory computer readable media of claim 11, wherein the operations further comprise:
    - selecting optimal buckets that are associated with a second label; and
      
      training classifiers for the second label based on the selected optimal buckets that are associated with the second label.
  - 13. The one or more non-transitory computer readable media of claim 12, wherein the operations further comprise adding the classifiers for the first label and the classifiers for the second label to a visual vocabulary.
  - 14. The one or more non-transitory computer readable media of claim 9, wherein the buckets are generated further based on a purity measure.
  - 15. The one or more non-transitory computer readable media of claim 9, wherein a bucket is associated with a respective bucket function.
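
The operations of claim 9 (cluster features, project the clusters into two lower-dimensional spaces, generate buckets from the projections) can be sketched as follows. This is only an illustration under stated assumptions: clustering uses plain Lloyd's K-means (claim 4 names K-means for the method claims), each lower-dimensional space is a single coordinate axis, and buckets are intervals cut at the midpoints between projected cluster centers. The purity measure of claim 14 is not modeled. The names `kmeans`, `project`, and `make_buckets_1d` are hypothetical.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k cluster centers (the feature clusters)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            # Assign each feature to its nearest center (squared Euclidean distance).
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            groups[j].append(p)
        # Recompute each center as its group mean; keep the old center if the group is empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def project(vec, dims):
    """Axis-aligned projection: keep only the coordinates listed in dims."""
    return tuple(vec[d] for d in dims)

def make_buckets_1d(values):
    """Split a 1-D axis into intervals at midpoints between sorted projected centers."""
    vals = sorted(set(values))
    cuts = [(a + b) / 2 for a, b in zip(vals, vals[1:])]
    edges = [float("-inf")] + cuts + [float("inf")]
    return list(zip(edges, edges[1:]))

# Hypothetical 2-D features; the two "lower-dimensional spaces" are axis 0 and axis 1.
features = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
clusters = kmeans(features, k=2)
buckets_by_space = {
    dim: make_buckets_1d([project(c, (dim,))[0] for c in clusters])
    for dim in (0, 1)
}
```

Other projections (for example random or learned linear maps) and other bucket shapes would fit the claim language equally well; the axis-aligned choice here just keeps the sketch short.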

16. A device for building a visual vocabulary, the device comprising:
- a computer memory; and
  
  one or more processors that are coupled to the computer memory and that are configured to cause the device to
  - generate visual words based on a plurality of features, wherein the visual words are defined in a higher-dimensional space,
  - project the visual words from the higher-dimensional space to a first lower-dimensional space, thereby producing projections of the visual words in the first lower-dimensional space,
  - generate a first collection of buckets in the first lower-dimensional space based on the projections of the visual words in the first lower-dimensional space,
  - project the visual words from the higher-dimensional space to a second lower-dimensional space, thereby producing projections of the visual words in the second lower-dimensional space,
  - generate a second collection of buckets in the second lower-dimensional space based on the projections of the visual words in the second lower-dimensional space, and
  - iteratively select a sub-collection of buckets from the first collection of buckets and from the second collection of buckets, wherein bucket selection during any iteration after an initial iteration is based at least in part on feedback from previously selected buckets.
- Dependent claims (17, 18)
  - 17. The device of claim 16, wherein the one or more processors are further configured to cause the device to generate labeled image representations based on the first collection of buckets, on the second collection of buckets, and on labeled images, wherein labels that are associated with an image are associated with a respective labeled image representation of the image.
  - 18. The device of claim 17, wherein, to iteratively select the sub-collection of buckets from the first collection of buckets and from the second collection of buckets, the one or more processors are further configured to cause the device to iteratively select buckets that are most discriminative of the labels based on the labeled image representations, and wherein the one or more processors are further configured to cause the device to train respective classifiers for the labels based on the selected buckets that are most discriminative of the labels.
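
Claims 6 and 17 recite labeled image representations built from both bucket collections and from labeled images. One possible reading, offered only as a sketch, is a histogram of bucket hits per image with the image's labels carried alongside it. The code assumes each lower-dimensional space is a single coordinate axis and each bucket is a half-open interval; `bucket_index`, `represent`, and `labeled_representations` are hypothetical names.

```python
def bucket_index(value, buckets):
    """Return the index of the half-open interval [lo, hi) that contains value."""
    for i, (lo, hi) in enumerate(buckets):
        if lo <= value < hi:
            return i
    raise ValueError("buckets do not cover value")

def represent(image_features, spaces):
    """Histogram of bucket hits, concatenated over all lower-dimensional spaces.

    spaces is a list of (dim, buckets) pairs, where dim is the coordinate
    kept by the (assumed axis-aligned) projection.
    """
    rep = []
    for dim, buckets in spaces:
        hist = [0] * len(buckets)
        for f in image_features:
            hist[bucket_index(f[dim], buckets)] += 1
        rep.extend(hist)
    return rep

def labeled_representations(labeled_images, spaces):
    """Pair each image's representation with the labels of that image."""
    return [(represent(feats, spaces), labels) for feats, labels in labeled_images]

# Hypothetical bucket collections for two 1-D spaces and one labeled image.
inf = float("inf")
spaces = [(0, [(-inf, 5.0), (5.0, inf)]), (1, [(-inf, 2.0), (2.0, inf)])]
images = [([(1.0, 3.0), (6.0, 1.0), (7.0, 4.0)], {"cat"})]
reps = labeled_representations(images, spaces)
```

Representations of this form could then serve as the input from which discriminative buckets are selected, as in claims 7 and 18.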

Current Assignee: Canon Kabushiki Kaisha (Canon Inc.)
Original Assignee: Canon Kabushiki Kaisha (Canon Inc.)
Inventors: Lu, Juwei; Denney, Bradley Scott
Primary Examiner(s): Li, Ruiping
Application Number: US13/592,148
Publication Number: US 20140056511A1
Time in Patent Office: 930 Days
Field of Search: 382/159, 382/224-225, 382/190, 382/168
US Class Current: 382/159
CPC Class Codes:
G06F 18/214   Generating training pattern...
G06V 10/464   using a plurality of salien...
G06V 10/774   Generating sets of training...
