Systems and methods for generating a high-level visual vocabulary

US 9,342,991 B2
Filed: 03/14/2013
Issued: 05/17/2016
Est. Priority Date: 03/14/2013
Status: Active Grant

Abstract

Systems and methods for learning a high-level visual vocabulary generate inter-visual-word relationships between a plurality of visual words based on visual word-label relationships, map the visual words to a vector space based on the inter-visual word relationships, and generate high-level visual words in the vector space.
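As an illustration of the first step, the sketch below builds visual word-label relationships from co-occurrence counts and scores pairs of visual words with the symmetric KL form recited in claim 5. It is a minimal sketch under stated assumptions, not the patented implementation: the function names, the toy data, and the additive smoothing constant `eps` are hypothetical and do not come from the patent text.

```python
import math
from collections import Counter, defaultdict


def word_label_distributions(images, labels, eps=1e-9):
    """Estimate P(label | visual word) from co-occurrence counts.

    images: list of (visual_word_ids, label) pairs, one label per image.
    eps: additive smoothing (an assumption) so that KL stays finite.
    """
    counts = defaultdict(Counter)
    for words, label in images:
        for w in words:
            counts[w][label] += 1
    dists = {}
    for w, c in counts.items():
        raw = [c[l] + eps for l in labels]
        total = sum(raw)
        dists[w] = [v / total for v in raw]
    return dists


def kl(p, q):
    """KL divergence between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """S(F_i, F_j) = 1/2 [KL(P_i | P_j) + KL(P_j | P_i)]."""
    return 0.5 * (kl(p, q) + kl(q, p))


# Toy data (hypothetical): visual word ids per image, with one label each.
images = [([0, 1], "cat"), ([0, 1, 2], "cat"), ([2, 3], "dog"), ([3], "dog")]
labels = ["cat", "dog"]
dists = word_label_distributions(images, labels)

s_same = symmetric_kl(dists[0], dists[1])  # both words occur only with "cat"
s_mixed = symmetric_kl(dists[0], dists[2])  # word 2 occurs with both labels
s_far = symmetric_kl(dists[0], dists[3])  # word 3 occurs only with "dog"
```

In this toy example, words that co-occur with the same labels receive a score of zero, and the score grows as their label distributions diverge.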

13 Claims

1. A method comprising:
- generating inter-visual-word relationships between a plurality of visual words based on visual word-label relationships, wherein the visual word-label relationships are based on co-occurrences of respective visual words and labels in one or more images, and wherein the inter-visual word relationships are based on scores between the visual word-label relationships of respective visual words;
  
  mapping the visual words to a vector space based on the inter-visual-word relationships; and
  
  generating high-level visual words in the vector space.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The method of claim 1, wherein respective distances between the visual words in the vector space indicate the inter-visual-word relationships.
  - 3. The method of claim 1, wherein the inter-visual-word relationships are further based on visual word-image relationships and label-image relationships.
  - 4. The method of claim 3, wherein the label-image relationships are one-to-one relationships.
  - 5. The method of claim 1, wherein the scores between the visual word-label relationships of respective visual words are based on respective averages of the KL divergence between each pair of mid-level features according to
    S(F_i, F_j) = S(F_j, F_i) = ½ [KL(P_i | P_j) + KL(P_j | P_i)], where
  - 6. The method of claim 1, further comprising generating clusters of visual words in the vector space based on respective positions of the visual words in the vector space, wherein the high-level visual words are generated based on the clusters of visual words.
  - 7. The method of claim 1, wherein the vector space is generated via a diffusion map.
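Claims 6 and 7 can be pictured with a small diffusion-map embedding followed by clustering. The sketch below is one possible reading, not the method the patent specifies: the kernel `exp(-S / sigma)`, the parameters `sigma` and `t`, the use of k-means with farthest-point initialization, and the toy score matrix are all assumptions chosen for illustration. It relies on NumPy.

```python
import numpy as np


def diffusion_map(S, sigma=1.0, n_components=2, t=1):
    """Embed items from a symmetric score matrix S (small score = similar)."""
    W = np.exp(-np.asarray(S, dtype=float) / sigma)  # affinity (weight) matrix
    d = W.sum(axis=1)
    # Symmetric conjugate of the Markov matrix D^-1 W; same eigenvalues.
    A = W / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(A)  # eigenvalues in ascending order
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(d)[:, None]  # right eigenvectors of D^-1 W
    # Skip the trivial constant eigenvector; scale by eigenvalue^t.
    return psi[:, 1:n_components + 1] * vals[1:n_components + 1] ** t


def kmeans(X, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = np.argmin(dist, axis=1)
        new = np.array([
            X[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, assign


# Toy scores (hypothetical): words 0,1 are similar, words 2,3 are similar.
S = np.array([
    [0.0, 0.1, 5.0, 5.0],
    [0.1, 0.0, 5.0, 5.0],
    [5.0, 5.0, 0.0, 0.1],
    [5.0, 5.0, 0.1, 0.0],
])
X = diffusion_map(S, sigma=1.0, n_components=2)
centers, assign = kmeans(X, k=2)  # each center plays the role of a high-level word
```

Under these assumptions, distances between rows of `X` reflect the pairwise scores, and each cluster center stands in for one high-level visual word.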

8. One or more non-transitory computer-readable media storing instructions that, when executed by one or more computing devices, cause the one or more computing devices to perform operations comprising:
- generating inter-visual-word relationships between a plurality of visual words based on visual word-label relationships, wherein the visual word-label relationships are based on co-occurrences of respective visual words and labels in one or more images, and wherein the inter-visual word relationships are based on scores between the visual word-label relationships of respective visual words;
  
  mapping the visual words to a vector space based on the inter-visual-word relationships; and
  
  generating high-level visual words in the vector space based on respective positions of the visual words in the vector space.
- Dependent claims (9, 10, 11):
  - 9. The one or more computer-readable media of claim 8, wherein the inter-visual-word relationships are represented as distances between the respective visual words in the vector space.
  - 10. The one or more computer-readable media of claim 8, wherein mapping the visual words to the vector space includes generating a weight matrix.
  - 11. The one or more computer-readable media of claim 8, wherein the high-level visual words encode features via soft cluster assignments.
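Claim 11 refers to encoding features via soft cluster assignments without stating a formula. One common way to do this, shown here purely as an assumption, is a softmax over negative squared distances to the cluster centers. The function names, the sharpness parameter `beta`, and the toy positions are hypothetical.

```python
import math


def soft_assign(point, centers, beta=1.0):
    """Return membership weights of `point` over all centers (sums to 1)."""
    d2 = [sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers]
    m = min(d2)  # subtract the minimum for numerical stability
    w = [math.exp(-beta * (d - m)) for d in d2]
    total = sum(w)
    return [v / total for v in w]


def encode_image(word_ids, positions, centers, beta=1.0):
    """Average the soft assignments of an image's visual words."""
    hist = [0.0] * len(centers)
    for w in word_ids:
        for j, a in enumerate(soft_assign(positions[w], centers, beta)):
            hist[j] += a
    n = len(word_ids) or 1
    return [h / n for h in hist]


# Toy embedding (hypothetical): positions of visual words in the vector space.
positions = {0: (0.0, 0.0), 1: (0.1, 0.0), 2: (1.0, 0.0)}
centers = [(0.0, 0.0), (1.0, 0.0)]

midpoint = soft_assign((0.5, 0.0), centers)  # equidistant from both centers
near_first = soft_assign(positions[0], centers)
histogram = encode_image([0, 1, 2], positions, centers)
```

A point equidistant from both centers is split evenly between them, while a point sitting on a center is assigned mostly to that center. Larger `beta` values push the encoding toward hard assignment.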

12. A method comprising:
- generating inter-visual-word relationships between a plurality of visual words;
  
  generating sets of importance weights for the visual words, wherein a set of importance weights includes a respective weight for each of the visual words;
  
  mapping the visual words to one or more vector spaces based on the inter-visual word relationships and on the sets of importance weights, wherein each vector space corresponds to a respective one of the sets of importance weights;
  
  generating high-level visual words in the plurality of vector spaces;
  
  assigning a respective importance score to each of the high-level visual words; and
  
  selecting high-level visual words based on their respective importance scores.

13. A system comprising:
- one or more computer-readable media; and
  
  one or more processors that are coupled to the one or more computer-readable media and that are configured to cause the system to generate inter-visual-word relationships between a plurality of visual words based on visual word-label relationships, wherein the visual word-label relationships are based on co-occurrences of respective visual words and labels in one or more images, and wherein the inter-visual word relationships are based on scores between the visual word-label relationships of respective visual words, map the visual words to a vector space based on the inter-visual-word relationships, and generate high-level visual words in the vector space.

Current Assignee: Canon Ayutthaya Limited (Canon Inc.)
Original Assignee: Canon Ayutthaya Limited (Canon Inc.)
Inventors: Yang, Yang; Denney, Bradley Scott; Lu, Juwei; Dusberger, Dariusz; Huang, Hung Khei
Primary Examiner(s): STREGE, JOHN B
Application Number: US13/830,247
Publication Number: US 20140272822A1
Time in Patent Office: 1,160 Days
Field of Search: 382/187
US Class Current: 1/1
CPC Class Codes:
G09B 19/00 Teaching not covered by oth...
G09B 5/02 with visual presentation of...
