Image learning, automatic annotation, retrieval method, and device

US 8,232,996 B2
Filed: 05/19/2009
Issued: 07/31/2012
Est. Priority Date: 05/20/2008
Status: Expired due to Fees




Abstract

A first image having annotations is segmented into one or more image regions. Image feature vectors and text feature vectors are extracted from all the image regions to obtain an image feature matrix and a text feature matrix. The image feature matrix and the text feature matrix are projected into a sub-space to obtain the projected image feature matrix and the text feature matrix. The projected image feature matrix and the text feature matrix are stored. First links between the image regions, second links between the first image and the image regions, third links between the first image and the annotations, and fourth links between the annotations are established. Weights of all the links are calculated. A graph showing a triangular relationship between the first image, image regions, and annotations is obtained based on all the links and the weights of the links.
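The projection step can be illustrated with a minimal sketch. It assumes the common formulation in which the sub-space that maximizes covariance between two feature sets is obtained from the singular value decomposition of their cross-covariance matrix; the record itself does not state how the projection is computed. The function name `covariance_subspace`, the random stand-in features, and the dimension `k` are all hypothetical.

```python
import numpy as np

def covariance_subspace(X, Y, k):
    """Project X (n_regions x d_img) and Y (n_regions x d_txt) into a
    k-dimensional sub-space that maximizes their covariance.

    Assumed approach: SVD of the cross-covariance matrix. This is an
    illustration, not the method disclosed in the patent specification.
    """
    Xc = X - X.mean(axis=0)                  # center image features
    Yc = Y - Y.mean(axis=0)                  # center text features
    C = Xc.T @ Yc / (X.shape[0] - 1)         # cross-covariance, d_img x d_txt
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    Wx, Wy = U[:, :k], Vt[:k].T              # projection directions
    return Wx, Wy, Xc @ Wx, Yc @ Wy, s[:k]

# Random stand-in data: 50 regions, 8 image features, 5 text features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
Y = rng.normal(size=(50, 5))
Wx, Wy, Px, Py, s = covariance_subspace(X, Y, k=3)

# The covariance of the first pair of projected components equals the
# largest singular value of the cross-covariance matrix.
first_pair_cov = np.cov(Px[:, 0], Py[:, 0])[0, 1]
print(Px.shape, Py.shape)
```

In this sketch the projected matrices `Px` and `Py` play the role of the stored projected image and text feature matrices.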


20 Claims

1. An image learning method comprising:
- performing a segmentation operation on a first image having annotations to segment the first image into one or more image regions;
  
  extracting image feature vectors and text feature vectors from all the image regions to obtain an image feature matrix and a text feature matrix;
  
  projecting the image feature matrix and the text feature matrix into a sub-space so as to maximize covariance between an image feature and a text feature, thereby obtaining the projected image feature matrix and the text feature matrix;
  
  storing the projected image feature matrix and the text feature matrix;
  
  establishing first links between the image regions based on the projected image feature matrix;
  
  establishing second links between the first image and the image regions based on a result of the segmentation operation;
  
  establishing third links between the first image and the annotations based on the first image having the annotations;
  
  establishing fourth links between the annotations based on the projected text feature matrix;
  
  calculating weights of all the links;
  
  obtaining a graph showing a triangular relationship between the first image, the image regions, and the annotations based on all the links and the weights of the links corresponding to the links; and
  
  providing a storage device and a processor, and wherein the step of storing the projected image feature and text feature matrices includes the step of storing information in the storage device, and wherein the step of calculating weights includes the step of using the processor.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
  - 2. The image learning method according to claim 1, wherein, in the segmentation operation, the first image is segmented into rectangular blocks, multi-resolution quad-tree sub-blocks, or non-overlapping homogeneous regions with an image segmentation algorithm.
  - 3. The image learning method according to claim 1, wherein the image feature vectors of all the image regions are extracted by an algorithm based on a local binary pattern feature comprising mixed colors and pattern information.
  - 4. The image learning method according to claim 1, wherein the sub-space is a canonical covariance sub-space.
  - 5. The image learning method according to claim 1, wherein the first image, the image regions, and the annotations are represented by nodes in the graph, the graph is represented by an adjacency matrix, the links between the nodes are represented by the weights of the links in the graph, and a value of a corresponding weight represents 0 if there is no link between the nodes.
  - 6. The image learning method according to claim 5, wherein, in the first link, the larger a difference in image between the image region nodes corresponding to a sub-link is, the smaller weight value of the sub-link becomes, and/or in the third link, the larger the number of appearances of the annotation node corresponding to the sub-link is, the smaller the weight value of the sub-link becomes, and/or in the fourth link, the larger similarity in text between the annotation nodes corresponding to the sub-link is, the smaller the weight value of the sub-link becomes.
  - 7. An image automatic annotation method for making an annotation on an input second image, the image automatic annotation method comprising the learning method according to claim 1, a preliminary processing step, a graph update step, and an annotation step, wherein the preliminary processing step includes:
    - receiving the second image;
      
      performing the segmentation operation on the second image to segment the second image into one or more image regions;
      
      extracting image feature vectors from all the image regions to obtain an image feature matrix of the second image; and
      
      projecting the image feature matrix into the sub-space to obtain a projected image feature matrix of the second image,
      
      the graph update step includes:
      
      establishing fifth links between the image region nodes of the second image and the image region nodes in the graph based on the projected first image feature matrix and the second image feature matrix;
      
      establishing sixth links between the second image and the image region nodes based on a result of the segmentation operation;
      
      determining weights of the links of the fifth links and the sixth links; and
      
      updating the graph based on the fifth links and the sixth links and the weights of the links corresponding to the fifth links and the sixth links, and
      
      the annotation step includes:
      
      generating a restart vector corresponding to the second image;
      
      acquiring a predetermined number of annotations most closely related to the second image with a random walk with restart; and
      
      making the annotations on the second image using keywords corresponding to the predetermined number of annotations.
  - 8. The image automatic annotation method according to claim 7, wherein, in the segmentation operation, the first image and the second image are segmented into rectangular blocks, multi-resolution quad-tree sub-blocks, or non-overlapping homogeneous regions with an image segmentation algorithm.
  - 9. The image automatic annotation method according to claim 7, wherein the image feature vectors of all the image regions are extracted by an algorithm based on a local binary pattern feature comprising mixed colors and pattern information.
  - 10. The image automatic annotation method according to claim 7, wherein the sub-space is a canonical covariance sub-space.
  - 11. The image automatic annotation method according to claim 7, wherein the first image, the second image, the image regions, and the annotations are represented by nodes in the graph, the graph is represented by an adjacency matrix, the links between the nodes are represented by the weights of the links in the graph, and a value of a corresponding weight represents 0 if there is no link between the nodes.
  - 12. The image automatic annotation method according to claim 11, wherein, in the first link, the larger a difference in image between the image region nodes corresponding to a sub-link is, the smaller weight value of the sub-link becomes, and/or in the third link, the larger the number of appearances of the annotation nodes corresponding to the sub-link is, the smaller the weight value of the sub-link becomes, and/or in the fourth link, the larger similarity in text between the annotation nodes corresponding to the sub-link is, the smaller the weight value of the sub-links becomes.
  - 13. The image automatic annotation method according to claim 11, further comprising:
    - a step of applying normalization to the updated adjacency matrix before performing the annotation step based on the adjacency matrix subjected to the normalization.
  - 14. An image retrieval method for retrieving an image when a retrieval keyword is input, the image retrieval method comprising the learning method according to claim 1 and a retrieval step, wherein the retrieval step includes:
    - generating a restart vector corresponding to the retrieval keyword; and
      
      acquiring and outputting a predetermined number of images most closely related to the retrieval keyword with a random walk with restart.
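The graph and annotation steps in claims 5 through 7 and 13 can be sketched as follows. This is a toy illustration under stated assumptions: the node names, feature values, link weights, the exponential decay used for region-to-region weights, the column normalization, and the restart probability `c = 0.15` are all hypothetical choices, not values taken from the patent.

```python
import numpy as np

# Hypothetical nodes: two annotated images, one query image (Q),
# one region per image, and two annotation keywords.
names = ["I1", "I2", "Q", "r1", "r2", "q", "sky", "grass"]
idx = {name: i for i, name in enumerate(names)}
A = np.zeros((len(names), len(names)))  # adjacency matrix; 0 means "no link"

def link(a, b, w):
    """Add an undirected weighted link between nodes a and b."""
    A[idx[a], idx[b]] = A[idx[b], idx[a]] = w

# Made-up projected image features for the three regions.
feat = {"r1": np.array([0.0, 0.0]),
        "r2": np.array([3.0, 0.0]),
        "q": np.array([0.2, 0.1])}

def region_weight(a, b):
    """Assumed weighting: shrinks as the feature difference grows."""
    return float(np.exp(-np.linalg.norm(feat[a] - feat[b])))

link("r1", "r2", region_weight("r1", "r2"))  # first link (region to region)
link("I1", "r1", 1.0)                        # second links (image to region)
link("I2", "r2", 1.0)
link("I1", "sky", 1.0)                       # third links (image to annotation)
link("I2", "grass", 1.0)
link("sky", "grass", 0.1)                    # fourth link (annotation to annotation)
link("q", "r1", region_weight("q", "r1"))    # fifth links (query region to regions)
link("q", "r2", region_weight("q", "r2"))
link("Q", "q", 1.0)                          # sixth link (query image to its region)

def random_walk_with_restart(A, restart, c=0.15, tol=1e-10, max_iter=1000):
    """Iterate r = (1 - c) * W r + c * restart on the column-normalized graph."""
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0            # guard against isolated nodes
    W = A / col_sums                         # each column sums to 1
    r = restart.copy()
    for _ in range(max_iter):
        r_next = (1 - c) * (W @ r) + c * restart
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

restart = np.zeros(len(names))
restart[idx["Q"]] = 1.0                      # restart vector for the query image
scores = random_walk_with_restart(A, restart)

annotations = ["sky", "grass"]
ranked = sorted(annotations, key=lambda a: scores[idx[a]], reverse=True)
print(ranked[:1])                            # the single best-scoring keyword
```

Because the query region lies close to `r1` in feature space, most of the walk's probability mass flows through image `I1`, so the keyword attached to `I1` ranks first.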

15. An image learning device comprising:
- an image segmentation module that performs a segmentation operation on a first image having annotations to segment the first image into one or more image regions;
  
  a feature vector extraction module that extracts image feature vectors and text feature vectors from all the image regions to obtain an image feature matrix and a text feature matrix;
  
  a sub-space projection module that projects the image feature matrix and the text feature matrix into a sub-space so as to maximize covariance between an image feature and a text feature, thereby obtaining the projected image feature matrix and the text feature matrix;
  
  a storage device, and a storage module that stores the projected image feature matrix and the text feature matrix; and
  
  a processor, and a graph building module that establishes first links between the image regions based on the projected image feature matrix;
  
  establishes second links between the first image and the image regions based on a result of the segmentation operation;
  
  establishes third links between the first image and the annotations based on the first image having the annotations;
  
  establishes fourth links between the annotations based on the projected text feature matrix;
  
  calculates weights of all the links; and
  
  obtains a graph showing a triangular relationship between the first image, the image regions, and the annotations based on all the links and the weights of the links corresponding to the links.
- Dependent Claims (16, 17, 18)
  - 16. An image automatic annotation device for making an annotation on an input second image, the image automatic annotation device comprising the image learning device according to claim 15, a preliminary processing module, a graph update module, and an annotation module, wherein the preliminary processing module includes:
    - a unit that receives the second image;
      
      a unit that performs the segmentation operation on the second image to segment the second image into one or more image regions;
      
      a unit that extracts image feature vectors from all the image regions to obtain an image feature matrix of the second image; and
      
      a unit that projects the image feature matrix of the second image into the sub-space to obtain a projected image feature matrix of the second image,
      
      the graph update module includes:
      
      a unit that establishes fifth links between the image region nodes of the second image and the image region nodes in the graph based on the projected first image feature matrix and the second image feature matrix and establishes sixth links between the second image and the image region nodes based on a result of the segmentation operation;
      
      a unit that determines weights of the links of the fifth links and the sixth links; and
      
      a unit that updates the graph based on the fifth links and the sixth links and the weights of the links corresponding to the fifth links and the sixth links, and
      
      the annotation module includes:
      
      a unit that generates a restart vector corresponding to the second image and acquires a predetermined number of annotations most closely related to the second image with a random walk with restart; and
      
      a unit that makes the annotations on the second image using keywords corresponding to the predetermined number of annotations.
  - 17. An image retrieval device for retrieving an input second image, the image retrieval device comprising the learning device, the preliminary processing module, the graph update module according to claim 16, and a retrieval module, wherein the retrieval module includes:
    - a unit that generates a restart vector corresponding to the second image and acquires a predetermined number of images most closely related to the second image with a random walk with restart.
  - 18. An image retrieval device used for retrieving an image when a retrieval keyword is input, the image retrieval device comprising the image learning device according to claim 15 and a retrieval module, wherein the retrieval module includes:
    - a unit that generates a restart vector corresponding to the retrieval keyword and acquires a predetermined number of images most closely related to the retrieval keyword with a random walk with restart.

19. An image retrieval method based on an input second image, the image retrieval method comprising a learning step, a preliminary processing step, a graph update step, and a retrieval step, wherein the learning step includes:
- performing a segmentation operation on a first image having annotations to segment the first image into one or more image regions;
  
  extracting image feature vectors and text feature vectors from all the image regions to obtain an image feature matrix and a text feature matrix;
  
  projecting the image feature matrix and the text feature matrix into a sub-space so as to maximize covariance between an image feature and a text feature, thereby obtaining the projected image feature matrix and the text feature matrix;
  
  providing a storage device, and storing the projected image feature matrix and the text feature matrix, wherein the step of storing the projected image feature matrix and the text feature matrix occurs subsequent to the step of providing the storage device;
  
  establishing first links between the image regions based on the projected image feature matrix;
  
  establishing second links between the first image and the image regions based on a result of the segmentation operation;
  
  establishing third links between the first image and the annotations based on the first image having the annotations;
  
  establishing fourth links between the annotations based on the projected text feature matrix;
  
  calculating weights of all the links; and
  
  obtaining a graph showing a triangular relationship between the first image, the image regions, and the annotations based on all the links and the weights of the links corresponding to the links,
  
  the preliminary processing step includes:
  
  receiving the second image;
  
  performing the segmentation operation on the second image to segment the second image into one or more image regions;
  
  extracting image feature vectors from all the image regions to obtain an image feature matrix of the second image; and
  
  projecting the image feature matrix into the sub-space to obtain a projected image feature matrix of the second image,
  
  the graph update step includes:
  
  establishing fifth links between the image region nodes of the second image and the image region nodes in the graph based on the projected first image feature matrix and the second image feature matrix;
  
  establishing sixth links between the second image and the image region nodes based on a result of the segmentation operation;
  
  determining weights of the links of the fifth links and the sixth links; and
  
  updating the graph based on the fifth links and the sixth links and the weights of the links corresponding to the fifth links and the sixth links, and
  
  the retrieval step includes:
  
  generating a restart vector corresponding to the second image; and
  
  acquiring and outputting a predetermined number of annotations most closely related to the second image with a random walk with restart.
- Dependent Claims (20)
  - 20. The image retrieval method according to claim 19, wherein a keyword is further input, and in the retrieval step, the restart vector corresponding to the second image and the keyword is generated and a predetermined number of images most closely related to the second image and the keyword are acquired and output based on the updated graph.
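The combined query in claim 20 can be sketched by placing restart mass on both the query image node and the keyword node. The example below is a hypothetical toy: the graph, its weights, the equal 0.5/0.5 split of the restart vector, and the restart probability are assumptions, and the closed-form solve is used here only as a compact alternative to iterating the walk.

```python
import numpy as np

# Hypothetical updated graph: stored images I1 and I2, query image Q,
# their regions, and two annotation keywords.
names = ["I1", "I2", "Q", "r1", "r2", "q", "sky", "grass"]
idx = {name: i for i, name in enumerate(names)}
links = [
    ("I1", "r1", 1.0), ("I2", "r2", 1.0), ("Q", "q", 1.0),  # image to region
    ("I1", "sky", 1.0), ("I2", "grass", 1.0),               # image to annotation
    ("r1", "r2", 0.1), ("q", "r1", 0.9), ("q", "r2", 0.1),  # region to region
    ("sky", "grass", 0.1),                                   # annotation to annotation
]

A = np.zeros((len(names), len(names)))
for a, b, w in links:
    A[idx[a], idx[b]] = A[idx[b], idx[a]] = w

def rwr_closed_form(A, restart, c=0.15):
    """Solve r = (1 - c) * W r + c * restart directly.

    Assumes every node has at least one link, so each column of A
    can be normalized to sum to 1.
    """
    W = A / A.sum(axis=0)
    n = A.shape[0]
    return c * np.linalg.solve(np.eye(n) - (1 - c) * W, restart)

# Restart vector shared between the query image and the keyword
# (the equal split is an arbitrary illustrative choice).
restart = np.zeros(len(names))
restart[idx["Q"]] = 0.5
restart[idx["sky"]] = 0.5

scores = rwr_closed_form(A, restart)
stored_images = ["I1", "I2"]
ranked = sorted(stored_images, key=lambda i: scores[idx[i]], reverse=True)
print(ranked)
```

In this toy graph both the query region and the keyword point toward `I1`, so it is returned ahead of `I2`.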


Current Assignee: Ricoh Company Limited
Original Assignee: Ricoh Company Limited
Inventors: Bailloeul, Timothee; Zhu, Caizhi; Xu, Yinghui
Primary Examiner(s): WANG, JIN CHENG
Application Number: US12/468,423
Publication Number: US 20090289942A1
Time in Patent Office: 1,169 Days
Field of Search: 345/440-442, 345/419-427, 382/164, 707/728-737
US Class Current: 345/440
CPC Class Codes

G06F 16/58   Retrieval characterised by ...

G06F 16/5838   using colour

G06F 16/5846   using extracted text

G06F 16/587   using geographical or spati...

G06F 18/2135   based on approximation crit...

G06V 10/7715   Feature extraction, e.g. by...
