Image learning, automatic annotation, retrieval method, and device
Abstract
A first image having annotations is segmented into one or more image regions. Image feature vectors and text feature vectors are extracted from all the image regions to obtain an image feature matrix and a text feature matrix. The image feature matrix and the text feature matrix are projected into a sub-space to obtain the projected image feature matrix and the text feature matrix. The projected image feature matrix and the text feature matrix are stored. First links between the image regions, second links between the first image and the image regions, third links between the first image and the annotations, and fourth links between the annotations are established. Weights of all the links are calculated. A graph showing a triangular relationship between the first image, image regions, and annotations is obtained based on all the links and the weights of the links.
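The projection step can be illustrated with a minimal sketch. It assumes that "maximizing covariance" is realized by taking the leading singular vector pair of the cross-covariance matrix between image and text features, as in partial least squares; the text above does not name a specific algorithm, so this choice, the power-iteration solver, the one-dimensional sub-space, the toy data, and all function names are illustrative assumptions.

```python
import math

def center(rows):
    """Subtract each column's mean so covariances reduce to dot products."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def cross_covariance(X, Y):
    """C = X^T Y / (n - 1) for centered X (n x dx) and centered Y (n x dy)."""
    n, dx, dy = len(X), len(X[0]), len(Y[0])
    return [[sum(X[i][a] * Y[i][b] for i in range(n)) / (n - 1)
             for b in range(dy)] for a in range(dx)]

def unit(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def top_covariance_pair(C, iters=200):
    """Power iteration for the leading singular vectors (u, v) of C.

    Among unit vectors, this pair maximizes u^T C v, i.e. the covariance
    between the projected image feature and the projected text feature.
    """
    dx, dy = len(C), len(C[0])
    v = unit([1.0] * dy)
    for _ in range(iters):
        u = unit([sum(C[a][b] * v[b] for b in range(dy)) for a in range(dx)])
        v = unit([sum(C[a][b] * u[a] for a in range(dx)) for b in range(dy)])
    return u, v

def project(rows, direction):
    """Project every row onto a single direction of the sub-space."""
    return [sum(x * w for x, w in zip(row, direction)) for row in rows]

# Hypothetical toy data: one row per image region.
X = center([[1.0, 0.0], [2.0, 1.0], [3.0, 1.0], [4.0, 3.0]])  # image features
Y = center([[1.0, 0.0], [1.0, 1.0], [2.0, 1.0], [3.0, 2.0]])  # text features
C = cross_covariance(X, Y)
u, v = top_covariance_pair(C)
image_scores = project(X, u)
text_scores = project(Y, v)
covariance = sum(a * b for a, b in zip(image_scores, text_scores)) / (len(X) - 1)
```

A sub-space with more than one dimension would need further singular vector pairs; the sketch keeps only the first to stay short.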
20 Claims
1. An image learning method comprising:
performing a segmentation operation on a first image having annotations to segment the first image into one or more image regions;
extracting image feature vectors and text feature vectors from all the image regions to obtain an image feature matrix and a text feature matrix;
projecting the image feature matrix and the text feature matrix into a sub-space so as to maximize covariance between an image feature and a text feature, thereby obtaining the projected image feature matrix and the text feature matrix;
storing the projected image feature matrix and the text feature matrix;
establishing first links between the image regions based on the projected image feature matrix;
establishing second links between the first image and the image regions based on a result of the segmentation operation;
establishing third links between the first image and the annotations based on the first image having the annotations;
establishing fourth links between the annotations based on the projected text feature matrix;
calculating weights of all the links;
obtaining a graph showing a triangular relationship between the first image, the image regions, and the annotations based on all the links and the weights of the links corresponding to the links; and
providing a storage device and a processor, and wherein the step of storing the projected image feature and text feature matrices includes the step of storing information in the storage device, and wherein the step of calculating weights includes the step of using the processor.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
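The four kinds of links recited in claim 1 can be sketched as a weighted undirected graph over image, region, and annotation nodes. The claim does not say how the weights are calculated, so the Gaussian similarity, the nearest-neighbour rule for first and fourth links, the unit weight for second and third links, and every name and value below are assumptions made for illustration.

```python
import math
from collections import defaultdict

def similarity(a, b, sigma=1.0):
    """Assumed weight function: Gaussian kernel on projected feature vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def add_link(graph, a, b, weight):
    """Store an undirected weighted link between nodes a and b."""
    graph[a][b] = weight
    graph[b][a] = weight

def link_neighbours(graph, kind, vectors, k):
    """Link each node to its k most similar nodes of the same kind."""
    for name, vec in vectors.items():
        ranked = sorted(
            ((similarity(vec, other), other_name)
             for other_name, other in vectors.items() if other_name != name),
            reverse=True,
        )
        for weight, other_name in ranked[:k]:
            add_link(graph, (kind, name), (kind, other_name), weight)

def build_graph(images, region_vecs, annotation_vecs, k=1):
    """Build the image / region / annotation graph.

    images: {image_id: {"regions": [...], "annotations": [...]}}
    region_vecs: {region_id: projected image feature vector}
    annotation_vecs: {annotation: projected text feature vector}
    """
    graph = defaultdict(dict)
    # First links: region to region, from the projected image features.
    link_neighbours(graph, "region", region_vecs, k)
    for image_id, info in images.items():
        # Second links: image to its regions, from the segmentation result.
        for region in info["regions"]:
            add_link(graph, ("image", image_id), ("region", region), 1.0)
        # Third links: image to its annotations.
        for word in info["annotations"]:
            add_link(graph, ("image", image_id), ("annotation", word), 1.0)
    # Fourth links: annotation to annotation, from the projected text features.
    link_neighbours(graph, "annotation", annotation_vecs, k)
    return graph

# Hypothetical toy input.
images = {
    "img1": {"regions": ["r1", "r2"], "annotations": ["tiger", "grass"]},
    "img2": {"regions": ["r3"], "annotations": ["car"]},
}
region_vecs = {"r1": [0.9, 0.1], "r2": [0.8, 0.3], "r3": [-1.0, 0.5]}
annotation_vecs = {"tiger": [1.0, 0.0], "grass": [0.7, 0.4], "car": [-0.9, 0.2]}
graph = build_graph(images, region_vecs, annotation_vecs)
```

Nodes are tagged tuples such as `("region", "r1")` so that an annotation and a region with the same name cannot collide.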
15. An image learning device comprising:
an image segmentation module that performs a segmentation operation on a first image having annotations to segment the first image into one or more image regions;
a feature vector extraction module that extracts image feature vectors and text feature vectors from all the image regions to obtain an image feature matrix and a text feature matrix;
a sub-space projection module that projects the image feature matrix and the text feature matrix into a sub-space so as to maximize covariance between an image feature and a text feature, thereby obtaining the projected image feature matrix and the text feature matrix;
a storage device, and a storage module that stores the projected image feature matrix and the text feature matrix; and
a processor, and a graph building module that establishes first links between the image regions based on the projected image feature matrix;
establishes second links between the first image and the image regions based on a result of the segmentation operation;
establishes third links between the first image and the annotations based on the first image having the annotations;
establishes fourth links between the annotations based on the projected text feature matrix;
calculates weights of all the links; and
obtains a graph showing a triangular relationship between the first image, the image regions, and the annotations based on all the links and the weights of the links corresponding to the links.
Dependent claims: 16, 17, 18
19. An image retrieval method based on an input second image, the image retrieval method comprising a learning step, a preliminary processing step, a graph update step, and a retrieval step,
wherein the learning step includes:
performing a segmentation operation on a first image having annotations to segment the first image into one or more image regions;
extracting image feature vectors and text feature vectors from all the image regions to obtain an image feature matrix and a text feature matrix;
projecting the image feature matrix and the text feature matrix into a sub-space so as to maximize covariance between an image feature and a text feature, thereby obtaining the projected image feature matrix and the text feature matrix;
providing a storage device, and storing the projected image feature matrix and the text feature matrix, wherein the step of storing the projected image feature matrix and the text feature matrix occurs subsequent to the step of providing the storage device;
establishing first links between the image regions based on the projected image feature matrix;
establishing second links between the first image and the image regions based on a result of the segmentation operation;
establishing third links between the first image and the annotations based on the first image having the annotations;
establishing fourth links between the annotations based on the projected text feature matrix;
calculating weights of all the links; and
obtaining a graph showing a triangular relationship between the first image, the image regions, and the annotations based on all the links and the weights of the links corresponding to the links,
the preliminary processing step includes:
receiving the second image;
performing the segmentation operation on the second image to segment the second image into one or more image regions;
extracting image feature vectors from all the image regions to obtain an image feature matrix of the second image; and
projecting the image feature matrix into the sub-space to obtain a projected image feature matrix of the second image,
the graph update step includes:
establishing fifth links between the image region nodes of the second image and the image region nodes in the graph based on the projected first image feature matrix and the second image feature matrix;
establishing sixth links between the second image and the image region nodes based on a result of the segmentation operation;
determining weights of the links of the fifth links and the sixth links; and
updating the graph based on the fifth links and the sixth links and the weights of the links corresponding to the fifth links and the sixth links, and
the retrieval step includes:
generating a restart vector corresponding to the second image; and
acquiring and outputting a predetermined number of annotations most closely related to the second image with a random walk with restart.
Dependent claims: 20
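The retrieval step of claim 19 can be sketched as a random walk with restart over the updated graph. The claim does not specify the form of the restart vector, the restart probability, or the normalization, so the sketch assumes a restart vector with all its mass on the second-image node, a restart probability of 0.15, and transition probabilities proportional to link weights. The graph, node names, and weights are hypothetical.

```python
def undirected(edges):
    """Build a symmetric adjacency dict from (node, node, weight) triples."""
    graph = {}
    for a, b, w in edges:
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph

def random_walk_with_restart(graph, restart, c=0.15, iters=200):
    """Iterate p <- (1 - c) * W p + c * r, where W holds the link weights
    normalized per source node and r is the restart vector."""
    out_sum = {u: sum(nbrs.values()) for u, nbrs in graph.items()}
    p = {u: restart.get(u, 0.0) for u in graph}
    for _ in range(iters):
        nxt = {u: c * restart.get(u, 0.0) for u in graph}
        for u, nbrs in graph.items():
            if out_sum[u] == 0.0:
                continue  # isolated node: nothing to pass on
            share = (1.0 - c) * p[u] / out_sum[u]
            for v, w in nbrs.items():
                nxt[v] += share * w
        p = nxt
    return p

def top_annotations(scores, count):
    """Return the `count` annotation names with the highest scores."""
    ranked = sorted((n for n in scores if n[0] == "annotation"),
                    key=lambda n: scores[n], reverse=True)
    return [name for _, name in ranked[:count]]

# Hypothetical updated graph: query image "q" has one region that is
# strongly similar to region r1 (0.9) and weakly similar to r2 (0.2).
graph = undirected([
    (("image", "q"), ("region", "q_r1"), 1.0),       # sixth link
    (("region", "q_r1"), ("region", "r1"), 0.9),     # fifth link
    (("region", "q_r1"), ("region", "r2"), 0.2),     # fifth link
    (("image", "img1"), ("region", "r1"), 1.0),
    (("image", "img2"), ("region", "r2"), 1.0),
    (("image", "img1"), ("annotation", "tiger"), 1.0),
    (("image", "img2"), ("annotation", "car"), 1.0),
])
restart = {("image", "q"): 1.0}  # assumed: all restart mass on the query image
scores = random_walk_with_restart(graph, restart)
best = top_annotations(scores, 2)
```

In this toy graph the annotation reached through the more similar region ranks first, which is the behaviour the retrieval step relies on.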
Specification