SCALABLE ATTRIBUTE-DRIVEN IMAGE RETRIEVAL AND RE-RANKING
First Claim
1. A method for building a database of searchable images comprising:
extracting visual features from regions of multiple images in a first set of labeled images and applying the labels to the extracted visual features, wherein the regions from which visual features are extracted have salient visual characteristics;
learning a transformation that uses the labels of the labeled visual features to transform the visual features into a discrimination vector, wherein the transformation is learned such that the discrimination vector discriminates between the labels;
extracting visual features from multiple images in a second set of images different from the labeled images, wherein the regions from which the visual features are extracted have salient visual characteristics;
applying the learned transformation to the visual features extracted from the second set of images so as to transform the visual features into respective discrimination vectors for each image in the second set of images; and
storing the labeled images in the first set of images and the images from the second set of images in a database in association with the respective discrimination vectors for each such image, wherein each such image is stored for retrieval by a search which at least in part uses the associated discrimination vectors.
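The steps of this claim can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: feature extraction from salient regions is assumed to have already happened, so each image arrives as a plain feature vector, and the learned transformation is a hypothetical stand-in (one component per label, equal to the negative squared distance to that label's centroid). The claim itself does not say which learner is used. The names `learn_transformation` and `build_database` are illustrative only.

```python
from collections import defaultdict


def learn_transformation(features, labels):
    """Learn a label-discriminating transform from labeled feature vectors.

    Assumption: a per-label centroid model stands in for the claimed
    learning step, which the claim leaves unspecified.
    """
    sums, counts = {}, defaultdict(int)
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] += 1
    order = sorted(sums)
    centroids = [[s / counts[label] for s in sums[label]] for label in order]

    def transform(vec):
        # One component per label: negative squared distance to its centroid,
        # so the largest component points at the most likely label.
        return [-sum((x - c) ** 2 for x, c in zip(vec, cen)) for cen in centroids]

    return order, transform


def build_database(labeled, unlabeled):
    """labeled: (image_id, features, label) triples; unlabeled: (image_id, features) pairs."""
    order, transform = learn_transformation(
        [feat for _, feat, _ in labeled], [label for _, _, label in labeled]
    )
    database = {}
    for image_id, feat, _ in labeled:
        database[image_id] = transform(feat)  # first (labeled) set
    for image_id, feat in unlabeled:
        database[image_id] = transform(feat)  # second (unlabeled) set
    return order, database
```

Both image sets end up in the same store, keyed by image and associated with a discrimination vector, which is what a later search would compare against.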
Abstract
Retrieval of images of objects from a large-scale database of object images, based on a query image. The database may, for example, contain images of objects such as faces, vehicles, people and luggage. Semantic attributes, such as doors or windows in the case of vehicles, are used as high-level semantic cues to determine identities of objects in the images. Salient visual characteristics of the images are labeled with attribute information, and a transformation is learned so as to transform the labeled visual characteristics into a discrimination vector that discriminates between the labels. A similarity metric is learned using the discrimination vectors, such that different images depicting the same object are determined to be close while those depicting different objects are determined to be far apart. Candidates are retrieved based on a query image, and a re-ranking step may be applied to improve results. Validation experiments are described.
49 Claims
1. A method for building a database of searchable images comprising:
extracting visual features from regions of multiple images in a first set of labeled images and applying the labels to the extracted visual features, wherein the regions from which visual features are extracted have salient visual characteristics;
learning a transformation that uses the labels of the labeled visual features to transform the visual features into a discrimination vector, wherein the transformation is learned such that the discrimination vector discriminates between the labels;
extracting visual features from multiple images in a second set of images different from the labeled images, wherein the regions from which the visual features are extracted have salient visual characteristics;
applying the learned transformation to the visual features extracted from the second set of images so as to transform the visual features into respective discrimination vectors for each image in the second set of images; and
storing the labeled images in the first set of images and the images from the second set of images in a database in association with the respective discrimination vectors for each such image, wherein each such image is stored for retrieval by a search which at least in part uses the associated discrimination vectors.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A method for retrieval of images from a searchable database based on similarity to a query image, comprising:
extracting visual features from regions of the query image, wherein the regions from which the visual features are extracted have salient visual characteristics;
applying a learned transformation to the extracted visual features so as to transform the extracted visual features into a discrimination vector, wherein the learned transformation is learned by using labels of a labeled database of visual features to learn a transformation of the visual features into discrimination vectors that discriminate between the labels;
generating an image similarity measure between the discrimination vector for the query image and a discrimination vector for multiple images in the searchable database, wherein the similarity measure is generated using a calculation learned from a database of multiple images labeled with identities of labelable objects represented in the multiple images, and wherein the calculation measures whether the objects represented in the images are the same objects or are different objects; and
obtaining a candidate list of images in the searchable database that are similar to the query image based at least in part on the similarity measure.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29
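The learned similarity calculation in claim 16 can be illustrated with a small sketch. It assumes, purely for illustration, a diagonal weighted squared distance whose per-dimension weights are estimated from identity-labeled pairs: a dimension gets a large weight when it differs a lot between different objects and little between images of the same object. The claim does not specify this form. The names `learn_diagonal_metric`, `distance`, `candidate_list` and the regularizer `reg` are hypothetical.

```python
from itertools import combinations


def learn_diagonal_metric(vectors, identities, reg=1.0):
    """Estimate per-dimension weights from identity-labeled discrimination vectors.

    Assumption: weight = mean squared difference over different-identity pairs
    divided by (mean squared difference over same-identity pairs + reg).
    """
    dim = len(vectors[0])
    same, diff = [0.0] * dim, [0.0] * dim
    n_same = n_diff = 0
    for (u, a), (v, b) in combinations(list(zip(vectors, identities)), 2):
        sq = [(x - y) ** 2 for x, y in zip(u, v)]
        if a == b:
            same = [s + q for s, q in zip(same, sq)]
            n_same += 1
        else:
            diff = [d + q for d, q in zip(diff, sq)]
            n_diff += 1
    return [
        (d / max(n_diff, 1)) / (s / max(n_same, 1) + reg)
        for d, s in zip(diff, same)
    ]


def distance(weights, u, v):
    """Weighted squared distance; smaller means more likely the same object."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, u, v))


def candidate_list(query_vec, database, weights, k=3):
    """Return the ids of the k database images closest to the query."""
    ranked = sorted(database, key=lambda image_id: distance(weights, query_vec, database[image_id]))
    return ranked[:k]
```

With training vectors in which only the first dimension separates identities, the learned weights favor that dimension, so a query can match an image that plain Euclidean distance would rank lower.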
30. A method for comparing objects in images, the method comprising:
obtaining a plurality of respective low-level features and a plurality of respective attribute scores for a plurality of reference object images;
generating a refined low-level feature transformation based at least in part on the plurality of respective low-level features from more than one region of the object and the plurality of respective attribute scores; and
generating an object-similarity measure of a first object image and a second object image based at least in part on low-level features of the first object image, on low-level features of the second object image, and on the refined low-level feature transformation.
Dependent claims: 31, 32
33. A method for retrieval of objects in images, the method comprising:
obtaining a plurality of respective low-level features and a plurality of respective attribute scores for a plurality of reference object images;
generating a refined low-level feature transformation based at least in part on the plurality of respective low-level features from more than one region of the object and the plurality of respective attribute scores;
generating an object-similarity measure of a first object image and a second object image based at least in part on low-level features of the first object image, low-level features of the second object image, and the refined low-level feature transformation;
retrieving a subset of images from a plurality of images wherein the subset of images are retrieved based on the respective object-similarity measures of a third object image and a plurality of fourth object images; and
ranking the subset of images based at least in part on the respective object-similarity measure of the third object image and one or more of the subset images and based at least in part on a low-level feature similarity of the third object image and the one or more of the subset images.
Dependent claims: 34, 35
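The final ranking step of claim 33 draws on two signals: the object-similarity measure and a low-level feature similarity. The claim does not say how they are combined. The sketch below assumes a simple weighted sum with a hypothetical mixing weight `alpha`; both similarity scores are taken as precomputed, higher meaning more similar.

```python
def rerank(subset, object_sim, low_level_sim, alpha=0.5):
    """Order the retrieved subset by a blend of two similarity scores.

    Assumption: a convex combination controlled by alpha; the claim only
    requires that both scores contribute "at least in part".
    """
    def score(image_id):
        return alpha * object_sim[image_id] + (1 - alpha) * low_level_sim[image_id]

    # Highest combined similarity first.
    return sorted(subset, key=score, reverse=True)
```

Setting `alpha` to 1.0 reduces the ranking to the object-similarity measure alone, which makes it easy to see what the low-level term contributes.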
36. A method for creating an attribute similarity metric, comprising:
receiving a plurality of identified object images wherein identities of the object images are such that at least some of the images of the same object are labeled with the same identifier;
receiving a plurality of attributes describing object images associated with an identifier;
extracting a plurality of respective low-level region features from a plurality of regions of the images;
learning an attribute subspace mapping based at least in part on the plurality of the respective low-level region features from the plurality of image regions and on the attribute labels for a plurality of attributes;
mapping the plurality of low-level region features to a respective plurality of subspace region features based at least in part on the attribute subspace mapping; and
creating an attribute similarity measure based at least in part on the plurality of subspace region features and on the identifier of the individual object in the image.
Dependent claims: 37, 38, 39, 40, 41
42. A method for retrieval of images from a large-scale database of images based on a query image, comprising:
accessing a low level feature transformation, a low dimensional projection into a semantic attribute subspace, and a distance metric;
applying the low level feature transformation to the query image so as to extract low level features representative of the query image;
obtaining a candidate set of images from the large-scale database of images based at least in part on similarity of the low level features for the query image to low level features of the images in the large-scale database of images;
applying the low dimensional projection to the query image so as to obtain a semantic attribute projection of the query image; and
ranking the candidate images based at least in part on similarity of the semantic attribute projection for the query image to semantic attribute projections of the images in the large-scale database of images so as to result in a ranked retrieval of images, wherein similarity of the semantic attribute projection is measured by the distance metric.
Dependent claims: 43, 44, 45, 46, 47, 48, 49
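Claim 42 describes a two-stage search: a coarse candidate set chosen by low-level feature similarity, then a ranking of those candidates in the semantic attribute subspace. A minimal sketch follows, with several assumptions: low-level features are already extracted, the projection is a plain matrix applied to those features, and squared Euclidean distance stands in for both the candidate similarity and the default distance metric. The names `project`, `sq_dist`, and `retrieve` are illustrative.

```python
def project(matrix, vec):
    """Apply a low-dimensional linear projection (list of rows) to a feature vector."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def sq_dist(u, v):
    """Squared Euclidean distance, used here as a stand-in metric."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def retrieve(query_feat, db_feats, projection, metric=sq_dist, n_candidates=3):
    """Two-stage retrieval over db_feats, a dict of image_id -> low-level features."""
    # Stage 1: candidate set from low-level feature similarity.
    candidates = sorted(db_feats, key=lambda i: sq_dist(query_feat, db_feats[i]))[:n_candidates]
    # Stage 2: rank candidates by distance in the semantic attribute subspace.
    query_attr = project(projection, query_feat)
    return sorted(candidates, key=lambda i: metric(query_attr, project(projection, db_feats[i])))
```

In a large database the attribute projections of stored images would presumably be computed once and stored, rather than recomputed per query as in this sketch.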
Specification