Interdependent learning of template map and similarity metric for object identification
Abstract
An object identification system iteratively learns both a template map used to transform a template describing an object in an image, and a related similarity metric used in comparing one transformed object template to another. This automatic learning eliminates the need to manually devise a transformation and metric that are effective for a given image corpus. The template map and the similarity metric are learned together, such that the incremental component to be added to the template map at a given iteration of the learning process is based at least in part on the components of the similarity metric, and vice versa.
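The abstract does not say what concrete form the template map or the similarity metric takes. The following is a minimal sketch of how a learned map and metric might be applied to two templates, assuming the structure recited in claim 21: a linear projection followed by a quantizer, and one q-by-q lookup matrix per reduced feature. The function names, the bin boundaries, and the toy values are hypothetical and are not taken from the patent.

```python
from bisect import bisect_right


def apply_template_map(raw_template, map_rows, boundaries):
    """Reduce a raw template: project onto each map row, then quantize.

    Assumption: each map component is one row vector, and the quantizer
    returns the index of the range (bin) that the projection falls into.
    """
    reduced = []
    for row in map_rows:
        projection = sum(w * x for w, x in zip(row, raw_template))
        reduced.append(bisect_right(boundaries, projection))
    return reduced


def similarity_score(reduced_a, reduced_b, metric_tables):
    """Sum one lookup-table entry per reduced feature.

    Assumption: each metric component is a q-by-q table indexed by the
    two quantized values at the same position.
    """
    return sum(t[a][b] for t, a, b in zip(metric_tables, reduced_a, reduced_b))


# Hypothetical toy setup: 4 raw features, 2 map rows, q = 3 ranges.
MAP_ROWS = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0]]
BOUNDARIES = [-0.5, 0.5]
AGREE = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]  # reward equal bins
METRIC_TABLES = [AGREE, AGREE]

raw_a = [0.2, -1.0, 0.7, 0.1]
raw_b = [0.9, 0.4, 0.3, 0.0]
reduced_a = apply_template_map(raw_a, MAP_ROWS, BOUNDARIES)
reduced_b = apply_template_map(raw_b, MAP_ROWS, BOUNDARIES)

print(reduced_a, reduced_b)
print(similarity_score(reduced_a, reduced_a, METRIC_TABLES))
print(similarity_score(reduced_a, reduced_b, METRIC_TABLES))
```

In this toy setup each reduced template holds two small integers in place of four real values, which illustrates the sense in which a reduced template can occupy less memory than the raw one.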
21 Claims
1. A computer-implemented method of generating a template map and a similarity metric used to determine a degree of visual similarity of two digital objects, comprising:
storing a set of raw object templates, each raw object template representing image features derived from an object within a digital image of a corpus;
iteratively performing an incremental learning process comprising:
at each iteration, adding a map component to a template map that transforms a raw object template to a reduced object template, the reduced object template being stored in less memory than the raw object template;
at each iteration, adding a metric component to a similarity metric that accepts as input two reduced object templates produced by the template map from raw object templates and produces as output a similarity score representing visual similarity of the objects represented by the two reduced object templates;
wherein the metric component is added to the similarity metric based at least in part on values of the template map, and the map component is added to the template map based at least in part on values of the similarity metric; and
storing the template map and the similarity metric.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. An object identification system for learning how to identify objects in a corpus of digital images, comprising:
a data repository comprising a set of raw object templates, each raw object template representing image features derived from an object within a digital image of the corpus; and
a learning module configured to:
iteratively perform an incremental learning process comprising:
at each iteration, adding a map component to a template map that transforms a raw object template to a reduced object template, the reduced object template being stored in less memory than the raw object template;
at each iteration, adding a metric component to a similarity metric that accepts as input two reduced object templates produced by the template map from raw object templates and produces as output a similarity score representing visual similarity of the objects represented by the two reduced object templates;
wherein the metric component is added to the similarity metric based at least in part on values of the template map, and the map component is added to the template map based at least in part on values of the similarity metric; and
store the template map and the similarity metric in the data repository.
Dependent claims: 14, 15, 16
17. A non-transitory computer readable storage medium storing a computer program executable by a processor for learning how to identify objects in a corpus of digital images, the action of the computer program comprising:
storing a set of raw object templates, each raw object template representing image features derived from an object within a digital image of the corpus;
iteratively performing an incremental learning process comprising:
at each iteration, adding a map component to a template map that transforms a raw object template to a reduced object template, the reduced object template being stored in less memory than the raw object template;
at each iteration, adding a metric component to a similarity metric that accepts as input two reduced object templates produced by the template map from raw object templates and produces as output a similarity score representing visual similarity of the objects represented by the two reduced object templates;
wherein the metric component is added to the similarity metric based at least in part on values of the template map, and the map component is added to the template map based at least in part on values of the similarity metric; and
storing the template map and the similarity metric.
Dependent claims: 18, 19
20. A computer-implemented method of generating a template map and a similarity metric for determining a measure of visual similarity of two images, the method comprising:
initializing a template map for producing, from a plurality of real value features derived from the image, a reduced plurality of features that includes a plurality of integer value features representing the image;
initializing a similarity metric that compares the reduced plurality of features of a first image to the reduced plurality of features of a second image;
updating the template map and the similarity metric by iteratively performing the steps of:
selecting a map component from a group of candidate map components;
selecting a metric component from a group of candidate metric components;
comparing a first image and a second image using the template map, the selected map component, the similarity metric, and the selected metric component;
generating a measure of recognition accuracy for the comparison;
responsive to the measure of recognition accuracy having at least a threshold value, selectively adding the map component to the template map and selectively adding the selected metric component to the similarity metric; and
storing the generated template map and the generated similarity metric.
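The accept-if-accurate loop of claim 20 can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: a map component is taken to be a hypothetical (feature index, threshold) pair that yields a 0/1 integer feature, a metric component is a weight applied when the two integer features agree, candidates are tried in a fixed order, and recognition accuracy is the fraction of labeled pairs classified correctly. The claim itself does not specify any of these choices.

```python
from itertools import product


def reduce_features(features, template_map):
    """Turn real-valued features into integer (0/1) features."""
    return [int(features[i] > t) for i, t in template_map]


def compare(features_a, features_b, template_map, metric):
    """Similarity score: add the weight on agreement, subtract it otherwise."""
    red_a = reduce_features(features_a, template_map)
    red_b = reduce_features(features_b, template_map)
    return sum(w if a == b else -w for w, a, b in zip(metric, red_a, red_b))


def recognition_accuracy(pairs, template_map, metric):
    """Fraction of pairs whose score sign matches the same/different label."""
    correct = 0
    for features_a, features_b, same in pairs:
        predicted_same = compare(features_a, features_b, template_map, metric) > 0
        correct += int(predicted_same == same)
    return correct / len(pairs)


def learn(pairs, candidate_map, candidate_metric, threshold):
    template_map, metric = [], []  # both start empty
    for map_c, metric_c in product(candidate_map, candidate_metric):
        # Evaluate the current map and metric extended by the selected components.
        accuracy = recognition_accuracy(
            pairs, template_map + [map_c], metric + [metric_c])
        if accuracy >= threshold:  # keep the components only if accurate enough
            template_map.append(map_c)
            metric.append(metric_c)
    return template_map, metric


# Hypothetical labeled pairs: (features of image 1, features of image 2, same object?)
PAIRS = [
    ([0.9, 0.1], [0.8, 0.5], True),
    ([0.1, 0.3], [0.2, 0.9], True),
    ([0.9, 0.2], [0.1, 0.2], False),
    ([0.2, 0.7], [0.8, 0.7], False),
]
template_map, metric = learn(
    PAIRS, candidate_map=[(1, 0.4), (0, 0.5)], candidate_metric=[1.0], threshold=0.9)
print(template_map, metric)
```

On this toy data the candidate built on feature 1 is rejected because it misclassifies every pair, while the candidate built on feature 0 is accepted. Note that the candidate map component is always evaluated together with a candidate metric component, which is the interdependence the claims describe.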
21. A computer-implemented method of learning to compute degrees of similarity between human faces in digital images, comprising:
storing a plurality of training images containing a human face;
from each training image, associating with the training image a raw face template representing the face in the training image, thereby producing training face templates, the raw face template being a vector having n elements, for some integer n;
initializing the following to empty:
a linear map matrix adapted to store row vectors having n elements,
a quantizer function adapted to map a scalar value to an index integer of one of a set of ranges, and
a similarity matrix set adapted to store a plurality of matrices each having q rows and q columns, for some integer q;
iteratively performing the following operations:
specifying a plurality of candidate vectors having n elements, and a plurality of candidate matrices having q rows and q columns;
for a candidate vector of the selected plurality of candidate vectors and a candidate matrix of the selected plurality of candidate matrices:
computing a loss value quantifying face identification inaccuracy based at least in part on the candidate vector, the candidate matrix, the training face templates, the linear map matrix, the quantizer function, and the similarity matrix set;
determining that the candidate vector and the candidate matrix produce a lowest loss value with respect to others of the candidate vectors and candidate matrices;
appending the candidate vector to the linear map matrix;
appending the candidate matrix to the similarity matrix set;
terminating the iterative performing of the operations responsive to one or more of:
having performed a number of iterations exceeding a threshold number of iterations, and
the computed loss value not being a threshold amount lower than a loss value from a directly preceding iteration; and
storing the linear map matrix, the quantizer function, and the similarity matrix set.
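The greedy selection in claim 21 can be sketched as follows. This is an illustration under several assumptions, not the patented implementation: the quantizer is fixed in advance with a single hypothetical boundary (so q = 2), the loss is the fraction of labeled template pairs whose similarity has the wrong sign, and the improvement check runs before the winning pair is appended, so a pair that does not help is not stored. The claim leaves the loss function and the construction of the quantizer open.

```python
from bisect import bisect_right
from itertools import product

BOUNDARIES = [0.0]  # assumed fixed quantizer: two ranges, so q = 2


def quantize(value):
    """Map a scalar to the index of the range it falls into."""
    return bisect_right(BOUNDARIES, value)


def reduce_template(raw, linear_map):
    """Project the raw template onto each row vector, then quantize."""
    return [quantize(sum(w * x for w, x in zip(row, raw))) for row in linear_map]


def similarity(raw_a, raw_b, linear_map, matrices):
    """Sum one q-by-q matrix entry per row of the linear map."""
    red_a = reduce_template(raw_a, linear_map)
    red_b = reduce_template(raw_b, linear_map)
    return sum(m[i][j] for m, i, j in zip(matrices, red_a, red_b))


def loss(pairs, linear_map, matrices):
    """Assumed loss: share of pairs where label * similarity is not positive."""
    errors = sum(
        1 for a, b, label in pairs
        if label * similarity(a, b, linear_map, matrices) <= 0)
    return errors / len(pairs)


def learn(pairs, candidate_vectors, candidate_matrices, max_iterations, min_improvement):
    linear_map, matrices = [], []  # initialized to empty
    previous = loss(pairs, linear_map, matrices)
    for _ in range(max_iterations):  # first termination criterion
        best_loss, best_pair = None, None
        for vector, matrix in product(candidate_vectors, candidate_matrices):
            value = loss(pairs, linear_map + [vector], matrices + [matrix])
            if best_loss is None or value < best_loss:
                best_loss, best_pair = value, (vector, matrix)
        if previous - best_loss < min_improvement:  # second termination criterion
            break
        linear_map.append(best_pair[0])
        matrices.append(best_pair[1])
        previous = best_loss
    return linear_map, matrices, previous


AGREE = [[1.0, -1.0], [-1.0, 1.0]]
DISAGREE = [[-1.0, 1.0], [1.0, -1.0]]
# Hypothetical training pairs: label +1 for the same face, -1 for different faces.
PAIRS = [
    ([1.0, 0.3], [0.8, 0.6], 1),
    ([-0.9, 0.5], [-1.2, -0.4], 1),
    ([1.0, 0.3], [-0.9, 0.5], -1),
    ([0.8, 0.6], [-1.2, -0.4], -1),
]
linear_map, matrices, final_loss = learn(
    PAIRS, [[1.0, 0.0], [0.0, 1.0]], [AGREE, DISAGREE],
    max_iterations=5, min_improvement=0.01)
print(linear_map, matrices, final_loss)
```

The loss for every candidate pair is computed against the map and matrix set accumulated so far, so each new row vector is chosen in light of the current similarity matrices and each new matrix in light of the current map. On this toy data the loop adds one vector and one matrix, then stops because a second iteration cannot lower the loss further.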
Specification