Automatic large scale video object recognition
Abstract
An object recognition system performs a number of rounds of dimensionality reduction and consistency learning on visual content items such as videos and still images, resulting in a set of feature vectors that accurately predict the presence of a visual object represented by a given object name within a visual content item. The feature vectors are stored in association with the object name which they represent and with an indication of the number of rounds of dimensionality reduction and consistency learning that produced them. The feature vectors and the indication can be used for various purposes, such as quickly determining a visual content item containing a visual representation of a given object name.
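The chained rounds described in the abstract can be illustrated with a minimal sketch. The abstract does not name a reduction technique, so the example swaps in a simple variance-based feature selection (keep the higher-variance half of the dimensions each round) purely as a stand-in. The function names `reduce_once` and `reduction_rounds`, the halving schedule, and the sample data are all assumptions for illustration, not taken from the patent.

```python
from statistics import pvariance


def reduce_once(vectors, keep):
    """Keep the `keep` highest-variance dimensions (a stand-in reducer)."""
    dims = len(vectors[0])
    variances = [pvariance([v[d] for v in vectors]) for d in range(dims)]
    top = sorted(range(dims), key=lambda d: variances[d], reverse=True)[:keep]
    top.sort()  # preserve the original ordering of the surviving dimensions
    return [[v[d] for d in top] for v in vectors]


def reduction_rounds(vectors, rounds):
    """Run `rounds` reductions, feeding each round's output into the next.

    Returns a dict mapping round number -> reduced feature vectors, so the
    round count that produced each set is kept alongside the set itself.
    """
    sets_by_round = {}
    current = vectors
    for round_number in range(1, rounds + 1):
        keep = max(1, len(current[0]) // 2)  # assumed schedule: halve each round
        current = reduce_once(current, keep)
        sets_by_round[round_number] = current
    return sets_by_round


# Hypothetical 4-dimensional feature vectors for one object name.
features = [
    [0.9, 0.1, 5.0, 0.2],
    [0.8, 0.1, 1.0, 0.9],
    [0.7, 0.1, 3.0, 0.1],
]
sets_by_round = reduction_rounds(features, rounds=2)
```

Keying the result by round number mirrors the abstract's point that the vectors are stored together with an indication of how many rounds produced them.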
43 Claims
1. A computer implemented method for generating a classification model of visual objects present in visual content items stored in a visual content repository, each visual content item having a textual description, the method comprising:
- for each of a plurality of object names, automatically selecting a plurality of visual content items from the visual content repository, extracting feature vectors from the visual content items, and performing a number of dimensionality reduction rounds on the feature vectors, each round producing reduced feature vectors as input for the next round, thereby producing multiple sets of reduced feature vectors for each object name;
- for each object name, performing consistency learning on the sets of reduced feature vectors, until one of the sets of reduced feature vectors for the object name has a minimum measure of similarity to the other feature vectors associated with the object name; and
- storing as the classification model for each object name, the set of reduced feature vectors which have the minimum measure of similarity.
Dependent claims: 2 to 14.
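The selection step in claim 1 can be sketched as follows. The claim defines neither the consistency learning procedure nor the similarity measure, so this example makes two loud assumptions: similarity is taken to be the mean pairwise cosine similarity within a set, and the "minimum measure" is treated as a threshold that a set must reach. The names `select_model`, `min_similarity`, and the object name "cat" are hypothetical.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean_pairwise_similarity(vectors):
    """Average cosine similarity over all unordered pairs in the set."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 1.0  # a single vector is trivially consistent with itself
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


def select_model(sets_by_round, min_similarity):
    """Return (round_number, vectors) for the first set meeting the threshold.

    Returns None when no set is consistent enough.
    """
    for round_number in sorted(sets_by_round):
        vectors = sets_by_round[round_number]
        if mean_pairwise_similarity(vectors) >= min_similarity:
            return round_number, vectors
    return None


# Hypothetical reduced sets for one object name, keyed by round number.
sets_by_round = {
    1: [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # vectors point in different directions
    2: [[1.0, 0.1], [1.0, 0.2], [1.0, 0.0]],  # vectors are nearly parallel
}

classification_models = {}
chosen = select_model(sets_by_round, min_similarity=0.95)
if chosen is not None:
    # Store the chosen set as the model, with the round count that produced it.
    classification_models["cat"] = {"rounds": chosen[0], "vectors": chosen[1]}
```

Here the round 1 set averages well below the threshold, while the round 2 set exceeds it, so round 2 is stored as the model for the object name.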
15. An object recognition system for generating a classification model for recognizing a visual object, the system comprising:
- an object name repository storing a plurality of object names;
- a visual content repository storing a plurality of visual content items;
- a recognition repository storing associations of object names with feature vectors and with a number of dimensionality reduction rounds;
- an analysis module adapted to:
  - for each of a plurality of object names from the object name repository, automatically select a plurality of visual content items from the visual content repository, extract feature vectors from the visual content items, and perform a number of dimensionality reduction rounds on the feature vectors, each round producing reduced feature vectors as input for the next round, thereby producing multiple sets of reduced feature vectors for each object name;
  - for each object name, perform consistency learning on the sets of reduced feature vectors, until one of the sets of reduced feature vectors for the object name has a minimum measure of similarity to the other feature vectors associated with the object name; and
  - store as the classification model for each object name, the set of reduced feature vectors which have the minimum measure of similarity.
Dependent claims: 16 to 28.
29. A non-transitory computer readable storage medium storing a computer program executable by a processor for generating a classification model of visual objects present in visual content items stored in a visual content repository, each visual content item having a textual description, the actions of the computer program comprising:
- for each of a plurality of object names, automatically selecting a plurality of visual content items from the visual content repository, extracting feature vectors from the visual content items, and performing a number of dimensionality reduction rounds on the feature vectors, each round producing reduced feature vectors as input for the next round, thereby producing multiple sets of reduced feature vectors for each object name;
- for each object name, performing consistency learning on the sets of reduced feature vectors, until one of the sets of reduced feature vectors for the object name has a minimum measure of similarity to the other feature vectors associated with the object name; and
- storing as the classification model for each object name, the set of reduced feature vectors which have the minimum measure of similarity.
Dependent claims: 30 to 42.
43. A computer implemented method of identifying visual content items relevant to a query, the method comprising:
- storing a recognition repository having:
  - a plurality of object names, and
  - a plurality of associations between an object name, a visual content item, and a probability that the visual content item contains a visual representation corresponding to the object name;
- receiving a query comprising an object name; and
- identifying a plurality of visual content items having the highest probabilities of containing a visual representation of an object corresponding to the object name, based at least in part on the probabilities of the recognition repository.
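The query method of claim 43 can be illustrated with a small in-memory sketch. The claim does not specify a storage layout or how many items to return, so the `RecognitionRepository` class, its `associate` and `query` methods, the parameter `k`, and the item identifiers below are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict


class RecognitionRepository:
    """Maps object name -> {visual content item id: probability}."""

    def __init__(self):
        self._by_name = defaultdict(dict)

    def associate(self, object_name, item_id, probability):
        """Record the probability that `item_id` depicts `object_name`."""
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        self._by_name[object_name][item_id] = probability

    def query(self, object_name, k=2):
        """Return up to `k` (item_id, probability) pairs, highest first."""
        scores = self._by_name.get(object_name, {})
        return heapq.nlargest(k, scores.items(), key=lambda pair: pair[1])


repo = RecognitionRepository()
repo.associate("dog", "video_a", 0.35)
repo.associate("dog", "video_b", 0.92)
repo.associate("dog", "image_c", 0.71)
repo.associate("cat", "video_a", 0.80)

top_dog_items = repo.query("dog", k=2)
```

Because the probabilities are computed ahead of time and stored per object name, answering a query reduces to a lookup followed by a top-k selection, with no image analysis at query time.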
Specification