Statistical approach to large-scale image annotation
Abstract
Statistical approaches to large-scale image annotation are described. Generally, the annotation technique includes compiling visual features and textual information from a number of images, hashing the images' visual features, and clustering the images based on their hash values. An example system builds statistical language models from the clustered images and annotates the image by applying one of the statistical language models.
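The hashing and clustering steps summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation: the mean-threshold hash, the function names `binary_hash` and `cluster_by_hash`, and the toy feature vectors are all assumptions chosen for illustration.

```python
from collections import defaultdict


def binary_hash(features):
    """Map a feature vector to a bit string: '1' where a value exceeds the vector mean.

    Assumption: a simple mean threshold stands in for whatever hash the system uses.
    """
    mean = sum(features) / len(features)
    return "".join("1" if x > mean else "0" for x in features)


def cluster_by_hash(image_features):
    """Group image ids whose feature vectors hash to the same bit string."""
    clusters = defaultdict(list)
    for image_id, features in image_features.items():
        clusters[binary_hash(features)].append(image_id)
    return dict(clusters)


# Hypothetical toy data: image id -> small visual feature vector.
image_features = {
    "img_a": [0.9, 0.1, 0.8, 0.2],
    "img_b": [0.7, 0.2, 0.9, 0.1],
    "img_c": [0.1, 0.9, 0.2, 0.8],
}
clusters = cluster_by_hash(image_features)
print(clusters)
```

In this toy data, img_a and img_b share a hash code and land in one cluster, while img_c forms its own. Grouping by exact hash equality keeps clustering linear in the number of images, which is one plausible reason to hash before clustering at large scale.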
16 Claims
1. A method of annotating an image comprising:
extracting and indexing both visual features and textual information from a plurality of images and a two step probabilistic modeling technique comprising identifying words from the textual information as candidate annotations and annotating the image with the candidate annotations that have the highest average conditional probabilities;
hashing the plurality of visual features;
clustering the plurality of images based at least in part on hash values derived from the hashing, the clustering creating clustered images;
building one or more statistical language models based at least in part on the visual features and the textual information of the clustered images; and
annotating the image using one or more of the statistical language models.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
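One possible reading of the two-step technique in claim 1 is sketched below in Python: first collect every word in a cluster's text as a candidate, then keep the candidates with the highest average conditional probability. The co-occurrence estimate of P(w | v), the function name `rank_candidates`, and the toy word lists are assumptions for illustration; the claim does not specify how the probabilities are estimated.

```python
from collections import Counter
from itertools import permutations


def rank_candidates(cluster_texts, top_k=2):
    """Rank candidate words by average conditional probability within one cluster.

    cluster_texts: list of word lists, one per clustered image (assumed input).
    P(w | v) is estimated as (#images containing both w and v) / (#images containing v).
    """
    docs = [set(words) for words in cluster_texts]
    # Step 1: every word seen in the cluster's text is a candidate annotation.
    word_count = Counter(w for doc in docs for w in doc)
    pair_count = Counter(p for doc in docs for p in permutations(sorted(doc), 2))
    candidates = sorted(word_count)
    # Step 2: score each candidate by its mean P(w | v) over all other candidates v.
    scores = {}
    for w in candidates:
        others = [v for v in candidates if v != w]
        if not others:
            scores[w] = 1.0
            continue
        scores[w] = sum(pair_count[(w, v)] / word_count[v] for v in others) / len(others)
    ranked = sorted(candidates, key=lambda w: (-scores[w], w))
    return ranked[:top_k], scores


# Hypothetical surrounding text for three images in one cluster.
cluster_texts = [
    ["sunset", "beach", "sea"],
    ["sunset", "sea"],
    ["sunset", "dog"],
]
top_words, scores = rank_candidates(cluster_texts)
print(top_words)
```

Words that co-occur with most other words in the cluster score highest, so an incidental word such as "dog" is pushed down the ranking.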
10. A computer readable storage medium comprising computer executable instructions that when executed by a processor perform a method comprising:
crawling a large-scale image database to gather a plurality of images;
compiling visual features and textual information from the plurality of images;
extracting visual information from the plurality of images by using a gray block methodology;
reducing the visual information by employing a projection matrix;
hashing the reduced visual information, and clustering the plurality of images based on a hash value;
building one or more statistical language models based on the clustered images; and
annotating a query image using one or more of the statistical language models comprising a bigram model that calculates an average conditional probability that a second word is associated with the clustered images given a first word already associated with the clustered images.
Dependent claims: 11, 12, 13
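The gray block extraction and projection steps recited in claim 10 might look roughly like the following Python sketch. It assumes that "gray block" means the average intensity of each cell in a regular grid over a grayscale image. The grid size, the function names, and the hand-written projection matrix are illustrative assumptions; a real system would presumably learn the matrix from data.

```python
def gray_block_features(gray, blocks=2):
    """Average intensity of each cell in a blocks x blocks grid over a grayscale image.

    gray: list of rows of pixel intensities; height and width are assumed
    to be divisible by `blocks`.
    """
    height, width = len(gray), len(gray[0])
    bh, bw = height // blocks, width // blocks
    feats = []
    for by in range(blocks):
        for bx in range(blocks):
            cell = [gray[y][x]
                    for y in range(by * bh, (by + 1) * bh)
                    for x in range(bx * bw, (bx + 1) * bw)]
            feats.append(sum(cell) / len(cell))
    return feats


def project(features, matrix):
    """Reduce dimensionality by multiplying a k x n projection matrix with an n-vector."""
    return [sum(m * f for m, f in zip(row, features)) for row in matrix]


# Hypothetical 4 x 4 grayscale image.
gray = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [50, 50, 100, 100],
    [50, 50, 100, 100],
]
features = gray_block_features(gray)

# Hypothetical 2 x 4 projection matrix (arbitrary values, not learned).
projection = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]
reduced = project(features, projection)
print(features, reduced)
```

Here the four block averages are reduced to two values. The shorter vector is what would then be hashed.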
14. A computer readable storage medium comprising:
a digital image;
a textual annotation corresponding to the digital image; and
executable instructions that when executed by a processor, associate the textual annotation with the digital image by;
compiling visual features and textual information from a plurality of images;
extracting visual information from the plurality of images by using a gray block methodology;
hashing the plurality of visual features, wherein hashing the reduced visual information comprises a vector quantization process in which the visual features are transformed into a binary string;
clustering the plurality of images based on the hash value;
building one or more statistical language models based on the clustered images; and
annotating the image using one or more of the statistical language models comprising a unigram model that calculates a probability that a word is associated with the image based at least in part on (1) a visual similarity between the image and the clustered images and (2) a prior probability of the clustered images.
Dependent claims: 15, 16
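The unigram model in claim 14 combines a visual similarity with a cluster prior. A minimal Python sketch of one way to do so follows. It assumes the score of a word is the sum over clusters of P(word | cluster) × similarity(query, cluster) × P(cluster), normalized to sum to one. That formula, the function name `unigram_scores`, and all numbers are assumptions for illustration, not taken from the specification.

```python
def unigram_scores(word_given_cluster, similarity, prior):
    """Score words for a query image.

    word_given_cluster: cluster id -> {word: P(word | cluster)}
    similarity: cluster id -> visual similarity between the query and the cluster
    prior: cluster id -> prior probability of the cluster
    Returns scores normalized to sum to 1 (assumed convention).
    """
    scores = {}
    for cluster_id, word_probs in word_given_cluster.items():
        weight = similarity[cluster_id] * prior[cluster_id]
        for word, p in word_probs.items():
            scores[word] = scores.get(word, 0.0) + p * weight
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()} if total else scores


# Hypothetical per-cluster statistics.
word_given_cluster = {
    "c1": {"beach": 0.5, "sea": 0.5},
    "c2": {"dog": 1.0},
}
similarity = {"c1": 0.8, "c2": 0.2}
prior = {"c1": 0.5, "c2": 0.5}

word_scores = unigram_scores(word_given_cluster, similarity, prior)
print(sorted(word_scores.items(), key=lambda kv: -kv[1]))
```

Because the query is visually closer to cluster c1, the words from c1 outrank "dog" even though "dog" has probability 1.0 within c2.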
Specification