Statistical approach to large-scale image annotation

US 8,594,468 B2
Filed: 02/28/2012
Issued: 11/26/2013
Est. Priority Date: 05/30/2008
Status: Active Grant

2 Assignments

0 Petitions

Abstract

Statistical approaches to large-scale image annotation are described. Generally, the annotation technique includes compiling visual features and textual information from a number of images, hashing the images' visual features, and clustering the images based on their hash values. An example system builds statistical language models from the clustered images and annotates the image by applying one of the statistical language models.
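
The clustering step in the abstract can be illustrated with a minimal sketch, assuming each image already has a hash value and some associated text. The function name `cluster_by_hash` and the `(image_id, hash_value, text)` tuple layout are hypothetical choices made for this illustration and do not come from the patent.

```python
from collections import defaultdict


def cluster_by_hash(images):
    """Group images that share a hash value.

    `images` is an iterable of (image_id, hash_value, text) tuples;
    this tuple layout is an assumption of the sketch.
    """
    clusters = defaultdict(list)
    for image_id, hash_value, text in images:
        # Images with an identical hash value land in the same cluster.
        clusters[hash_value].append((image_id, text))
    return dict(clusters)


# Toy corpus: two images share the hash "1010", one has "0111".
corpus = [
    ("a.jpg", "1010", "sunset beach"),
    ("b.jpg", "1010", "beach waves"),
    ("c.jpg", "0111", "city skyline"),
]
clusters = cluster_by_hash(corpus)
```

In this toy corpus the two beach images fall into one cluster, and the text attached to them is what a per-cluster language model would later be built from.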

25 Citations

20 Claims

1. A method of annotating a personal image comprising:
- compiling visual features and textual information from a plurality of images;
- hashing the visual features;
- clustering the plurality of images based at least in part on a hash value, the clustering creating clustered images;
- building one or more statistical language models based at least in part on the clustered images; and
- annotating the personal image by selecting words with a maximum joint probability between the personal image and the clustered images.

Dependent claims (2, 3, 4, 5, 6, 7, 8, 20):
- 2. A method as recited in claim 1, wherein the hashing the visual features comprises using vector quantization to transform the visual features into a binary string.
- 3. A method as recited in claim 1, wherein the plurality of images with a same hash value are grouped into the clustered images.
- 4. A method as recited in claim 1, wherein the one or more statistical language models is a unigram model that calculates a probability that a word is associated with the personal image based at least in part on a visual similarity between the personal image and the clustered images and a prior probability of the clustered images.
- 5. A method as recited in claim 1, wherein the one or more statistical language models is a bigram model that calculates an average conditional probability that a second word is associated with the clustered images given a first word already associated with the clustered images.
- 6. A method as recited in claim 1, further comprising extracting the visual features from the plurality of images by using a gray block methodology.
- 7. A method as recited in claim 6, wherein the gray block methodology comprises:
  - partitioning each of the plurality of images into blocks,
  - measuring an average luminance for each block, and
  - representing each of the plurality of images as a vector.
- 8. A method as recited in claim 6, further comprising reducing the visual features of the plurality of images by employing a projection matrix.
- 20. A method as recited in claim 1, wherein the one or more statistical language models is a unigram model, the method further comprising smoothing the unigram model using Bayesian models using Dirichlet priors.
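
The gray block steps of claim 7 (partition into blocks, average the luminance of each block, represent the image as a vector) and the binary string of claim 2 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the image is a square grid of luminance values, the block count divides the side length evenly, and the `binarize` helper uses simple per-component thresholding against the vector mean as a stand-in for the vector quantization the claim recites. Both function names are hypothetical.

```python
def gray_block_features(pixels, blocks_per_side=8):
    """Average luminance per block, flattened into one vector.

    `pixels` is a square 2-D list of luminance values whose side length
    is a multiple of `blocks_per_side` (an assumption of this sketch).
    """
    side = len(pixels)
    step = side // blocks_per_side
    features = []
    for by in range(blocks_per_side):
        for bx in range(blocks_per_side):
            block = [
                pixels[y][x]
                for y in range(by * step, (by + 1) * step)
                for x in range(bx * step, (bx + 1) * step)
            ]
            features.append(sum(block) / len(block))
    return features


def binarize(features):
    """Map each component to one bit: 1 if above the vector mean, else 0.

    A simple thresholding stand-in, not necessarily the quantizer
    used in the patent.
    """
    mean = sum(features) / len(features)
    return "".join("1" if value > mean else "0" for value in features)


# 4x4 image split into 2x2 blocks: left half dark, right half bright.
image = [
    [0, 0, 200, 200],
    [0, 0, 200, 200],
    [0, 0, 200, 200],
    [0, 0, 200, 200],
]
vector = gray_block_features(image, blocks_per_side=2)
code = binarize(vector)
```

For this toy image the feature vector is `[0.0, 200.0, 0.0, 200.0]` and the resulting binary string is `"0101"`; images that produce the same string would be grouped together under claim 3.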

9. A computer readable storage device comprising computer executable instructions that when executed by one or more processors cause one or more computing devices to perform a method comprising:
- compiling visual information and textual information from a plurality of images;
- extracting the visual information from the plurality of images by using a gray block methodology;
- reducing the visual information by employing a projection matrix;
- hashing the reduced visual information;
- clustering the plurality of images based at least in part on a hash value to create image clusters;
- building one or more statistical language models based at least in part on the image clusters; and
- annotating a personal image by selecting words with a maximum joint probability with the personal image.

Dependent claims (10, 11, 12, 13, 14):
- 10. A computer readable storage device as recited in claim 9, wherein the hashing the reduced visual information comprises using vector quantization to transform the visual information into a binary string.
- 11. A computer readable storage device as recited in claim 9, wherein the plurality of images with a same hash value are grouped into a same image cluster.
- 12. A computer readable storage device as recited in claim 11 further comprising, comparing a hash value of the personal image with hash values of the image clusters.
- 13. A computer readable storage device as recited in claim 9, wherein the one or more statistical language models is a unigram model that calculates a probability that a word is associated with the personal image based at least in part on a visual similarity between the personal image and the image clusters and a prior probability of the image clusters.
- 14. A computer readable storage device as recited in claim 9, wherein the one or more statistical language models is a bigram model that calculates an average conditional probability that a second word is associated with the image clusters given a first word already associated with the image clusters.
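
One way to read the unigram model of claim 13 is as a score that sums, over clusters, the product of a word's probability within the cluster, the visual similarity between the personal image and the cluster, and the cluster's prior. The sketch below follows that reading. The exact formula is an assumption, since the claims do not spell it out; the function name `annotate`, the use of cluster size share as the prior, and the use of raw relative word frequency are all hypothetical choices for illustration.

```python
from collections import Counter


def annotate(similarity, clusters, top_k=1):
    """Rank words by the sum over clusters of p(w|c) * sim(I, c) * p(c).

    similarity: cluster id -> visual similarity to the personal image,
                used here as a stand-in for p(I|c)
    clusters:   cluster id -> list of word lists, one per clustered image
    """
    total_images = sum(len(docs) for docs in clusters.values())
    scores = Counter()
    for cid, docs in clusters.items():
        if not docs:
            continue
        prior = len(docs) / total_images  # p(c): share of all images
        counts = Counter(word for doc in docs for word in doc)
        n_words = sum(counts.values())
        if n_words == 0:
            continue
        for word, count in counts.items():
            scores[word] += (count / n_words) * similarity.get(cid, 0.0) * prior
    # Words with the highest accumulated score are selected as annotations.
    return [word for word, _ in scores.most_common(top_k)]


clusters = {
    "1010": [["sunset", "beach"], ["beach", "waves"]],
    "0111": [["city", "skyline"]],
}
similarity = {"1010": 0.9, "0111": 0.1}
best = annotate(similarity, clusters, top_k=1)
```

With these toy numbers, "beach" scores highest because it is frequent in the cluster that is both larger and more visually similar to the personal image.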

15. A computer readable storage device comprising:
- a personal image; and
- a textual annotation associated with the personal image, the textual annotation being associated with the personal image by;
  - compiling visual features and textual information from a plurality of images;
  - extracting the visual features from the plurality of images by using a gray block methodology;
  - hashing the visual features to generate a hash value;
  - clustering the plurality of images based at least in part on the hash value;
  - building one or more statistical language models based at least in part on the clustered images; and
  - associating the textual annotation with the personal image by selecting words with a maximum joint probability between the personal image and the clustered images.

Dependent claims (16, 17, 18, 19):
- 16. A computer readable storage device as recited in claim 15, wherein the hashing the visual features comprises using vector quantization to transform the visual features into a binary string.
- 17. A computer readable storage device as recited in claim 15, wherein the gray block methodology comprises:
  - partitioning each of the plurality of images into blocks,
  - measuring an average luminance for each block, and
  - representing each of the plurality of image as a vector.
- 18. A computer readable storage device as recited in claim 15, further comprising reducing the visual features of the plurality of images by employing a projection matrix.
- 19. A computer readable storage device as recited in claim 15, wherein the one or more statistical language models is a unigram model, the method further comprising smoothing the unigram model using Bayesian models using Dirichlet priors.
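
The Dirichlet-prior smoothing mentioned in claims 19 and 20 is commonly written as p(w|c) = (count(w, c) + mu * p(w|collection)) / (|c| + mu), where mu controls how strongly the cluster estimate is pulled toward the collection-wide distribution. The sketch below uses that standard form. Whether the patent uses this exact formula or this value of `mu` is not stated in the claims, so both are assumptions, and the function name `dirichlet_unigram` is hypothetical.

```python
from collections import Counter


def dirichlet_unigram(cluster_words, collection_words, mu=10.0):
    """Return a function giving p(w | cluster) with Dirichlet-prior smoothing.

    p(w|c) = (count(w, c) + mu * p(w|collection)) / (len(c) + mu)
    `collection_words` must be non-empty (an assumption of this sketch).
    """
    cluster_counts = Counter(cluster_words)
    collection_counts = Counter(collection_words)
    collection_total = sum(collection_counts.values())
    cluster_total = len(cluster_words)

    def probability(word):
        # Background probability of the word across the whole collection.
        background = collection_counts[word] / collection_total
        return (cluster_counts[word] + mu * background) / (cluster_total + mu)

    return probability


collection = ["beach", "beach", "sunset", "waves", "city", "skyline"]
p = dirichlet_unigram(["beach", "beach", "sunset", "waves"], collection, mu=2.0)
```

Smoothing gives a small non-zero probability to words such as "city" that never occur in the cluster, while the probabilities over the collection vocabulary still sum to one.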

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Li, Mingjing; Rui, Xiaoguang
Primary Examiner(s): Wu, Jingge
Application Number: US13/406,804
Publication Number: US 20120155774A1
Time in Patent Office: 637 Days
Field of Search: 382/305-306, 382/228
US Class Current: 382/305
CPC Class Codes:
G06V 20/35   Categorising the entire sce...
G06V 20/70   Labelling scene content, e....
G06V 2201/10   Recognition assisted with m...
