INSTANCE-WEIGHTED MIXTURE MODELING TO ENHANCE TRAINING COLLECTIONS FOR IMAGE ANNOTATION

US 20140307958A1
Filed: 04/16/2014
Published: 10/16/2014
Est. Priority Date: 04/16/2013
Status: Active Grant

Abstract

Automatic selection of training images is enhanced using an instance-weighted mixture modeling framework called ARTEMIS. An optimization algorithm is derived that, in addition to estimating mixture parameters, learns instance weights, essentially adapting to the noise associated with each example. The mechanism of hypothetical local mapping is evoked so that data in diverse mathematical forms or modalities can be treated cohesively while the system maintains tractability in optimization. Training examples are selected from the top-ranked images of a likelihood-based image ranking. Experiments indicate that ARTEMIS exhibits higher resilience to noise than several baselines for large training data collection. The performance of an ARTEMIS-trained image annotation system is comparable to that of a system trained on manually curated datasets.
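The abstract does not spell out how ARTEMIS learns its instance weights, so the following is only a minimal sketch of the general idea, not the patented optimization. It assumes a diagonal-covariance Gaussian mixture and swaps in a Student-t-style reweighting (weights shrink as the Mahalanobis distance grows) as a stand-in for the learned instance weights. The function names, the `nu` parameter, and the synthetic data are all hypothetical.

```python
import numpy as np

def e_step(X, pis, means, variances):
    """Return per-instance log-likelihood, responsibilities, and squared Mahalanobis distances."""
    maha = ((X[:, None, :] - means[None, :, :]) ** 2 / variances[None, :, :]).sum(axis=2)
    logdet = np.log(2 * np.pi * variances).sum(axis=1)
    joint = np.log(pis)[None, :] + -0.5 * (logdet[None, :] + maha)
    loglik = np.logaddexp.reduce(joint, axis=1)      # log-sum-exp over components
    resp = np.exp(joint - loglik[:, None])
    return loglik, resp, maha

def fit_instance_weighted_mixture(X, k=2, nu=3.0, n_iter=50, seed=0):
    """Illustrative instance-weighted EM; the reweighting rule is an assumption."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, size=k, replace=False)].astype(float)
    variances = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    pis = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        _, resp, maha = e_step(X, pis, means, variances)
        # Assumed reweighting: instances far from a component contribute less to it.
        u = (nu + d) / (nu + maha)
        ru = resp * u
        pis = np.clip(resp.mean(axis=0), 1e-12, None)
        pis /= pis.sum()
        means = (ru.T @ X) / (ru.sum(axis=0)[:, None] + 1e-12)
        diff2 = (X[:, None, :] - means[None, :, :]) ** 2
        variances = (ru[:, :, None] * diff2).sum(axis=0) / (resp.sum(axis=0)[:, None] + 1e-12) + 1e-6
    loglik, resp, maha = e_step(X, pis, means, variances)
    weights = (resp * (nu + d) / (nu + maha)).sum(axis=1)
    weights /= weights.sum()                         # normalized instance weights
    return pis, means, variances, weights, loglik

# Synthetic stand-in for image features: 200 "relevant" points plus 20 noisy ones.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)),
               rng.normal(8.0, 0.5, size=(20, 2))])
pis, means, variances, weights, loglik = fit_instance_weighted_mixture(X, k=1)
print("mean weight, relevant:", weights[:200].mean(), "noisy:", weights[200:].mean())
```

In this toy setting the noisy points end up with smaller weights, so they pull the fitted component less than they would under equal weighting.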

9 Claims

1. A method of improving the precision of training data used in automated image annotation, categorization, recognition, understanding, or retrieval, comprising the steps of:
- providing a digital computer;
  
  receiving at the computer a plurality of previously tagged digital images;
  
  executing an algorithm on the computer to perform the following operations;
  
  (a) extracting visual and textual features from the tagged images,
  (b) variably weighting the images based upon the extracted features,
  (c) computing a reference model for the images based on weighted instances through one or multiple iterations,
  (d) retaining images with high likelihood of correct tagging based upon the reference model; and
  
  using the retained images to train an automated image annotation, categorization, recognition, understanding, or retrieval system.
- - 2. The method of claim 1, including the steps of:
    - providing a reference model for a plurality of image concepts, each model being a mixture of visual and textual features computed from images tagged with the target concept; and
      
      performing the following operations on the digital computer;
      
      an initialization step wherein equal weights are assigned to all data instances,
      systematically learning unequal weights to curb the contribution of noisy images in iterative reference model learning, and
      selecting the training data by ranking images in the decreasing order of mixture likelihood.
  - 3. The method of claim 2, wherein the images are processed one concept at a time.
  - 4. The method of claim 2, wherein an image labeled with multiple tags is used in learning reference models of all those concepts.
  - 5. The method of claim 2, wherein all concept reference models are stored in a database for future use.
  - 6. The method of claim 1, wherein the images are noisy, user-tagged images obtained from the Internet.
  - 7. The method of claim 1, wherein the algorithm converges to identify the high-density region of relevant images, thereby improving the precision of training data selection.
  - 8. The method of claim 1, wherein:
    - the algorithm uses a parametric probabilistic data model; and
      
      the method includes the step of ranking based on overall mixture likelihood.
  - 9. The method of claim 1, wherein hypothetical local mapping is evoked so that data in diverse mathematical forms or modalities can be cohesively treated as the process maintains tractability in optimization.
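The per-concept processing and likelihood ranking recited in claims 2 through 5 can be sketched as follows. This is a hedged illustration only: the helper names, the image ids, and the log-likelihood values are hypothetical, and the scores would in practice come from a fitted reference model rather than a hand-written table.

```python
from collections import defaultdict

def group_by_concept(tagged_images):
    """Map each tag to the ids of all images carrying it.

    An image labeled with several tags joins the collection of every one of
    those concepts, so it can be used in learning each concept's model.
    """
    groups = defaultdict(list)
    for image_id, tags in tagged_images.items():
        for tag in tags:
            groups[tag].append(image_id)
    return dict(groups)

def select_training_images(image_ids, score, keep):
    """Rank images in decreasing order of score and retain the top `keep`."""
    ranked = sorted(image_ids, key=score, reverse=True)
    return ranked[:keep]

# Hypothetical tagged collection and made-up per-concept log-likelihoods.
tagged = {"img1": ["cat"], "img2": ["cat", "dog"], "img3": ["dog"], "img4": ["cat"]}
log_likelihood = {
    ("cat", "img1"): -1.2, ("cat", "img2"): -0.4, ("cat", "img4"): -9.0,
    ("dog", "img2"): -0.7, ("dog", "img3"): -0.3,
}

# Process one concept at a time and keep the results keyed by concept.
selected = {}
for concept, ids in group_by_concept(tagged).items():
    selected[concept] = select_training_images(
        ids, score=lambda i, c=concept: log_likelihood[(c, i)], keep=2)

print(selected)
```

Keeping the results in a dictionary keyed by concept mirrors the idea of storing per-concept outputs for later use; a real system would presumably persist the fitted models themselves.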

Current Assignee: Penn State Research Foundation
Original Assignee: Penn State Research Foundation
Inventors: Wang, James Z.; Li, Jia; Sawant, Neela

Granted Patent: US 9,646,226 B2
Field of Search
US Class Current: 382/159
CPC Class Codes: G06F 18/214 Generating training pattern...
