Visual Language Modeling for Image Classification
Abstract
Systems and methods for visual language modeling for image classification are described. In one aspect, the systems and methods model training images corresponding to multiple image categories as matrices of visual words. Visual language models are generated from the matrices. Given an image, for example one provided by a user or obtained from the Web, the systems and methods determine the image category corresponding to that image. This categorization is accomplished by maximizing the posterior probability of the visual words associated with the given image over the visual language models. The image category, or a result corresponding to the image category, is presented to the user.
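The categorization step in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes, purely for illustration, that each visual language model is a bigram model in which a visual word depends only on its left neighbor in the matrix, that probabilities are smoothed with an additive constant, and that the category prior is the share of training images in that category. The names `train_models`, `log_posterior`, `classify`, `vocab_size`, and `alpha` are hypothetical and do not appear in the source.

```python
import math
from collections import Counter


def train_models(training):
    """Build one bigram model per category.

    training: dict mapping category -> list of visual-word matrices,
    where each matrix is a list of rows of integer word ids.
    """
    total_docs = sum(len(docs) for docs in training.values())
    models = {}
    for category, docs in training.items():
        pair_counts = Counter()     # counts of (left neighbor, word)
        context_counts = Counter()  # counts of left neighbor as a context
        for matrix in docs:
            for row in matrix:
                for left, word in zip(row, row[1:]):
                    pair_counts[(left, word)] += 1
                    context_counts[left] += 1
        models[category] = {
            "pairs": pair_counts,
            "contexts": context_counts,
            # Assumed prior: fraction of training images in this category.
            "log_prior": math.log(len(docs) / total_docs),
        }
    return models


def log_posterior(matrix, model, vocab_size, alpha=1.0):
    """Unnormalized log posterior of a category given a word matrix.

    The first word of each row has no left neighbor and is skipped
    (a simplification of this sketch).
    """
    score = model["log_prior"]
    for row in matrix:
        for left, word in zip(row, row[1:]):
            num = model["pairs"][(left, word)] + alpha
            den = model["contexts"][left] + alpha * vocab_size
            score += math.log(num / den)
    return score


def classify(matrix, models, vocab_size, alpha=1.0):
    """Return the category whose model maximizes the posterior."""
    return max(
        models,
        key=lambda c: log_posterior(matrix, models[c], vocab_size, alpha),
    )


# Toy data with a two-word vocabulary (word ids 0 and 1).
training = {
    "stripes": [[[0, 1, 0, 1], [0, 1, 0, 1]]],
    "flat": [[[0, 0, 0, 0], [1, 1, 1, 1]]],
}
models = train_models(training)
print(classify([[0, 1, 0, 1]], models, vocab_size=2))
print(classify([[1, 1, 1, 1]], models, vocab_size=2))
```

Working in log space avoids numerical underflow when an image contains many visual words, and the additive constant keeps a word pair that never occurred in training from driving a category's probability to zero.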
20 Claims
1. A method at least partially implemented by a computing device, the method comprising:
modeling images representing multiple image categories as respective matrices of visual words;
generating visual language models from the respective matrices of visual words;
estimating an image category for an image in view of the visual language models; and
presenting the image category or a result based on the image category to a user.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A computer-readable medium including computer-program instructions executable by a processor encoded thereon, the computer-program instructions when executed by the processor for performing operations comprising:
building visual language models from matrices of visual words generated from a set of training images, the visual language models being based on a visual word grammar, the training images corresponding to one or more predetermined image classifications;
creating a visual document from an image for image categorization;
determining an image category for the image based on characteristics of the visual document in view of the visual language models, and the image category corresponding to a classification of the one or more predetermined image classifications; and
presenting the image category or a result based on the image category to a user.
Dependent claims: 11, 12, 13, 14, 15, 16
17. A computing device comprising:
a processor; and
a memory coupled to the processor, the memory including computer-program instructions encoded thereon, the computer-program instructions, when executed by the processor, for performing operations comprising:
loading a set of training images associated with corresponding image categories;
for each training image of the training images:
(a) dividing the training image into a respective set of image patches;
(b) generating a visual word for each image patch to form a respective visual document for the training image;
for each category of the one or more image categories, generating visual language model(s);
estimating, using the visual language model(s), an image category for a given image;
presenting the image category or a result corresponding to the image category to a user.
Dependent claims: 18, 19, 20
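Steps (a) and (b) of claim 17 can be sketched as follows. The claim text shown here does not say how a visual word is computed from a patch, so this example substitutes a deliberately simple stand-in: the mean brightness of each non-overlapping square patch is quantized into a small number of levels, and the level index serves as the visual word. The function name `image_to_visual_document` and the parameters `patch_size`, `levels`, and `max_value` are hypothetical.

```python
def image_to_visual_document(image, patch_size=2, levels=4, max_value=255):
    """Turn a grayscale image into a matrix of visual words.

    image: list of rows of integer pixel values in [0, max_value].
    Returns a list of rows of word ids in [0, levels - 1], one per patch.
    Pixels that do not fill a whole patch at the right or bottom edge
    are ignored (a simplification of this sketch).
    """
    height, width = len(image), len(image[0])
    document = []
    for top in range(0, height - patch_size + 1, patch_size):
        row_words = []
        for left in range(0, width - patch_size + 1, patch_size):
            # Step (a): collect the pixels of one image patch.
            patch = [
                image[y][x]
                for y in range(top, top + patch_size)
                for x in range(left, left + patch_size)
            ]
            # Step (b): map the patch to a visual word.
            # Assumed quantizer: mean brightness bucketed into `levels` bins.
            mean = sum(patch) / len(patch)
            word = min(int(mean * levels / (max_value + 1)), levels - 1)
            row_words.append(word)
        document.append(row_words)
    return document


# A 4x4 toy image yields a 2x2 visual document with 2x2 patches.
toy_image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [100, 100, 200, 200],
    [100, 100, 200, 200],
]
visual_document = image_to_visual_document(toy_image)
print(visual_document)
```

Because the patches keep their row and column positions, the resulting visual document preserves the spatial arrangement of the words, which is what allows a model to capture dependencies between neighboring words rather than treating the image as an unordered bag.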
Specification