Object detection and recognition system
Abstract
During a training phase we learn parts of images which assist in the object detection and recognition task. A part is a densely represented area of an image of an object to which we assign a unique label. Parts contiguously cover an image of an object to give a part label map for that object. The parts do not necessarily correspond to semantic object parts. During the training phase a classifier is learnt which can be used to estimate belief distributions over parts for each image element of a test image. A conditional random field is used to force a global part labeling which is substantially layout-consistent, and a part label map is inferred from this. By recognizing parts we enable object detection and recognition even for partially occluded objects, for multiple objects of different classes in the same scene, and for both unstructured and structured objects, while allowing for object deformation.
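The abstract does not define what makes a part labeling layout-consistent, so the following is only a sketch under a stated assumption: each part is identified by a grid position (row, column), and two neighboring image elements agree when they carry the same part, when the second carries the next part in the direction of travel, or when either is background. The names `pair_consistent` and `count_violations` are hypothetical and do not come from the patent.

```python
from typing import List, Optional, Tuple

# A part is identified by its (grid_row, grid_col); None stands for background.
Part = Optional[Tuple[int, int]]
LabelMap = List[List[Part]]


def pair_consistent(a: Part, b: Part, d_row: int, d_col: int) -> bool:
    """Assumed rule: b is the neighbor of a at offset (d_row, d_col).

    The pair is accepted if either label is background, if both are the
    same part, or if b is the part one grid step further in that direction.
    """
    if a is None or b is None or a == b:
        return True
    return b == (a[0] + d_row, a[1] + d_col)


def count_violations(label_map: LabelMap) -> int:
    """Count right and down neighbor pairs that break the assumed rule."""
    rows = len(label_map)
    cols = len(label_map[0]) if rows else 0
    violations = 0
    for r in range(rows):
        for c in range(cols):
            here = label_map[r][c]
            if c + 1 < cols and not pair_consistent(here, label_map[r][c + 1], 0, 1):
                violations += 1
            if r + 1 < rows and not pair_consistent(here, label_map[r + 1][c], 1, 0):
                violations += 1
    return violations
```

A labeling with zero violations under this rule keeps parts in their grid order wherever object pixels touch, while background gaps, as produced by an occluder, are tolerated.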
17 Claims
1. A computer-implemented method of object detection and recognition comprising:
receiving an image from an input device coupled to a processor and memory to undergo object detection and recognition; and
generating, by the processor, a part label map for the received image, the part label map comprising, for each image element of the received image, a label indicating which of a plurality of parts that image element is assigned to, each part being a densely represented image area, wherein generating the part label map comprises at least:
accessing a pre-specified classifier stored in the memory and configured to estimate a belief distribution over parts for each image element of the received image, the classifier formed during a training phase using a plurality of training images together with a mask for each training image indicating which pixels in the training image correspond to objects to be recognized and which correspond to background that is not required to be recognized, during the training phase, forming an initial part label map for a training image by dividing the image into a plurality of parts having a consistent pair-wise ordering such that the parts contiguously cover the image;
using an inference algorithm stored in the memory to infer the part label map from a conditional random field by forcing a global part labeling which is substantially layout-consistent; and
ensuring that the parts meet constraints related to image elements, the image elements being non-immediate neighbors.
Dependent claims: 2, 3, 4, 5, 6, 7
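The training step that forms an initial part label map by dividing the image into ordered, contiguous parts can be illustrated with a regular grid. This is a minimal sketch, not the patented procedure: the function name `initial_part_label_map`, the square cell size `cell`, and the use of `None` for masked-out background pixels are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# A part is identified by its (grid_row, grid_col); None stands for background.
Part = Optional[Tuple[int, int]]


def initial_part_label_map(mask: List[List[bool]], cell: int = 2) -> List[List[Part]]:
    """Assign every object pixel to the grid cell that contains it.

    mask[r][c] is True where the training mask marks an object pixel.
    Grid indices give the parts a consistent pair-wise ordering: part
    (i, j + 1) always lies to the right of part (i, j), and part
    (i + 1, j) always lies below it.
    """
    if cell < 1:
        raise ValueError("cell must be a positive number of pixels")
    return [
        [(r // cell, c // cell) if is_object else None
         for c, is_object in enumerate(row)]
        for r, row in enumerate(mask)
    ]
```

Because the cells tile the image without gaps, the resulting parts contiguously cover every object pixel, and no semantic meaning is attached to any individual part.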
8. One or more computer-readable storage media, which media are not a signal, comprising computer-executable instructions that, when executed by a processor, perform acts for object detection and recognition comprising:
receiving an image from an input device coupled to a processor and memory to undergo object detection and recognition, the image being of partially occluded objects;
accessing a pre-specified classifier stored in the memory arranged to estimate a belief distribution over parts for each image element of the received image, the classifier formed during a training phase using a plurality of training images together with a mask for each training image indicating which pixels in the training image correspond to objects to be recognized and which correspond to background that is not required to be recognized, during the training phase, forming an initial part label map for a training image by dividing the image into a plurality of parts having a consistent pair-wise ordering such that the parts contiguously cover the image; and
ensuring that the parts meet constraints related to image elements, the image elements being non-immediate neighbors; and
applying an inference process to a conditional random field model stored in the memory to force a global part labeling which is substantially layout-consistent and thus generating a part label map from the conditional random field model for the received image, the part label map comprising, for each image element of the received image, a label indicating which of a plurality of parts the image element is assigned to, each part being a densely represented image area.
Dependent claims: 9, 10, 16
11. An apparatus for object detection and recognition comprising:
memory and a processor;
an input device coupled to the processor and configured to receive an image to undergo object detection and recognition;
an input device coupled to the processor and configured to access a pre-specified classifier stored in the memory, the classifier configured to estimate a belief distribution over parts for each image element of the received image;
a conditional random field model stored in the memory; and
an inference mechanism coupled to the processor and configured to carry out an inference process on the conditional random field model to force a global part labeling which is substantially layout-consistent and thereby generate a part label map for the received image, the part label map comprising, for each image element of the received image, a label indicating which of a plurality of parts the image element is assigned to, each part being a densely represented image area;
the processor being configured to:
form the classifier during a training phase using a plurality of training images together with a mask for each training image indicating which pixels in the training image correspond to objects to be recognized and which correspond to background that is not required to be recognized;
during the training phase, form an initial part label map for a training image by dividing the image into a plurality of parts having a consistent pair-wise ordering such that the parts contiguously cover the image; and
ensure that the parts meet constraints related to image elements, the image elements being non-immediate neighbors.
Dependent claims: 12, 13, 14, 15, 17
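The claims do not name a specific inference algorithm, so the sketch below substitutes iterated conditional modes (ICM), a simple local search, to show how per-element belief distributions and a layout-consistency penalty might be combined into a part label map. Everything here is an assumption made for illustration: the grid-position part labels, the `consistent` rule, the `penalty` weight, and the function name `icm_part_labels`.

```python
import math
from typing import Dict, List, Optional, Tuple

# A part is identified by its (grid_row, grid_col); None stands for background.
Part = Optional[Tuple[int, int]]
Beliefs = List[List[Dict[Part, float]]]  # beliefs[r][c][part] = classifier probability


def consistent(a: Part, b: Part, d_row: int, d_col: int) -> bool:
    """Assumed rule: same part, next part in that direction, or background."""
    if a is None or b is None or a == b:
        return True
    return b == (a[0] + d_row, a[1] + d_col)


def icm_part_labels(beliefs: Beliefs, penalty: float = 2.0, sweeps: int = 5) -> List[List[Part]]:
    """Greedy ICM: start from the per-element argmax, then relabel one
    element at a time to lower its unary cost plus pairwise penalties."""
    rows, cols = len(beliefs), len(beliefs[0])
    labels = [[max(cell, key=cell.get) for cell in row] for row in beliefs]

    def energy(r: int, c: int, part: Part) -> float:
        # Unary term: negative log belief, floored to avoid log(0).
        e = -math.log(max(beliefs[r][c].get(part, 0.0), 1e-9))
        for d_row, d_col in ((0, 1), (1, 0)):
            rr, cc = r + d_row, c + d_col  # neighbor to the right or below
            if rr < rows and cc < cols and not consistent(part, labels[rr][cc], d_row, d_col):
                e += penalty
            rr, cc = r - d_row, c - d_col  # neighbor to the left or above
            if rr >= 0 and cc >= 0 and not consistent(labels[rr][cc], part, d_row, d_col):
                e += penalty
        return e

    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                best = min(beliefs[r][c], key=lambda p: energy(r, c, p))
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break
    return labels
```

ICM only reaches a local minimum and considers immediate neighbors; it is used here because it is short enough to read in one pass, not because the patent prescribes it, and it does not model the constraints on non-immediate neighbors recited in the claims.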
Specification