Object detection and recognition system

US 7,912,288 B2
Filed: 09/21/2006
Issued: 03/22/2011
Est. Priority Date: 09/21/2006
Status: Active Grant

Abstract

During a training phase we learn parts of images which assist in the object detection and recognition task. A part is a densely represented area of an image of an object to which we assign a unique label. Parts contiguously cover an image of an object to give a part label map for that object. The parts do not necessarily correspond to semantic object parts. During the training phase a classifier is learnt which can be used to estimate belief distributions over parts for each image element of a test image. A conditional random field is used to force a global part labeling which is substantially layout-consistent and a part label map is inferred from this. By recognizing parts we enable object detection and recognition even for partially occluded objects, for multiple-objects of different classes in the same scene, for unstructured and structured objects and allowing for object deformation.
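
The abstract describes parts that contiguously cover an object, and the claims describe forming an initial part label map by dividing the image into parts with a consistent pair-wise ordering. One way this could look is sketched below. The sketch is illustrative only: the function name `initial_part_label_map`, the regular grid over the mask's bounding box, the row-major numbering, and the use of label 0 for background are all assumptions, not details taken from the patent.

```python
def initial_part_label_map(mask, rows, cols):
    """Assign each object pixel to one cell of a rows x cols grid of parts.

    mask: list of lists of 0/1, where 1 marks an object pixel.
    Returns a map of the same shape: 0 is background (assumed convention),
    parts are numbered 1..rows*cols in row-major order so that the
    left-to-right and top-to-bottom ordering of parts is consistent.
    """
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    if not ys:
        # No object pixels: everything is background.
        return [[0] * len(row) for row in mask]

    top, left = min(ys), min(xs)
    height = max(ys) - top + 1
    width = max(xs) - left + 1

    labels = []
    for y, row in enumerate(mask):
        out = []
        for x, v in enumerate(row):
            if not v:
                out.append(0)
                continue
            r = (y - top) * rows // height   # grid row of this pixel
            c = (x - left) * cols // width   # grid column of this pixel
            out.append(1 + r * cols + c)
        labels.append(out)
    return labels


example_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
example_labels = initial_part_label_map(example_mask, rows=2, cols=2)
```

In this toy case the four object columns are split into two grid columns and the two object rows into two grid rows, giving four parts that tile the masked region without gaps.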

46 Citations

17 Claims

1. A computer-implemented method of object detection and recognition comprising:
- receiving an image from an input device coupled to a processor and memory to undergo object detection and recognition; and
  
  generating, by the processor, a part label map for the received image, the part label map comprising, for each image element of the received image, a label indicating which of a plurality of parts that image element is assigned to, each part being a densely represented image area, wherein generating the part label map comprises at least:
  
  accessing a pre-specified classifier stored in the memory and configured to estimate a belief distribution over parts for each image element of the received image, the classifier formed during a training phase using a plurality of training images together with a mask for each training image indicating which pixels in the training image correspond to objects to be recognized and which correspond to background that is not required to be recognized, during the training phase, forming an initial part label map for a training image by dividing the image into a plurality of parts having a consistent pair-wise ordering such that the parts contiguously cover the image;
  
  using an inference algorithm stored in the memory to infer the part label map from a conditional random field by forcing a global part labeling which is substantially layout-consistent; and
  
  ensuring that the parts meet constraints related to image elements, the image elements being non-immediate neighbors.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The computer-implemented method as claimed in claim 1 wherein the inference algorithm comprises an annealed expansion move algorithm.
  - 3. The computer-implemented method as claimed in claim 1 wherein the inference algorithm comprises belief propagation.
  - 4. The computer-implemented method as claimed in claim 1 wherein the conditional random field comprises a hidden layer of part labels.
  - 5. The computer-implemented method as claimed in claim 1 which is configured for detecting and recognizing images of partially occluded objects.
  - 6. The computer-implemented method as claimed in claim 1 which further comprises deforming the part label map for the training image during a learning process to form a deformed part labeling such that parts which assist in the object detection and recognition task are learned.
  - 7. The computer-implemented method as claimed in claim 6 which further comprises using the deformed part labeling to form a new initial part label map for each training image and repeating the learning process.
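
Claim 1 refers to a global part labeling that is "substantially layout-consistent". The claims do not define the term, so the sketch below uses an assumed definition for illustration: with parts numbered row-major on a grid of `cols` columns, a pixel and its right-hand neighbor are consistent if they share a part or the neighbor belongs to the next part in the same grid row, and likewise for the neighbor below and the next grid row. Treating any pair that involves background (label 0) as consistent is a further simplification. The names `is_consistent` and `count_violations` are hypothetical.

```python
def is_consistent(a, b, horizontal, cols):
    """Check one neighboring pair under the assumed layout rule.

    a: part label of a pixel; b: label of its right neighbor
    (horizontal=True) or of the neighbor below (horizontal=False).
    """
    if a == 0 or b == 0 or a == b:
        return True
    if horizontal:
        # b must be the next grid column, and a must not be in the last column.
        return b == a + 1 and (a - 1) % cols != cols - 1
    # b must be the part directly below a in the grid.
    return b == a + cols


def count_violations(label_map, cols):
    """Count neighboring pixel pairs that break the assumed layout rule."""
    violations = 0
    for y, row in enumerate(label_map):
        for x, a in enumerate(row):
            if x + 1 < len(row) and not is_consistent(a, row[x + 1], True, cols):
                violations += 1
            if y + 1 < len(label_map) and not is_consistent(
                a, label_map[y + 1][x], False, cols
            ):
                violations += 1
    return violations


good = [[0, 1, 1, 2, 2, 0],
        [0, 3, 3, 4, 4, 0]]
bad = [[1, 2],
       [4, 3]]
good_count = count_violations(good, cols=2)
bad_count = count_violations(bad, cols=2)
```

A count like this could serve as a pairwise penalty term in a random field energy, with an inference algorithm then searching for a labeling that keeps the count low. That use is a design suggestion, not something the claims specify.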

8. One or more computer-readable storage media, which media are not a signal, comprising computer-executable instructions that, when executed by a processor, perform acts for object detection and recognition comprising:
- receiving an image from an input device coupled to a processor and memory to undergo object detection and recognition, the image being of partially occluded objects;
  
  accessing a pre-specified classifier stored in the memory arranged to estimate a belief distribution over parts for each image element of the received image, the classifier formed during a training phase using a plurality of training images together with a mask for each training image indicating which pixels in the training image correspond to objects to be recognized and which correspond to background that is not required to be recognized, during the training phase, forming an initial part label map for a training image by dividing the image into a plurality of parts having a consistent pair-wise ordering such that the parts contiguously cover the image; and
  
  ensuring that the parts meet constraints related to image elements, the image elements being non-immediate neighbors; and
  
  applying an inference process to a conditional random field model stored in the memory to force a global part labeling which is substantially layout-consistent and thus generating a part label map from the conditional random field model for the received image, the part label map comprising, for each image element of the received image, a label indicating which of a plurality of parts the image element is assigned to, each part being a densely represented image area.
- Dependent claims (9, 10, 16):
  - 9. The one or more computer-readable storage media as claimed in claim 8, wherein using a conditional random field model comprises using such a model having a hidden layer of part labels.
  - 10. The one or more computer-readable storage media as claimed in claim 8, wherein using a conditional random field model comprises using a plurality of decision trees.
  - 16. The one or more computer-readable storage media as claimed in claim 8, further comprising deforming the part label map for the training image during a learning process to form a deformed part labeling such that parts which assist in the object detection and recognition task are learned.

11. An apparatus for object detection and recognition comprising:
- memory and a processor;
  
  an input device coupled to the processor and configured to receive an image to undergo object detection and recognition;
  
  an input device coupled to the processor and configured to access a pre-specified classifier stored in the memory, the classifier configured to estimate a belief distribution over parts for each image element of the received image;
  
  a conditional random field model stored in the memory; and
  
  an inference mechanism coupled to the processor and configured to carry out an inference process on the conditional random field model to force a global part labeling which is substantially layout-consistent and thereby generate a part label map for the received image, the part label map comprising, for each image element of the received image, a label indicating which of a plurality of parts the image element is assigned to, each part being a densely represented image area;
  
  the processor being configured to:
  
  form the classifier during a training phase using a plurality of training images together with a mask for each training image indicating which pixels in the training image correspond to objects to be recognized and which correspond to background that is not required to be recognized;
  
  during the training phase, form an initial part label map for a training image by dividing the image into a plurality of parts having a consistent pair-wise ordering such that the parts contiguously cover the image; and
  
  ensure that the parts meet constraints related to image elements, the image elements being non-immediate neighbors.
- Dependent claims (12, 13, 14, 15, 17):
  - 12. An apparatus as claimed in claim 11 wherein the inference mechanism is arranged to carry out an annealed expansion move algorithm.
  - 13. An apparatus as claimed in claim 11 wherein the conditional random field model comprises a hidden layer of part labels.
  - 14. An apparatus as claimed in claim 11 wherein the classifier comprises a plurality of decision trees.
  - 15. An apparatus as claimed in claim 11 wherein the processor is further configured to use the deformed part labeling to form a new initial part label map for each training image and repeating the learning process.
  - 17. An apparatus as claimed in claim 11, wherein the processor is further configured to deform the part label map for the training image during a learning process to form a deformed part labeling such that parts which assist in the object detection and recognition task are learned.
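
Claims 10 and 14 mention a plurality of decision trees, and the independent claims require a belief distribution over parts for each image element. The sketch below shows one common way a set of trees can yield such a distribution: each tree routes a pixel's feature vector to a leaf holding a distribution over parts, and the leaf distributions are averaged. The tuple encoding of trees, the averaging rule, and the names `tree_predict`, `forest_belief`, and `most_likely_part` are assumptions made for illustration, not the patented classifier.

```python
def tree_predict(tree, features):
    """Walk one tree to a leaf and return the leaf's distribution.

    Internal nodes are tuples (feature_index, threshold, left, right);
    leaves are lists of probabilities, one entry per part.
    """
    while isinstance(tree, tuple):
        index, threshold, left, right = tree
        tree = left if features[index] < threshold else right
    return tree


def forest_belief(trees, features):
    """Average the leaf distributions of all trees for one image element."""
    leaves = [tree_predict(t, features) for t in trees]
    n_parts = len(leaves[0])
    return [sum(leaf[k] for leaf in leaves) / len(leaves) for k in range(n_parts)]


def most_likely_part(belief):
    """Index of the part with the highest belief (first one wins on ties)."""
    return max(range(len(belief)), key=belief.__getitem__)


# Two hypothetical depth-1 trees over a two-dimensional feature vector.
trees = [
    (0, 0.5, [1.0, 0.0], [0.0, 1.0]),
    (1, 0.5, [0.5, 0.5], [0.0, 1.0]),
]
belief_mixed = forest_belief(trees, [0.2, 0.9])
belief_sure = forest_belief(trees, [0.9, 0.9])
```

Taking the most likely part independently at each pixel ignores layout, which is why the claims pair the per-pixel beliefs with inference over a conditional random field.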

Current Assignee: Zhigu Holdings Limited
Original Assignee: Microsoft Corporation
Inventors: Shotton, Jamie; Winn, John
Primary Examiner(s): Ahmed; Samir A
Assistant Examiner(s): Li; Ruiping
Application Number: US11/533,993
Publication Number: US 20080075367A1
Time in Patent Office: 1,643 Days
Field of Search: 382/181, 382/118, 382/101, 382/224
US Class Current: 382/181
CPC Class Codes:
G06V 10/267 by performing operations on...
G06V 10/457 by analysing connectivity, ...
