VIDEO-BASED DETECTION OF MULTIPLE OBJECT TYPES UNDER VARYING POSES
Abstract
Training data object images are clustered as a function of motion direction attributes and resized from their respective original aspect ratios into a same aspect ratio. Motionlet detectors are learned for each of the resulting motionlet sets from features extracted from the resized object blobs. A deformable sliding window is applied to detect an object blob in input video by varying the window size, shape or aspect ratio to conform to a shape of the detected input video object blob. A motion direction of an underlying image patch of the detected input video object blob is extracted, and motionlet detectors that have similar motion directions are selected and applied. An object is thus detected within the detected blob, and semantic attributes of the underlying image patch are extracted, if one of the motionlet detectors fires; the extracted semantic attributes are then available for use in searching for the detected object.
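
The clustering step in the abstract can be illustrated with a minimal sketch. It assumes each training blob comes with a displacement vector (dx, dy) and groups blobs into equal-width angular bins; the binning rule, the bin count of 12 and the names `direction_deg` and `cluster_by_direction` are illustrative assumptions, not details taken from the patent text above.

```python
import math
from collections import defaultdict


def direction_deg(dx, dy):
    """Motion direction of a displacement vector, in degrees within [0, 360]."""
    return math.degrees(math.atan2(dy, dx)) % 360.0


def cluster_by_direction(samples, num_bins=12):
    """Group (image_id, dx, dy) samples into equal-width angular bins.

    Assumption: equal-width bins stand in for whatever similarity-based
    clustering a real system would use.
    """
    bin_width = 360.0 / num_bins
    motionlets = defaultdict(list)
    for image_id, dx, dy in samples:
        # Shift by half a bin so that bin 0 is centred on 0 degrees,
        # then wrap again so directions just below 360 fall into bin 0.
        shifted = (direction_deg(dx, dy) + bin_width / 2) % 360.0
        motionlets[int(shifted // bin_width)].append(image_id)
    return dict(motionlets)


# Hypothetical training blobs: identifier plus displacement vector.
samples = [
    ("car_a", 1.0, 0.0),
    ("car_b", 1.0, 0.1),
    ("car_c", 0.0, 1.0),
    ("car_d", -1.0, 0.0),
]
clusters = cluster_by_direction(samples)
```

With 30-degree bins, the two blobs moving roughly along the positive x axis share one motionlet set, while the other two blobs each land in a set of their own.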
20 Claims
1. A method for object detection as a function of a motion direction attribute, the method comprising:
clustering training data set object images corresponding to object motion blobs into each of a plurality of motionlet sets as a function of similarity of their associated motion direction attributes, each of the motionlet sets comprising object images associated with similar motion direction attributes that are distinguished from the motion direction attributes of the object image blobs in others of the motionlet sets;
resizing the clustered motionlet pluralities of object images from their respective original aspect ratios into a same aspect ratio, wherein the motionlet object images may have different original respective aspect ratios;
learning motionlet detectors for each of the motionlet sets from features extracted from the resized training blobs and from sets of negative images of non-object image patches of the same aspect ratio obtained from background images;
applying a deformable sliding window to detect an object blob in an input video obtained by background modeling by varying at least one of a size, a shape and an aspect ratio of the sliding window to conform to a shape of the detected input video object blob;
extracting a motion direction of an underlying image patch of the detected input video object blob;
selecting at least one of the motionlet detectors that has a motion direction similar to the motion direction extracted from an underlying image patch of the input video object blob;
applying the selected at least one motionlet detector to the detected input video object blob;
determining that an object has been detected within the detected input video object blob and extracting semantic attributes of the underlying image patch of the input video object blob if a one of the selected and applied at least one motionlet detectors fires; and
storing the extracted semantic attributes of the underlying image patch of the input video object blob in a database for searching for the detected object as a function of its extracted semantic attributes.
Dependent claims: 2, 3, 4, 5, 6, 7.

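
The selecting, applying and determining steps of claim 1 can be sketched as follows. The sketch assumes that each motionlet detector carries a representative direction in degrees, that "similar" means within a fixed angular tolerance, and that a detector "fires" when its score reaches a threshold. The tolerance, the threshold, the dictionary layout and the lambda scorers are all hypothetical stand-ins for learned detectors.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two directions, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def select_detectors(detectors, blob_direction_deg, tolerance_deg=30.0):
    """Return the detectors whose motionlet direction is within the tolerance."""
    return [
        d for d in detectors
        if angular_difference(d["direction_deg"], blob_direction_deg) <= tolerance_deg
    ]


def object_detected(detectors, blob_direction_deg, patch_features, fire_threshold=0.5):
    """True if any selected detector scores the patch at or above the threshold."""
    selected = select_detectors(detectors, blob_direction_deg)
    return any(d["score"](patch_features) >= fire_threshold for d in selected)


# Hypothetical detectors: each "score" callable stands in for a learned classifier.
detectors = [
    {"name": "east", "direction_deg": 0.0, "score": lambda f: f["edge_energy"]},
    {"name": "north", "direction_deg": 90.0, "score": lambda f: 0.0},
    {"name": "west", "direction_deg": 180.0, "score": lambda f: 1.0},
]
chosen = [d["name"] for d in select_detectors(detectors, 350.0)]
fired = object_detected(detectors, 350.0, {"edge_energy": 0.8})
```

Because the comparison wraps around 360 degrees, a blob moving at 350 degrees selects the detector trained for 0 degrees. The "west" detector is never applied to that blob, even though it would score highly.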
8. A system, comprising:
a processing unit, computer readable memory and a computer readable storage medium;
first program instructions to cluster training data set object images corresponding to object motion blobs into each of a plurality of motionlet sets as a function of similarity of their associated motion direction attributes, each of the motionlet sets comprising object images associated with similar motion direction attributes that are distinguished from the motion direction attributes of the object image blobs in others of the motionlet sets, the pluralities of the motionlet object images resized from their respective original aspect ratios into a same aspect ratio, wherein the motionlet object images may have different original respective aspect ratios;
second program instructions to learn motionlet detectors for each of the motionlet sets from features extracted from the resized training blobs and from sets of negative images of non-object image patches of the same aspect ratio obtained from background images;
third program instructions to apply a deformable sliding window to detect an object blob in an input video obtained by background modeling by varying at least one of a size, a shape and an aspect ratio of the sliding window to conform to a shape of the detected input video object blob, and to extract a motion direction of an underlying image patch of the detected input video object blob; and
fourth program instructions to select at least one of the motionlet detectors that has a motion direction similar to the motion direction extracted from an underlying image patch of the input video object blob, apply the selected at least one motionlet detector to the detected input video object blob and determine that an object has been detected within the detected input video object blob and extract semantic attributes of the underlying image patch of the input video object blob if a one of the selected and applied at least one motionlet detectors fires, and to store the extracted semantic attributes of the underlying image patch of the input video object blob in a database for searching for the detected object as a function of its extracted semantic attributes;
wherein the first, second, third and fourth program instructions are stored on the computer readable storage medium for execution by the processing unit via the computer readable memory.
Dependent claims: 9, 10, 11, 12.

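
The resizing of training images "into a same aspect ratio" can be illustrated with a small nearest-neighbour resampler over plain nested lists. This is only a sketch under stated assumptions: the claims do not name an interpolation method, and the function `resize_nearest` and the 2 x 2 target size are hypothetical. A practical system would more likely call an image library.

```python
def resize_nearest(image, out_width, out_height):
    """Nearest-neighbour resize of a row-major 2D list to a fixed size."""
    in_height, in_width = len(image), len(image[0])
    return [
        [
            # Map each output pixel back to its source pixel by integer scaling.
            image[(y * in_height) // out_height][(x * in_width) // out_width]
            for x in range(out_width)
        ]
        for y in range(out_height)
    ]


# Two hypothetical training blobs with different original aspect ratios.
wide = [[1, 2, 3, 4],
        [5, 6, 7, 8]]          # 4 wide, 2 high
tall = [[1, 2],
        [3, 4],
        [5, 6],
        [7, 8]]                # 2 wide, 4 high

wide_resized = resize_nearest(wide, 2, 2)
tall_resized = resize_nearest(tall, 2, 2)
```

After resizing, both blobs have the same width and height, so features extracted from them have the same layout regardless of the original shape.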
13. An article of manufacture, comprising:
a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising instructions that, when executed by a computer processor, cause the computer processor to:
cluster training data set object images corresponding to object motion blobs into each of a plurality of motionlet sets as a function of similarity of their associated motion direction attributes, each of the motionlet sets comprising object images associated with similar motion direction attributes that are distinguished from the motion direction attributes of the object image blobs in others of the motionlet sets, the pluralities of the motionlet object images resized from their respective original aspect ratios into a same aspect ratio, wherein the motionlet object images may have different original respective aspect ratios;
learn motionlet detectors for each of the motionlet sets from features extracted from the resized training blobs and from sets of negative images of non-object image patches of the same aspect ratio obtained from background images;
apply a deformable sliding window to detect an object blob in an input video obtained by background modeling by varying at least one of a size, a shape and an aspect ratio of the sliding window to conform to a shape of the detected input video object blob, and to extract a motion direction of an underlying image patch of the detected input video object blob; and
select at least one of the motionlet detectors that has a motion direction similar to the motion direction extracted from an underlying image patch of the input video object blob, apply the selected at least one motionlet detector to the detected input video object blob;
determine that an object has been detected within the detected input video object blob and extract semantic attributes of the underlying image patch of the input video object blob if a one of the selected and applied at least one motionlet detectors fires; and
store the extracted semantic attributes of the underlying image patch of the input video object blob in a database for searching for the detected object as a function of its extracted semantic attributes.
Dependent claims: 14, 15, 16.

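
The final storing step, and the attribute-based search it enables, can be sketched with Python's built-in `sqlite3` module. The table layout and the particular attributes (`object_type`, `color`, `direction_deg`) are assumptions chosen for illustration; the claims above do not enumerate which semantic attributes are stored or which database is used.

```python
import sqlite3


def open_attribute_store(path=":memory:"):
    """Open (or create) a hypothetical detections table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS detections ("
        "id INTEGER PRIMARY KEY, frame INTEGER, object_type TEXT, "
        "color TEXT, direction_deg REAL)"
    )
    return conn


def store_detection(conn, frame, object_type, color, direction_deg):
    """Store the attributes extracted for one detected object."""
    conn.execute(
        "INSERT INTO detections (frame, object_type, color, direction_deg) "
        "VALUES (?, ?, ?, ?)",
        (frame, object_type, color, direction_deg),
    )
    conn.commit()


def search_by_attributes(conn, object_type, color):
    """Return the frames in which an object with these attributes was detected."""
    rows = conn.execute(
        "SELECT frame FROM detections "
        "WHERE object_type = ? AND color = ? ORDER BY frame",
        (object_type, color),
    )
    return [frame for (frame,) in rows]


conn = open_attribute_store()
store_detection(conn, 10, "car", "red", 0.0)
store_detection(conn, 42, "car", "blue", 90.0)
store_detection(conn, 57, "car", "red", 180.0)
red_car_frames = search_by_attributes(conn, "car", "red")
```

Storing attributes as ordinary columns means a later search such as "red car" is a plain parameterized query and does not need to touch the video again.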
17. A method of providing a service for object detection as a function of a motion direction attribute, the method comprising providing:
a motionlet splitter that clusters training data set object images corresponding to object motion blobs into each of a plurality of motionlet sets as a function of similarity of their associated motion direction attributes, each of the motionlet sets comprising object images associated with similar motion direction attributes that are distinguished from the motion direction attributes of the object image blobs in others of the motionlet sets;
an aspect ratio resizer that resizes the clustered motionlet pluralities of object images from their respective original aspect ratios into a same aspect ratio, wherein the motionlet object images may have different original respective aspect ratios;
a motionlet detector builder that builds motionlet detectors for each of the motionlet sets from features extracted from the resized training blobs and from sets of negative images of non-object image patches of the same aspect ratio obtained from background images;
a sliding window applicator that detects an image blob in an input video and deforms a sliding window to frame about the detected blob in response to a shape of the detected blob by varying at least one of a size, a shape and an aspect ratio of the sliding window to conform to the shape of the detected blob; and
a feature extractor that extracts a motion direction of an underlying image patch of the detected input video object blob, selects at least one of the motionlet detectors that has a motion direction similar to the motion direction extracted from an underlying image patch of the input video object blob, applies the selected at least one motionlet detector to the detected input video object blob, determines that an object has been detected within the detected input video object blob and extracts semantic attributes of the underlying image patch of the input video object blob if a one of the selected and applied at least one motionlet detectors fires, and stores the extracted semantic attributes of the underlying image patch of the input video object blob in a database for searching for the detected object as a function of its extracted semantic attributes.
Dependent claims: 18, 19, 20.

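
The sliding window applicator of claim 17 frames a window about a detected blob by varying the window's size and aspect ratio. One simple way to picture this, assuming the background model yields a binary foreground mask that contains a single blob, is to take the tight bounding box of the foreground pixels. The function `fit_window_to_blob` and the mask below are hypothetical, and this sketch is not presented as the patent's own procedure.

```python
def fit_window_to_blob(mask):
    """Tight window (x, y, width, height) around nonzero mask pixels, or None.

    Assumption: the mask holds a single foreground blob, as if produced by
    background subtraction followed by connected-component labelling.
    """
    coords = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    x0, y0 = min(xs), min(ys)
    return (x0, y0, max(xs) - x0 + 1, max(ys) - y0 + 1)


# Hypothetical foreground mask: 1 marks pixels that differ from the background.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
window = fit_window_to_blob(mask)
aspect_ratio = window[2] / window[3]
```

The window's width and height follow the blob rather than a fixed template, so a wide vehicle blob and a tall pedestrian blob would each get a window of matching aspect ratio.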
Specification