Tracking semantic objects in vector image sequences
Abstract
A semantic object tracking method tracks general semantic objects with multiple non-rigid motions, disconnected components and multiple colors throughout a vector image sequence. The method accurately tracks these general semantic objects by spatially segmenting image regions from a current frame and then classifying these regions as to which semantic object they originated from in the previous frame. To classify each region, the method performs a region-based motion estimation between each spatially segmented region and the previous frame to compute the position of a predicted region in the previous frame. The method then classifies each region in the current frame as being part of a semantic object based on which semantic object in the previous frame contains the most overlapping points of the predicted region. Using this method, each region in the current image is tracked to one semantic object from the previous frame, with no gaps or overlaps. The method propagates few or no errors because it projects regions into a frame where the semantic object boundaries are previously computed rather than trying to project and adjust a boundary in a frame where the object's boundary is unknown.
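The classification rule in the abstract (assign each region to the previous-frame object that contains most of the predicted region's points) can be illustrated with a minimal sketch. It is not the patented implementation: the function name `classify_region`, the label-map representation, the use of label 0 for background, and the purely translational motion are all assumptions made for illustration.

```python
from collections import Counter

def classify_region(region_points, motion, prev_labels):
    """Assign a region to the previous-frame object holding most of its warped points.

    region_points: list of (row, col) points of one segmented region (current frame)
    motion: (d_row, d_col) translation for the region (assumed motion model)
    prev_labels: 2D list of object labels for the previous frame (0 = background, assumed)
    """
    d_row, d_col = motion
    height, width = len(prev_labels), len(prev_labels[0])
    votes = Counter()
    for row, col in region_points:
        r, c = row + d_row, col + d_col  # predicted location in the previous frame
        if 0 <= r < height and 0 <= c < width:
            votes[prev_labels[r][c]] += 1
    if not votes:
        return 0  # nothing landed inside the frame: treat as background
    return votes.most_common(1)[0][0]
```

Because every region receives exactly one label, the per-object unions of regions partition the current frame, which is the "no gaps or overlaps" property the abstract describes.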
20 Claims
1. A method for tracking video objects in video frames, the method comprising:
performing spatial segmentation on a video frame to identify regions of pixels with homogenous intensity values;
performing motion estimation between each of the regions in the video frame and a previous video frame;
using the motion estimation for each region to warp pixel locations in each region to locations in the previous frame;
determining whether the warped pixel locations are within a boundary of a segmented video object in the previous frame to identify a set of the regions that are likely to be part of the video object; and
forming a boundary of the video object in the video frame as a combination of each of the regions in the video frame that are in the set.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
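The final step of claim 1, forming the object boundary as a combination of the selected regions, might look like the sketch below. The region-id map, the helper names `object_mask` and `boundary_pixels`, and the choice of a 4-neighbourhood are illustrative assumptions, not details taken from the claim.

```python
def object_mask(region_map, selected_regions):
    """Mark every pixel whose region id belongs to the selected set."""
    return [[region_id in selected_regions for region_id in row] for row in region_map]

def boundary_pixels(mask):
    """Return mask pixels that touch a non-mask pixel or the frame edge (4-neighbourhood)."""
    height, width = len(mask), len(mask[0])
    boundary = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c]:
                continue
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                outside = not (0 <= nr < height and 0 <= nc < width)
                if outside or not mask[nr][nc]:
                    boundary.append((r, c))
                    break
    return boundary
```

Taking the union of whole regions means the boundary follows edges found by the spatial segmentation of the current frame rather than a projected contour from the previous frame.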
10. A computer readable medium having instructions for tracking semantic objects in a vector image sequence of image frames, the medium comprising:
a spatial segmentation module for segmenting a vector image frame in the image sequence into regions, each region comprising connected groups of image points having image values that satisfy a homogeneity criterion;
a motion estimator module for estimating the motion between each of the regions in the input image frame and a reference frame and for determining a motion parameter that approximates the motion of each region between the image frame and the target frame; and
a region classifier for applying the motion parameter of each region to the region to compute a predicted region in the reference frame, for evaluating whether a boundary of each predicted region falls at least partially within a boundary of a semantic object of the reference frame, and classifying each region as being part of semantic object in the reference frame based on the extent to which the predicted region falls within the boundary of a semantic object boundary of the reference frame;
wherein a boundary of a semantic object in the image frame is formed from each region classified as being part of a corresponding semantic object in the reference frame.
Dependent claims: 11, 12, 13, 14
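As a rough illustration of what a motion estimator module could compute for one region, the sketch below searches exhaustively for a single translation that minimizes the mean absolute intensity difference between the region and the reference frame. The claim only requires "a motion parameter that approximates the motion"; the translation-only model, the error measure, the function name `estimate_translation`, and the `search` parameter are assumptions chosen for brevity.

```python
def estimate_translation(cur, ref, region_points, search=2):
    """Exhaustive search for the (d_row, d_col) that best maps a region into ref.

    cur, ref: 2D lists of intensities with identical dimensions
    region_points: (row, col) points of one region in cur
    search: maximum displacement tried in each direction (hypothetical parameter)
    """
    height, width = len(ref), len(ref[0])
    best_motion, best_error = (0, 0), float("inf")
    for d_row in range(-search, search + 1):
        for d_col in range(-search, search + 1):
            error, count = 0, 0
            for row, col in region_points:
                r, c = row + d_row, col + d_col
                if 0 <= r < height and 0 <= c < width:
                    error += abs(cur[row][col] - ref[r][c])
                    count += 1
            if count == 0:
                continue  # the whole region fell outside the reference frame
            mean_error = error / count
            if mean_error < best_error:
                best_motion, best_error = (d_row, d_col), mean_error
    return best_motion
```

The returned displacement points from the current frame into the reference frame, so applying it to a region's points yields the predicted region that the region classifier then evaluates.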
15. A method for tracking semantic objects in vector image sequences, the method comprising:
performing spatial segmentation on an image frame to identify regions of discrete image points with homogenous image values;
performing motion estimation between each of the regions in the image frame and a target image frame in which a boundary of a semantic object is known;
using the motion estimation for each region to warp the image points in each region to locations in the target frame;
determining whether the warped pixel locations of each region are within a boundary of a semantic object in the target frame and when at least a threshold amount of the region overlaps a semantic object in the target frame, classifying the region as originating from the semantic object in the target frame; and
forming a boundary of the semantic object in the image frame as a combination of each of the regions in the image frame that are classified as originating from the semantic object of the target frame.
Dependent claims: 16, 17, 18, 19
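Claim 15 differs from claim 1 in requiring that "at least a threshold amount" of a region overlap the object. One way to read that test is as a fraction of the region's warped points, as sketched here; the name `originates_from`, the boolean mask representation, and the default threshold of 0.5 are assumptions, since the claim does not fix a value.

```python
def originates_from(warped_points, target_mask, threshold=0.5):
    """True when at least `threshold` of the region's warped points land inside the object.

    warped_points: (row, col) locations of a region after warping into the target frame
    target_mask: 2D list of booleans, True inside the semantic object (assumed representation)
    threshold: minimum overlap fraction (hypothetical default)
    """
    if not warped_points:
        return False
    height, width = len(target_mask), len(target_mask[0])
    inside = sum(
        1
        for r, c in warped_points
        if 0 <= r < height and 0 <= c < width and target_mask[r][c]
    )
    return inside / len(warped_points) >= threshold
```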
20. A method for tracking semantic objects in vector image sequences, the method comprising:
performing spatial segmentation on an image frame to identify regions of discrete image points with homogenous image values, where each of the regions are connected group of image points, and where each region is determined to be homogenous by only adding neighboring image points to the region where the difference in intensity values between a maximum and minimum image value in the region after adding each neighboring image point is below a threshold;
performing region based motion estimation between each of the regions in the image frame and an immediate previous image frame in the vector image sequence;
using the motion estimated for each region to warp the image points in each region to locations in the immediate previous frame;
determining whether the warped pixel locations of each region are within a boundary of a semantic object in the target frame and when at least a threshold amount of the region overlaps a semantic object in the target frame, classifying the region as originating from the semantic object in the target frame; and
forming a boundary for each semantic object in the image frame as a combination of each of the regions in the image frame that are classified as originating from the semantic object of the immediate previous frame.
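The homogeneity criterion recited in claim 20 (a neighboring point is added only if the difference between the region's maximum and minimum values stays below a threshold after the addition) can be sketched as a simple region-growing routine. The breadth-first visiting order, the 4-connectivity, scalar intensities, and the name `grow_region` are assumptions; the claim does not specify them.

```python
from collections import deque

def grow_region(image, seed, threshold):
    """Grow a 4-connected region from seed, keeping max - min intensity below threshold."""
    height, width = len(image), len(image[0])
    lo = hi = image[seed[0]][seed[1]]
    region = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < height and 0 <= nc < width) or (nr, nc) in region:
                continue
            value = image[nr][nc]
            new_lo, new_hi = min(lo, value), max(hi, value)
            # Accept the neighbor only if the region stays homogeneous after adding it.
            if new_hi - new_lo < threshold:
                region.add((nr, nc))
                frontier.append((nr, nc))
                lo, hi = new_lo, new_hi
    return region
```

Tracking the running minimum and maximum bounds the intensity spread of the whole region, unlike a test that compares each new point only with its immediate neighbor, which can let intensity drift gradually across a region.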
Specification