Method for segmenting multi-resolution video objects
Abstract
A method for segmenting video objects in a video sequence composed of frames, each including pixels, first assigns a feature vector to each pixel of the video. Next, selected pixels are identified as marker pixels. Pixels adjacent to each marker pixel are assembled into a corresponding volume of pixels if the distance between the feature vector of the marker pixel and the feature vector of the adjacent pixels is less than a first predetermined threshold. After all pixels have been assembled into volumes, a first score and descriptors are assigned to each volume. At this point, each volume represents a segmented video object. The volumes are then sorted in a high-to-low order according to the first scores, and further processed in that order. Second scores, dependent on the descriptors of pairs of volumes, are determined. The volumes are iteratively combined if the second score passes a second threshold to generate a video object in a multi-resolution video object tree, which is complete when the combined volume or video object is the entire video.
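The volume-growing step summarized in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not taken from the patent: the video is stored as nested lists `features[t][y][x]` of feature tuples, the distance is Euclidean, adjacency is 6-connected across space and time, growth proceeds outward from pixels already in the volume while always comparing against the marker's feature vector, and the next marker is simply the next unlabeled pixel in raster order. The names `grow_volume` and `segment_into_volumes` are hypothetical.

```python
from collections import deque
from math import dist

# 6-connected neighbourhood in (t, y, x): assumed for illustration.
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def grow_volume(features, marker, threshold, labels, label):
    """Grow one volume from `marker`; return the number of pixels assigned."""
    T, H, W = len(features), len(features[0]), len(features[0][0])
    marker_vec = features[marker[0]][marker[1]][marker[2]]
    labels[marker] = label
    queue = deque([marker])
    size = 1
    while queue:
        t, y, x = queue.popleft()
        for dt, dy, dx in NEIGHBOURS:
            n = (t + dt, y + dy, x + dx)
            if not (0 <= n[0] < T and 0 <= n[1] < H and 0 <= n[2] < W):
                continue  # outside the video
            if n in labels:
                continue  # already belongs to some volume
            # Compare the candidate against the marker's feature vector.
            if dist(marker_vec, features[n[0]][n[1]][n[2]]) < threshold:
                labels[n] = label
                queue.append(n)
                size += 1
    return size


def segment_into_volumes(features, threshold):
    """Assign every pixel to a volume; return (labels, sizes)."""
    labels, sizes = {}, []
    for t in range(len(features)):
        for y in range(len(features[0])):
            for x in range(len(features[0][0])):
                if (t, y, x) not in labels:
                    # Assumed marker rule: next unlabeled pixel in raster order.
                    sizes.append(grow_volume(features, (t, y, x), threshold, labels, len(sizes)))
    return labels, sizes
```

With a two-frame video whose left half is dark and right half is bright, this sketch yields two volumes that each span both frames, which is the sense in which a volume has both spatial and temporal extent.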
41 Citations
12 Claims
1. A method for segmenting a video including a plurality of pixels into a plurality of video objects, comprising:
assigning a feature vector to each pixel of the video;
identifying selected pixels of the video as marker pixels;
assembling each marker pixel and pixels adjacent to the marker pixel into a corresponding volume if the distance between the feature vector of the marker pixel and the feature vector of the adjacent pixels is less than a first predetermined threshold;
assigning a first score and descriptors to each volume;
sorting the volumes in a high-to-low order according to the first scores; and
processing the volumes in the high-to-low order, the processing for each volume comprising:
comparing the descriptor of the volume to the descriptor of an adjacent volume to determine a second score;
combining the volume with the adjacent volume if the second score passes a second threshold to generate a video object in a multi-resolution video object tree; and
repeating the comparing and combining steps until a single volume representing the video remains.
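The sorting and combining steps of claim 1 can be sketched as follows. The claim does not say what the scores or descriptors are, so this sketch assumes them: the first score is the pixel count, the descriptor is a mean intensity, the second score is the absolute difference of means, and "passes" means falling below the threshold. The rule that relaxes the threshold when no pair qualifies is also an assumption, added so the loop can reach a single root. `Node` and `build_object_tree` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    ident: int
    size: int      # first score (assumed: pixel count)
    mean: float    # descriptor (assumed: mean intensity)
    children: list = field(default_factory=list)


def build_object_tree(volumes, adjacency, threshold, relax=2.0):
    """volumes: {id: (size, mean)}; adjacency: iterable of (id_a, id_b) pairs."""
    if threshold <= 0 or relax <= 1:
        raise ValueError("need threshold > 0 and relax > 1")
    nodes = {i: Node(i, size, mean) for i, (size, mean) in volumes.items()}
    neighbours = {i: set() for i in nodes}
    for a, b in adjacency:
        neighbours[a].add(b)
        neighbours[b].add(a)
    next_id = max(nodes) + 1
    while len(nodes) > 1:
        pair = None
        # Visit volumes in high-to-low order of the first score.
        for i in sorted(nodes, key=lambda k: nodes[k].size, reverse=True):
            passing = [j for j in neighbours[i]
                       if abs(nodes[i].mean - nodes[j].mean) < threshold]
            if passing:
                j = min(passing, key=lambda k: abs(nodes[i].mean - nodes[k].mean))
                pair = (i, j)
                break
        if pair is None:
            if not any(neighbours.values()):
                raise ValueError("adjacency graph is not connected")
            threshold *= relax  # assumed relaxation so merging can continue
            continue
        i, j = pair
        a, b = nodes.pop(i), nodes.pop(j)
        size = a.size + b.size
        mean = (a.mean * a.size + b.mean * b.size) / size
        nodes[next_id] = Node(next_id, size, mean, [a, b])
        # The parent inherits the neighbours of both children.
        merged = (neighbours.pop(i) | neighbours.pop(j)) - {i, j}
        neighbours[next_id] = merged
        for n in merged:
            neighbours[n] -= {i, j}
            neighbours[n].add(next_id)
        next_id += 1
    return next(iter(nodes.values()))
```

Each merge creates a parent node one level above its two children, so the leaves are the original volumes and the returned root stands for the entire video.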
4. The method of claim 3, further comprising:
applying a spatial-domain 2D median filter 210 to the frames 102 to remove intensity singularities, without disturbing edge formation.
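As a rough illustration of the filtering step in claim 4, the sketch below applies a square median window to a single frame. The window radius, the edge handling (coordinates clamped to the frame border), and the representation of a frame as a list of rows of intensities are assumptions made here; the claim specifies none of them. `median_filter_2d` is a hypothetical name.

```python
from statistics import median


def median_filter_2d(frame, radius=1):
    """Return a copy of `frame` filtered with a (2*radius+1)^2 median window."""
    h, w = len(frame), len(frame[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp window coordinates to the frame border (assumed edge rule).
            window = [
                frame[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            out[y][x] = median(window)  # odd window size, so an actual sample
    return out


def filter_video(frames, radius=1):
    """Filter each frame independently, i.e. in the spatial domain only."""
    return [median_filter_2d(f, radius) for f in frames]
```

A single outlier pixel in a flat region is replaced by the surrounding value, while a straight step edge passes through unchanged, which matches the stated goal of removing intensity singularities without disturbing edges.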
12. A method for segmenting a video sequence of frames, each frame including a plurality of pixels, comprising:
partitioning all of the pixels of all frames of the video into a plurality of volumes according to features of each pixel, the pixels of each volume having frame-based spatial coordinates and sequence-based temporal coordinates;
assigning descriptors to each volume;
representing each volume as a video object at a lowest level in a multi-resolution video object tree; and
iteratively combining volumes according to the descriptors, and representing each combined volume as a video object at intermediate levels of the multi-resolution video object tree, until all of the combined volumes form the entire video represented as a video object at a highest level of the multi-resolution video object tree.
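Claim 12 relies on descriptors computed from pixels that carry both spatial and temporal coordinates. The sketch below shows one way such descriptors might be derived from a non-empty list of `(t, y, x)` pixel coordinates. The particular descriptors chosen (pixel count, first and last frame, per-frame centroid trajectory) are illustrative assumptions, not the descriptors defined by the patent, and `describe_volume` is a hypothetical name.

```python
from collections import defaultdict


def describe_volume(pixels):
    """Compute simple descriptors for a non-empty iterable of (t, y, x) pixels."""
    per_frame = defaultdict(list)
    for t, y, x in pixels:
        per_frame[t].append((y, x))
    frames = sorted(per_frame)
    # Centroid of the volume's cross-section in each frame it occupies.
    trajectory = {}
    for t in frames:
        pts = per_frame[t]
        trajectory[t] = (
            sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts),
        )
    return {
        "size": sum(len(pts) for pts in per_frame.values()),
        "first_frame": frames[0],
        "last_frame": frames[-1],
        "trajectory": trajectory,
    }
```

Descriptors of this kind could then be compared between adjacent volumes when deciding which pair to combine at the next level of the tree.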
Specification