Apparatus and method for processing video data
Abstract
An apparatus and methods for processing video data are described. The invention provides a representation of video data that can be used to assess agreement between the data and a fitting model for a particular parameterization of the data. This allows the comparison of different parameterization techniques and the selection of the optimum one for continued video processing of the particular data. The representation can be utilized in intermediate form as part of a larger process or as a feedback mechanism for processing video data. When utilized in its intermediate form, the invention can be used in processes for storage, enhancement, refinement, feature extraction, compression, coding, and transmission of video data. The invention serves to extract salient information in a robust and efficient manner while addressing the problems typically associated with video data sources.
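As an illustration only, the idea of assessing agreement between data and a fitting model, and then selecting among parameterizations, can be sketched in Python. The abstract does not say how agreement is measured; the mean squared residual used here, the candidate models, and all names below are assumptions made for the example.

```python
def mean_squared_residual(model, pairs):
    """Average squared distance between model(src) and the observed dst."""
    total = 0.0
    for src, dst in pairs:
        px, py = model(src)
        total += (px - dst[0]) ** 2 + (py - dst[1]) ** 2
    return total / len(pairs)


def select_model(candidates, pairs):
    """Score every candidate and return the name with the lowest residual."""
    scores = {name: mean_squared_residual(fn, pairs)
              for name, fn in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical observations: each point is scaled by 2, then shifted by (1, 0).
pairs = [((0, 0), (1, 0)), ((1, 0), (3, 0)),
         ((0, 1), (1, 2)), ((1, 1), (3, 2))]

# Two hypothetical, fixed parameterizations to compare.
candidates = {
    "translation": lambda p: (p[0] + 1, p[1]),
    "scale_and_shift": lambda p: (2 * p[0] + 1, 2 * p[1]),
}

best, scores = select_model(candidates, pairs)
```

A raw residual tends to favor the more flexible model, so a practical comparison would probably also account for model complexity.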
21 Claims
1. A computer-implemented method for processing video signal data from a plurality of video frames, the method comprising:
detecting an object in two or more given video frames, each video frame being formed of pel data;
tracking the detected object through the two or more video frames;
segmenting pel data corresponding to the detected object from other pel data in the two or more video frames so as to generate a first intermediate form of the video signal data, the segmenting utilizing a spatial segmentation of the pel data;
generating correspondence models of elements of the detected object, each correspondence model relating an element of the detected object in one video frame to a corresponding element of the detected object in another video frame; and
using the correspondence models, normalizing the segmented pel data, said normalizing including modeling global motion of the detected object and resulting in re-sampled pel data corresponding to the detected object in the two or more video frames, the re-sampled pel data providing an object-based encoded form of the video signal data normalized as output;
the object-based encoded form being able to be decoded by:
(i) restoring spatial positions of the re-sampled pel data by utilizing the correspondence models, thereby generating restored pels corresponding to the detected object; and
(ii) recombining the restored pel data together with the other pel data in the first intermediate form of the video signal data to re-create an original video frame; and
wherein generating correspondence models includes estimating a multi-dimensional projective motion model.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
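Claim 1 recites estimating a projective motion model but does not say how. One common way to fit such a model is to solve for a 3x3 planar homography from four point correspondences. The sketch below does that in plain Python. It is an illustrative assumption, not the patented method, and the function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate correspondences")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def estimate_projective(src, dst):
    """Fit H (3x3, with H[2][2] fixed to 1) from exactly four point pairs."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged to be linear
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        # v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1), rearranged to be linear
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_projective(h, point):
    """Map a point through H, dividing by the projective coordinate w."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return u, v


# Hypothetical element positions in one frame and in another frame.
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(0, 0), (2, 0), (3, 2), (0, 1)]
H = estimate_projective(src, dst)
```

With more than four correspondences, a least-squares or robust fit would normally replace the exact solve shown here.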
12. A computer-implemented method of generating an encoded form of video signal data from a plurality of video frames, the method comprising:
detecting an object in two or more video frames of the plurality of video frames, each video frame being formed of pel data;
tracking the detected object through the two or more video frames, the detected object having one or more elements;
for an element of the detected object in one video frame, identifying a corresponding element of the detected object in the other video frames;
analyzing the corresponding elements to generate relationships between the corresponding elements;
forming correspondence models for the detected object by using the generated relationships between the corresponding elements;
normalizing pel data corresponding to the detected object in the two or more video frames by utilizing the formed correspondence models and a deformable mesh, said normalizing generating re-sampled pel data representing an object-based encoded form of the video signal data; and
rendering the object-based encoded form of the video signal data for subsequent use, the object-based encoded form enabling restoring of spatial positions of the re-sampled pel data by utilizing the correspondence models, and generating restored pel data of the detected object;
wherein the detecting and tracking comprise using any one or combination of a Viola/Jones face detection algorithm and Principal Component Analysis.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21
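Claim 12 names Viola/Jones face detection as one option for the detecting step. That detector is commonly described as evaluating rectangular Haar-like features on an integral image (summed-area table). The following Python sketch shows only that building block. The function names are illustrative and not taken from the patent, and no classifier cascade or training is shown.

```python
def integral_image(img):
    """Summed-area table with one row and one column of zero padding."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pel values inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Haar-like feature: left half minus right half (width must be even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


# Hypothetical 3x4 block of pel intensities.
pels = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12]]
ii = integral_image(pels)
```

Because every rectangle sum costs four lookups regardless of its size, many features can be evaluated per candidate window at low cost.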
Specification