Feature-Based Video Compression
Abstract
Systems and methods of processing video data are provided. Video data having a series of video frames is received and processed. One or more instances of a candidate feature are detected in the video frames. The previously decoded video frames are processed to identify potential matches of the candidate feature. When a substantial number of portions of previously decoded video frames include instances of the candidate feature, those instances are aggregated into a set. The candidate feature set is used to create a feature-based model. The feature-based model includes a model of deformation variation and a model of appearance variation of instances of the candidate feature. The compression efficiency of the feature-based model is compared with that of conventional video compression.
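The comparison step in the abstract can be illustrated with a minimal sketch. It assumes the encoder already has an estimated bit cost for each portion under both approaches; the `PortionCost` structure, the `choose_encoding` function, and the tie-breaking rule are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class PortionCost:
    """Estimated bit costs for one portion of a frame (hypothetical structure)."""
    portion_id: int
    feature_bits: float       # estimated bits using the feature-based model
    conventional_bits: float  # estimated bits using conventional encoding


def choose_encoding(costs):
    """Pick, per portion, whichever model is estimated to compress better."""
    decisions = {}
    for c in costs:
        # Ties go to conventional encoding; this is an arbitrary choice for the sketch.
        if c.feature_bits < c.conventional_bits:
            decisions[c.portion_id] = "feature"
        else:
            decisions[c.portion_id] = "conventional"
    return decisions


costs = [
    PortionCost(0, feature_bits=120.0, conventional_bits=300.0),
    PortionCost(1, feature_bits=410.0, conventional_bits=260.0),
    PortionCost(2, feature_bits=200.0, conventional_bits=200.0),
]
print(choose_encoding(costs))
```

The result is a mixed assignment: some portions of a frame use feature-based encoding and the rest use conventional encoding, which is the hybrid outcome the claims describe.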
35 Claims
1. A computer method of processing video data comprising the computer implemented steps of:
receiving video data formed of a series of video frames; and
encoding portions of the video frames by:
detecting one or more instances of a candidate feature in one or more of the video frames;
said detection determining positional information for instances in the one or more previously decoded video frames, the positional information including a frame number, a position within that frame, and a spatial perimeter of the instance;
said candidate feature being a set of one or more detected instances;
predicting, by a motion compensated prediction process, a portion of a current video frame in the series using one or more previously decoded video frames;
said motion compensated prediction process being initialized with positional predictions, where the positional predictions provide the positional information from detected feature instances in previously decoded video frames;
using one or more of the candidate feature instances that are transformed by augmenting the motion compensated prediction process, defining one or more features along with the transformed instances to create a first feature-based model, the first feature-based model enabling prediction in the current frame of an appearance and a source position of a substantially matching feature instance, where the substantially matching feature instance is a key feature instance;
comparing the first feature-based model to a conventional video encoding model of the one or more defined features, and determining from the comparison which model enables greater encoding compression; and
using results of the comparing and determining step, applying feature-based encoding to portions of one or more of the video frames, and applying conventional video encoding to other portions of the one or more video frames.
Dependent claims: 2-30
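As an illustration of the positional information recited in claim 1 and of a motion search initialized from it, here is a minimal sketch. The `FeatureInstance` record, the use of a rectangular bounding box as the "spatial perimeter", the sum-of-absolute-differences cost, and the small search radius are all assumptions made for the example, not details given in the claim.

```python
from dataclasses import dataclass


@dataclass
class FeatureInstance:
    """Positional information for one detected instance (hypothetical layout)."""
    frame_number: int
    position: tuple   # (row, col) of the top-left corner in that frame
    perimeter: tuple  # (height, width); a bounding box stands in for the perimeter


def sad(cur, ref, cur_pos, ref_pos, size):
    """Sum of absolute differences between two equally sized blocks."""
    (cr, cc), (rr, rc), (h, w) = cur_pos, ref_pos, size
    return sum(abs(cur[cr + i][cc + j] - ref[rr + i][rc + j])
               for i in range(h) for j in range(w))


def seeded_search(cur, ref, cur_pos, instance, radius=1):
    """Search the reference frame around the instance's stored position,
    rather than around the co-located block, and return (cost, position)."""
    h, w = instance.perimeter
    r0, c0 = instance.position
    best = None
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            r, c = r0 + dr, c0 + dc
            if r < 0 or c < 0 or r + h > len(ref) or c + w > len(ref[0]):
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, cur_pos, (r, c), (h, w))
            if best is None or cost < best[0]:
                best = (cost, (r, c))
    return best


# Toy 6x6 frames: a bright 2x2 patch sits at (1, 1) in the reference frame
# and at (3, 3) in the current frame.
ref = [[0] * 6 for _ in range(6)]
cur = [[0] * 6 for _ in range(6)]
for i in range(2):
    for j in range(2):
        ref[1 + i][1 + j] = 9
        cur[3 + i][3 + j] = 9

# The detector's positional prediction is one row off; the search refines it.
instance = FeatureInstance(frame_number=0, position=(2, 1), perimeter=(2, 2))
print(seeded_search(cur, ref, (3, 3), instance))
```

Seeding the search with the detected position lets a small radius find a match that a search centred on the co-located block at (3, 3) would miss with the same radius.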
31. A digital processing system for processing video data having one or more video frames comprising:
one or more computer processors executing an encoder;
the encoder using feature-based encoding to encode portions of the video frames by:
detecting one or more instances of a candidate feature in one or more of the video frames;
using a motion compensated prediction process, segmenting the one or more instances of the candidate feature from non-features in the one or more video frames, the motion compensated prediction process selecting previously decoded video frames having features corresponding to the one or more instances of the candidate feature;
defining one or more feature instances using one or more of the instances of the candidate feature, where the one or more defined feature instances are predicted to provide relatively increased compactness in the feature-based encoding relative to conventional video encoding;
determining positional information from the one or more previously decoded video frames, the positional information including a position and a spatial perimeter of the one or more defined feature instances in the one or more previously decoded video frames;
forming a feature-based model using the one or more defined feature instances, the feature-based model including the positional information from the previously decoded video frames;
normalizing the one or more defined feature instances using the feature-based model, said normalizing using the positional information from the one or more previously decoded video frames as a positional prediction, resulting normalization being prediction of the one or more defined feature instances in the current video frame;
comparing the feature-based model to a conventional video encoding model for one or more of the defined features, and determining from the comparison which model enables greater encoding compression; and
using results of the comparing and determining step, applying feature-based encoding to portions of one or more of the video frames, and applying conventional video encoding to other portions of the one or more video frames.
32. A method of processing video data comprising:
receiving video data having a series of video frames;
detecting a candidate feature in one or more of the video frames;
segmenting the candidate feature from non-features in the video frame by employing reference frame processing used in a motion compensated prediction process;
processing the one or more portions of previously decoded video frames to identify potential matches of the candidate feature;
determining that a substantial amount of the portions of previously decoded video frames include instances of the candidate feature;
aggregating the instances of the candidate feature into a set of instances of the candidate feature;
processing the candidate feature set to create a feature-based model, where the feature-based model includes a model of deformation variation and a model of appearance variation of the instances of the candidate feature, the appearance variation models being created by modeling pel variation of the instances of the candidate feature, the deformation variation models being created by modeling pel correspondence variation of the instances of the candidate feature;
determining compression efficiency associated with using the feature-based model to model the candidate feature set;
determining compression efficiency associated with using conventional video compression to model the candidate feature set;
comparing the feature-based model compression efficiency with the conventional video modeling compression efficiency, and determining which one is of greater compression value;
encoding the video data using the feature-based models and conventional video encoding based on which one is of greater compression value.
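Claim 32 says only that the appearance variation model is created by "modeling pel variation" across the instances. One way such a model could look, offered purely as an assumption, is a linear model: a mean appearance plus a dominant direction of variation, found here by power iteration (a one-component, PCA-style decomposition). The function names are hypothetical, and the instances are assumed to be already normalized to the same size and flattened into lists of pel values.

```python
import math


def mean_vector(rows):
    """Per-pel mean across all instances."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def dominant_direction(centered, iterations=50):
    """Unit vector along the largest pel variation (power iteration on X^T X).
    The fixed, non-uniform start vector is a simplification for this sketch."""
    dim = len(centered[0])
    v = [float(j + 1) for j in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        proj = [sum(a * b for a, b in zip(row, v)) for row in centered]
        w = [sum(p * row[j] for p, row in zip(proj, centered)) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variation along the current direction
        v = [x / norm for x in w]
    return v


def build_appearance_model(instances):
    """Return (mean appearance, dominant variation direction)."""
    mean = mean_vector(instances)
    centered = [[p - m for p, m in zip(inst, mean)] for inst in instances]
    return mean, dominant_direction(centered)


def encode(instance, model):
    """Represent an instance by a single coefficient along the basis."""
    mean, basis = model
    return sum((p - m) * b for p, m, b in zip(instance, mean, basis))


def reconstruct(coeff, model):
    """Synthesize pel values from the coefficient."""
    mean, basis = model
    return [m + coeff * b for m, b in zip(mean, basis)]


# Three flattened 2x2 instances that differ along one direction only.
instances = [
    [8.0, 20.0, 32.0, 36.0],
    [10.0, 20.0, 30.0, 40.0],
    [12.0, 20.0, 28.0, 44.0],
]
model = build_appearance_model(instances)
errors = [
    max(abs(p - q) for p, q in zip(inst, reconstruct(encode(inst, model), model)))
    for inst in instances
]
print(errors)  # reconstruction errors are close to zero for this toy data
```

In this toy case each four-pel instance is carried by one coefficient once the model is shared, which is the kind of compactness the efficiency comparison in the claim would weigh against conventional coding. A deformation model over pel correspondences would need a separate treatment and is not sketched here.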
33. A digital processing system for processing video data having one or more video frames comprising:
one or more computer processors executing an encoder;
the encoder using feature-based encoding to encode portions of the video frames by:
detecting a candidate feature in one or more of the video frames;
segmenting the candidate feature from non-features in the video frame by employing reference frame processing used in a motion compensated prediction process;
processing the one or more portions of previously decoded video frames to identify potential matches of the candidate feature;
determining that a substantial amount of the portions of previously decoded video frames include instances of the candidate feature;
aggregating the instances of the candidate feature into a set of instances of the candidate feature;
processing the candidate feature set to create a feature-based model, where the feature-based model includes a model of deformation variation and a model of appearance variation of the instances of the candidate feature, the appearance variation models being created by modeling pel variation of the instances of the candidate feature, the structural variation models being created by modeling pel correspondence variation of the instances of the candidate feature;
determining compression efficiency associated with using the feature-based model to model the candidate feature set;
determining compression efficiency associated with using conventional video compression to model the candidate feature set;
comparing the feature-based model compression efficiency with the conventional video modeling compression efficiency, and determining which one is of greater compression value;
encoding the video data using the feature-based models and conventional video encoding based on which one is of greater compression value.
34. A method of processing video data comprising:
decoding encoded video data by determining on a macroblock level whether there is an encoded feature in the encoded video data;
in response to determining that there is no encoded feature in the encoded video data, decoding using conventional video decoding;
in response to determining that there is an encoded feature in the encoded video data, separating the encoded feature from the encoded video data in order to synthesize the encoded feature instance separately from the conventionally encoded portions of the video data;
determining feature-based models and feature parameters associated with the encoded feature;
using the determined feature-based models and feature parameters to synthesize the encoded feature instance; and
combining conventionally encoded portions of the video data with the synthesized feature instances to reconstruct original video data.
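The macroblock-level branching in claim 34 can be sketched as a simple dispatch loop. Everything concrete here is an assumption for illustration: the dictionary layout of a macroblock, the `feature_id` and `params` fields, the stand-in `decode_conventional` function, and the toy linear synthesis (model mean plus a weighted basis).

```python
def decode_conventional(macroblock):
    """Stand-in for a conventional decoder; here the payload is already pels."""
    return macroblock["payload"]


def synthesize_feature(macroblock, models):
    """Rebuild a feature instance from its model and decoded parameters."""
    model = models[macroblock["feature_id"]]
    weight = macroblock["params"][0]
    # Toy synthesis: mean appearance plus one weighted basis vector.
    return [m + weight * b for m, b in zip(model["mean"], model["basis"])]


def decode_frame(macroblocks, models):
    """Decide per macroblock whether an encoded feature is present,
    decode each path separately, and combine the results in order."""
    output = []
    for mb in macroblocks:
        if mb.get("feature_id") is None:
            output.append(decode_conventional(mb))
        else:
            output.append(synthesize_feature(mb, models))
    return output


models = {7: {"mean": [10, 20], "basis": [1, -1]}}
macroblocks = [
    {"payload": [1, 2]},                 # no encoded feature: conventional path
    {"feature_id": 7, "params": [2.0]},  # encoded feature: synthesis path
]
print(decode_frame(macroblocks, models))
```

Keeping the two paths separate until the final combination mirrors the claim's requirement that the feature instance be synthesized separately from the conventionally encoded portions.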
35. A data processing system for processing video data comprising:
one or more computer processors executing a hybrid codec decoder capable of using video data decoding by:
decoding an encoded video data by determining on a macroblock level whether there is an encoded feature in the encoded video data;
in response to determining that there is no encoded feature in the encoded video data, decoding using conventional video decoding;
in response to determining that there is an encoded feature in the encoded video data, separating the encoded feature from the encoded video data in order to synthesize the encoded feature instance separately from the conventionally encoded portions of the video data;
determining feature-based models and feature parameters associated with the encoded feature;
using the determined feature-based models and feature parameters to synthesize the encoded feature instance; and
combining conventionally encoded portions of the video data with the synthesized features of the video data to reconstruct an original video data.
Specification