STANDARDS-COMPLIANT MODEL-BASED VIDEO ENCODING AND DECODING

US 20130230099A1
Filed: 03/12/2013
Published: 09/05/2013
Est. Priority Date: 07/30/2004
Status: Active Grant

Abstract

A model-based compression codec applies higher-level modeling to produce better predictions than can be found through conventional block-based motion estimation and compensation. Computer-vision-based feature and object detection algorithms identify regions of interest throughout the video datacube. The detected features and objects are modeled with a compact set of parameters, and similar feature/object instances are associated across frames. Associated features/objects are formed into tracks and related to specific blocks of video data to be encoded. The tracking information is used to produce model-based predictions for those blocks of data, enabling more efficient navigation of the prediction search space than is typically achievable through conventional motion estimation methods. A hybrid framework enables modeling of data at multiple fidelities and selects the appropriate level of modeling for each portion of video data. A compliant-stream version of the model-based compression codec uses the modeling information indirectly to improve compression while producing bitstreams that can be interpreted by standard decoders.

32 Claims

1. A method for processing video data, comprising:
- receiving multiple frames of video data;
  
  forming tracking information by:
  
  detecting at least one of a feature and an object in a region of interest of the video data using a detection algorithm in at least one frame;
  
  modeling the detected at least one of the feature and the object using a set of parameters; and
  
  associating any instances of the detected and modeled at least one of the feature and the object across plural frames of the video data, resulting in at least one track of the associated instances, each track providing tracking information of respective associated instances;
  
  relating the at least one track to at least one specific block of video data to be encoded; and
  
  producing a model-based prediction for the at least one specific block of video data using the tracking information of the at least one related track, the model-based prediction having model-based motion vectors, and said producing including incorporating the model-based motion vectors into a standards-compliant bit stream such that the model-based prediction is stored as standards-compliant encoded video data.
- Dependent Claims (2, 3, 4, 5, 6)
  - 2. The method of claim 1 wherein the detection algorithm is of a class of nonparametric feature detection algorithms.
  - 3. The method of claim 1, wherein the set of parameters includes information about the at least one of the feature and the object and is stored in memory.
  - 4. The method of claim 3, wherein the respective parameter of the respective feature includes a feature descriptor vector and a location of the respective feature.
  - 5. The method of claim 4, wherein the respective parameter is generated when the respective feature is detected.
  - 6. The method of claim 1, wherein the at least one specific block of video data is a macroblock, the at least one track relating features to the macroblock.
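
The steps recited in claim 1 (associate detected instances into tracks, relate a track to a block, derive a model-based motion vector) can be illustrated with a minimal sketch. This is not the patented implementation: the names `Instance`, `Track`, `associate`, `relates_to_block`, and `model_based_motion_vector` are hypothetical, the greedy nearest-descriptor matching and the `max_sq_dist` threshold are assumptions, and the motion vector convention (reference position minus target position) is chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """One detected feature instance (hypothetical layout)."""
    frame: int
    x: float
    y: float
    descriptor: tuple


@dataclass
class Track:
    """Associated instances of one feature across frames."""
    instances: list = field(default_factory=list)

    def at(self, frame):
        """Return the instance recorded for `frame`, or None."""
        for inst in self.instances:
            if inst.frame == frame:
                return inst
        return None


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def associate(detections_per_frame, max_sq_dist=0.5):
    """Greedy association (an assumption): each track takes the closest
    unmatched detection in the next frame if it is close enough."""
    tracks = []
    for detections in detections_per_frame:
        unmatched = list(detections)
        for track in tracks:
            if not unmatched:
                break
            last = track.instances[-1].descriptor
            best = min(unmatched, key=lambda d: sq_dist(d.descriptor, last))
            if sq_dist(best.descriptor, last) <= max_sq_dist:
                track.instances.append(best)
                unmatched.remove(best)
        # Detections that matched no existing track start new tracks.
        tracks.extend(Track([d]) for d in unmatched)
    return tracks


def relates_to_block(track, frame, bx, by, size=16):
    """True if the track's instance in `frame` lies inside the block
    whose top-left corner is (bx, by)."""
    inst = track.at(frame)
    return (inst is not None
            and bx <= inst.x < bx + size
            and by <= inst.y < by + size)


def model_based_motion_vector(track, target_frame, ref_frame):
    """Displacement from the target-frame instance to the reference-frame
    instance, or None if the track is missing either frame."""
    cur, ref = track.at(target_frame), track.at(ref_frame)
    if cur is None or ref is None:
        return None
    return (ref.x - cur.x, ref.y - cur.y)
```

In a compliant-stream encoder, a vector derived this way would be written into the bit stream as an ordinary motion vector, so a standard decoder needs no knowledge of the tracks.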

7-11. (canceled)

12. A codec for processing video data, comprising:
- a feature-based detector configured to identify instances of a feature in at least two video frames, where each identified feature instance includes a plurality of pixels exhibiting data complexity relative to other pixels in one or more of the at least two video frames;
  
  a modeler operatively coupled to the feature based detector and configured to create feature-based models modeling correspondence of the feature instances in two or more video frames; and
  
  a cache configured to prioritize use of the feature-based models if it is determined that a standards-compliant encoding of associated video data that is derived from the feature-based models provides improved compression efficiency relative to a standards-compliant encoding of the associated video data that uses a first video encoding process.
- Dependent Claims (13, 14, 15, 16, 17, 18, 19)
  - 13. The codec of claim 12, wherein the data complexity is determined when an encoding of the pixels by a conventional video compression technique exceeds a predetermined threshold.
  - 14. The codec of claim 12, wherein the data complexity is determined when a bandwidth amount allocated to encode the feature by conventional video compression technique exceeds a predetermined threshold.
  - 15. The codec of claim 14, wherein the predetermined threshold is at least one of:
    - a preset value, a preset value stored in a database, a value set as the average bandwidth amount allocated for previously encoded features, and a value set as the median bandwidth amount allocated for previously encoded features.
  - 16. The codec of claim 12, wherein the first video encoding process includes a motion compensation prediction process.
  - 17. The codec of claim 12, wherein the prioritization of use is determined by comparison of encoding costs for each potential solution within Competition Mode, a potential solution comprising a tracker, a primary prediction motion model, a primary prediction sampling scheme, a subtiling scheme for motion vector calculation and a reconstruction algorithm.
  - 18. The codec of claim 17, wherein the prioritization of use of the feature-based modeling initiates a use of that data complexity level of the feature instance as the threshold value, such that if a future feature instance exhibits the same or more data complexity level as the threshold value then the encoder automatically determines to initiate and use feature-based compression on the future feature instance.
  - 19. The codec of claim 12, wherein the feature detector utilizes one of an FPA tracker, an MBC tracker, and a SURF tracker.
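
Claims 14 and 15 describe a data-complexity test against a threshold that may be a preset value or the average or median bandwidth of previously encoded features. A minimal sketch of that test follows; the function names, the `mode` strings, and the use of bit counts as the bandwidth measure are assumptions made for illustration.

```python
from statistics import mean, median


def complexity_threshold(previous_bits, mode="average", preset=None):
    """Pick a threshold per the options listed in claim 15.

    previous_bits: bit counts spent on previously encoded features
    (hypothetical bandwidth measure).
    """
    if mode == "preset":
        if preset is None:
            raise ValueError("preset mode needs a preset value")
        return preset
    if not previous_bits:
        raise ValueError("no previously encoded features to summarize")
    if mode == "average":
        return mean(previous_bits)
    if mode == "median":
        return median(previous_bits)
    raise ValueError(f"unknown mode: {mode}")


def exceeds_threshold(feature_bits, threshold):
    """True when conventional encoding of the feature costs more than
    the threshold, i.e. the feature counts as complex."""
    return feature_bits > threshold
```

The average and median can disagree noticeably when a few features were very expensive, so the choice of mode changes which features are routed to feature-based modeling.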

20. A codec for processing video data, comprising:
- a feature-based detector to identify an instance of a feature in at least two video frames, an identified feature instance including a plurality of pixels exhibiting data complexity relative to other pixels in at least one of the at least two video frames;
  
  a modeler operatively coupled to the feature-based detector, wherein the modeler creates a feature-based model modeling correspondence of the respective identified feature instance in the at least two video frames; and
  
  a memory, wherein for a plurality of the feature-based models, the memory prioritizes standards compliant use of a respective feature-based model if an improved compression efficiency of associated video data is determined, said standards compliant use of the respective feature-based model including storing model based prediction information in an encoding stream.
- Dependent Claims (21)
  - 21. The codec of claim 20, wherein the improved compression efficiency of the identified feature instance is determined by comparing the compression efficiency of the identified feature relative to one of:
    - a standards compliant encoding of the feature instance using a first video encoding process and a predetermined compression efficiency value stored in a database.

22. A method for processing video data, comprising:
- modeling a feature by vectorizing at least one of a feature pel and a feature descriptor;
  
  identifying similar features by at least one of (a) minimizing means-squared error (MSE) and (b) maximizing inner products between different feature pel vectors or feature descriptors; and
  
  applying a standard motion estimation and compensation algorithm to account for translational motion of the feature, resulting in identified similar features;
  
  from the identified similar features, producing feature modeling prediction information and deriving motion vectors;
  
  storing the feature modeling prediction information in standards-compliant encoded video data including encoding motion vectors.
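
The two similarity criteria named in claim 22 can be sketched on vectorized feature pels as follows. The function names are hypothetical, and normalizing the vectors before taking the inner product is an assumption added here so that bright (large-magnitude) candidates are not favored; the claim itself only says "maximizing inner products".

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length pel vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def normalized_inner_product(a, b):
    """Inner product of the unit-normalized vectors (assumption);
    returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def most_similar_by_mse(target, candidates):
    """Index of the candidate with the smallest MSE to the target."""
    return min(range(len(candidates)),
               key=lambda i: mse(target, candidates[i]))


def most_similar_by_inner_product(target, candidates):
    """Index of the candidate with the largest normalized inner product."""
    return max(range(len(candidates)),
               key=lambda i: normalized_inner_product(target, candidates[i]))
```

Both criteria operate on flattened pel blocks or descriptors of the same length; translational motion is handled separately by the standard motion estimation step the claim recites.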

23. A method for processing video data, comprising:
- implementing a model-based prediction by configuring a codec to encode a target frame;
  
  encoding a macroblock in the target frame using a conventional encoding process, resulting in a macroblock encoding;
  
  analyzing the macroblock encoding such that the macroblock encoding is deemed to be at least one of efficient and inefficient according to a codec standard;
  
  wherein if the macroblock encoding is deemed inefficient, analyzing candidate standards-compliant model-based encodings of the macroblock by generating several predictions for the macroblock based on multiple models, and applying the generated predictions, resulting in plural candidate standards-compliant model-based encodings of the macroblock, evaluating the resulting candidate standards-compliant model-based encodings of the macroblock according to encoding size; and
  
  ranking the candidate standards-compliant model-based encodings of the macroblock along with the conventionally encoded macroblock.
- Dependent Claims (24, 25, 26, 27, 28, 29)
  - 24. The method of claim 23, wherein the conventional encoding of the macroblock is efficient if an encoding size is less than a predetermined threshold size.
  - 25. The method of claim 23, wherein the conventional encoding of the macroblock is efficient if the target macroblock is a skip macroblock.
  - 26. The method of claim 23, wherein the conventional encoding of the macroblock is inefficient if the encoding size is larger than a threshold.
  - 27. The method of claim 23, wherein if the conventional encoding of the macroblock is deemed inefficient, Competition Mode encodings for the macroblock are generated to compare their relative compression efficiencies.
  - 28. The method of claim 27, wherein the encoding algorithm for Competition Mode includes:
    - subtracting the prediction from the macroblock to generate a residual signal;
      
      transforming the residual signal using an approximation of a 2-D block-based DCT; and
      
      encoding transform coefficients using an entropy encoder.
  - 29. The method of claim 23 wherein the encoder being analyzed by generating several predictions includes generating a composite prediction that sums a primary prediction and a weighted version of a secondary prediction.
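
The ranking in claim 23, the residual step of claim 28, and the composite prediction of claim 29 can be sketched together. The function names are hypothetical, and `cost_proxy` is a deliberate simplification: it uses the sum of absolute residuals as a stand-in for encoding size, whereas the claims describe transforming the residual with a DCT approximation and entropy-coding the coefficients.

```python
def residual(block, prediction):
    """Residual signal: the prediction subtracted from the macroblock."""
    return [b - p for b, p in zip(block, prediction)]


def composite_prediction(primary, secondary, weight):
    """Primary prediction plus a weighted secondary prediction."""
    return [p + weight * s for p, s in zip(primary, secondary)]


def cost_proxy(block, prediction):
    """Stand-in for encoding size (assumption): sum of absolute
    residuals. A real encoder would transform and entropy-code."""
    return sum(abs(r) for r in residual(block, prediction))


def rank_candidates(block, predictions):
    """Rank named predictions, conventional one included, from
    cheapest to most expensive under the cost proxy."""
    return sorted(predictions, key=lambda name: cost_proxy(block, predictions[name]))
```

Because the conventional prediction is ranked alongside the model-based candidates, the encoder falls back to conventional coding whenever no model does better.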

30. A method for processing video data, comprising:
- modeling data at multiple fidelities in a model-based compression, the multiple fidelities including at least one of a macroblock level, a macroblock as feature level, a feature level, and an object level,
  
  wherein the macroblock level uses a block-based motion estimation and compensation (BBMEC) application to find predictions for each tile from a limited search space in previously decoded reference frames,
  
  wherein the macroblock as feature level (i) uses a first BBMEC application identical to the macroblock level to find a first prediction for a target macroblock from a most-recent reference frame, (ii) uses a second BBMEC application to find a second prediction for the first prediction by searching in a second-most-recent frame, and (iii) creates a track for the target macroblock by applying BBMEC applications through progressively older frames,
  
  wherein the feature level detects and tracks features independent of the macroblock grid and associates the features with overlapping macroblocks such that feature tracks are used to navigate previously-decoded reference frames to find better matches for the overlapping macroblocks; and
  
  where multiple features overlap a given target macroblock, the feature with greatest overlap is selected to model that target macroblock, and the feature tracks identifying certain motion vectors, and
  
  wherein the object level an object encompasses or overlaps multiple macroblocks, a single motion vector can be calculated for all of the macroblocks associated with the object to result in computation and encoding size savings; and
  
  storing one of model-based prediction information and motion vectors in a standards compliant bit stream resulting in standards compliant encoded video data.
- Dependent Claims (31, 32)
  - 31. The method of claim 30, wherein the multiple fidelities are examined sequentially.
  - 32. The method of claim 30, wherein the multiple fidelities are examined in competition mode.
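
The feature-level rule in claim 30, that the feature with greatest overlap models a target macroblock, can be sketched with axis-aligned rectangles. Representing features and macroblocks as `(x, y, width, height)` tuples is an assumption, as are the function names; the claim does not specify how overlap is measured.

```python
def overlap_area(a, b):
    """Intersection area of two (x, y, width, height) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return w * h if w > 0 and h > 0 else 0


def select_feature(macroblock, features):
    """Return the name of the feature with the greatest overlap,
    or None if no feature overlaps the macroblock.

    features: dict mapping a feature name to its rectangle.
    """
    best_name, best_area = None, 0
    for name, rect in features.items():
        area = overlap_area(macroblock, rect)
        if area > best_area:
            best_name, best_area = name, area
    return best_name
```

When `select_feature` returns None, a hybrid encoder of the kind described would fall back to one of the other fidelities, such as plain block-based motion estimation at the macroblock level.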

Current Assignee
Euclid Discoveries LLC
Original Assignee
Euclid Discoveries LLC
Inventors
DeForest, Darin; Pace, Charles P.; Lee, Nigel; Pizzorni, Renato

Granted Patent

US 9,743,078 B2
US Class Current

375/240.08
CPC Class Codes

H04N 19/23   with coding of regions that...

H04N 19/50   using predictive coding H04...

H04N 19/51   Motion estimation or motion...

H04N 19/543   using regions

H04N 19/85   using pre-processing or pos...
