Context Based Video Encoding and Decoding

US 20130107948A1
Filed: 12/21/2012
Published: 05/02/2013
Est. Priority Date: 03/31/2005
Status: Active Grant

First Claim

Patent Images

1. A method for processing video data, comprising:

detecting at least one of a feature and an object in the region of interest using a detection algorithm in at least one frame;

modeling the detected at least one of the feature and the object using a set of parameters;

associating any instances of the at least one of the feature and the object across frames;

forming at least one track of the associated instances;

relating the at least one track to at least one specific block of video data to be encoded; and

producing a model-based prediction for the at least one specific block of video data using the related track information, said producing including storing the model-based prediction as processed video data.

View all claims

3 Assignments

Timeline View

Assignment View

0 Petitions

Accused Products

Abstract

A model-based compression codec applies higher-level modeling to produce better predictions than can be found through conventional block-based motion estimation and compensation. Computer-vision-based feature and object detection algorithms identify regions of interest throughout the video datacube. The detected features and objects are modeled with a compact set of parameters, and similar feature/object instances are associated across frames. Associated features/objects are formed into tracks and related to specific blocks of video data to be encoded. The tracking information is used to produce model-based predictions for those blocks of data, enabling more efficient navigation of the prediction search space than is typically achievable through conventional motion estimation methods. A hybrid framework enables modeling of data at multiple fidelities and selects the appropriate level of modeling for each portion of video data.

38 Citations

View as Search Results

32 Claims

1. A method for processing video data, comprising:
- detecting at least one of a feature and an object in the region of interest using a detection algorithm in at least one frame;
  
  modeling the detected at least one of the feature and the object using a set of parameters;
  
  associating any instances of the at least one of the feature and the object across frames;
  
  forming at least one track of the associated instances;
  
  relating the at least one track to at least one specific block of video data to be encoded; and
  
  producing a model-based prediction for the at least one specific block of video data using the related track information, said producing including storing the model-based prediction as processed video data.
- View Dependent Claims (2, 3, 4, 5, 6)
- - 2. The method of claim 1 wherein the detection algorithm is of a class of nonparametric feature detection algorithms.
  - 3. The method of claim 1, wherein the set of parameters includes information about the at least one of the feature and the object and is stored in memory.
  - 4. The method of claim 3, wherein the respective parameter of the respective feature includes a feature descriptor vector and a location of the respective feature.
  - 5. The method of claim 4, wherein the respective parameter is generated when the respective feature is detected.
  - 6. The method of claim 1, wherein the at least one specific block of video data is a macroblock, the at least one track relating features to the macroblock.

7. A method for processing video data, comprising:
- detecting at least one of a feature and an object in the region of interest;
  
  modeling the at least one of the feature and the object using a set of parameters;
  
  associating any instances of the at least one of the feature and the object across frames;
  
  forming at least one matrix of the associated instances;
  
  relating the at least one matrix to at least one specific block of video data to be encoded; and
  
  producing a model-based prediction for the at least one specific block of video data using the related matrix information, said producing storing the model-based prediction as processed video data.
- View Dependent Claims (8, 9, 10, 11)
- - 8. The method of claim 7, wherein the set of parameters includes information about the at least one of the feature and the object and is stored in memory.
  - 9. The method of claim 8, wherein the respective parameter of the respective feature includes a feature descriptor vector and a location of the respective feature.
  - 10. The method of claim 9, wherein the respective parameter is generated when the respective feature is detected.
  - 11. The method of claim 7, further comprising:
    - summarizing the at least one matrix using at least one subspace of a certain vector space as a parametric model of the associated at least one of the feature and the object.

12. A codec for processing video data, comprising:
- a feature-based detector configured to identify instances of a feature in at least two video frames, where each identified feature instance includes a plurality of pixels exhibiting data complexity relative to other pixels in the one or more video frames;
  
  a modeler operatively coupled to the feature based detector and configured to create feature-based correspondence models modeling correspondence of the feature instances in two or more video frames; and
  
  a cache configured to prioritize use of the feature-based correspondence models if it is determined that an encoding of the feature instances using the feature-based correspondence models provides improved compression efficiency relative to an encoding of the feature instances using a first video encoding process.
- View Dependent Claims (13, 14, 15, 16, 17, 18, 19)
- - 13. The codec of claim 12, wherein the data complexity is determined when an encoding of the pixels by a conventional video compression technique exceeds a predetermined threshold.
  - 14. The codec of claim 12, wherein the data complexity is determined when a bandwidth amount allocated to encode the feature by conventional video compression technique exceeds a predetermined threshold.
  - 15. The codec of claim 14, wherein the predetermined threshold is at least one of:
    - a preset value, a preset value stored in a database, a value set as the average bandwidth amount allocated for previously encoded features, and a value set as the median bandwidth amount allocated for previously encoded features.
  - 16. The codec of claim 12, wherein the first video encoding process includes a motion compensation prediction process.
  - 17. The codec of claim 12, wherein the prioritization of use is determined by comparison of encoding costs for each potential solution within Competition Mode, a potential solution comprising a tracker, a key prediction motion model, a key prediction sampling scheme, a subtiling scheme, a reconstruction algorithm, and (possibly) a secondary prediction scheme.
  - 18. The codec of claim 17, wherein the prioritization of use of the feature-based modeling initiates a use of that data complexity level of the feature instance as the threshold value, such that if a future feature instance exhibits the same or more data complexity level as the threshold value then the encoder automatically determines to initiate and use feature-based compression on the future feature instance.
  - 19. The codec of claim 12, wherein the feature detector utilizes one of an FPA tracker, an MBC tracker, and a SURF tracker.

20. A codec for processing video data, comprising:
- a feature-based detector to identify an instance of a feature in at least two video frames, an identified feature instance including a plurality of pixels exhibiting data complexity relative to other pixels in at least one of the at least two video frames;
  
  a modeler operatively coupled to the feature-based detector, wherein the modeler creates a feature-based correspondence model modeling correspondence of the respective identified feature instance in the at least two video frames; and
  
  a memory, wherein for a plurality of the feature-based correspondence models, the memory prioritizes use of a respective feature-based correspondence model if an improved compression efficiency of the identified feature instance is determined.
- View Dependent Claims (21)
- - 21. The codec of claim 20, wherein the improved compression efficiency of the identified feature instance is determined by comparing the compression efficiency of the identified feature relative to one of:
    - an encoding of the feature instance using a first video encoding process and a predetermined compression efficiency value stored in a database.

22. A method for processing video data, comprising:
- modeling a feature by vectorizing at least one of a feature pel and a feature descriptor;
  
  identifying similar features by at least one of (a) minimizing means-squared error (MSE) and (b) maximizing inner products between different feature pel vectors or feature descriptors; and
  
  applying a standard motion estimation and compensation algorithm to account for translational motion of the feature, resulting in processed video data.

23. A method for processing video data, comprising:
- implementing a model-based prediction by configuring a codec to encode a target frame;
  
  encoding a macroblock in the target frame using a conventional encoding process;
  
  analyzing the macroblock encoding, wherein the conventional encoding of the macroblock is deemed to be at least one of efficient and inefficient;
  
  wherein if the conventional encoding is deemed inefficient, the encoder is analyzed by generating several predictions for the macroblock based on multiple models, andwherein the evaluation of the several predictions of the macroblock are based on an encoding size; and
  
  ranking the predictions of the macroblock with the conventionally encoded macroblock;
- View Dependent Claims (24, 25, 26, 27, 28, 29)
- - 24. The method of claim 23, wherein the conventional encoding of the macroblock is efficient if an encoding size is less than a predetermined threshold size.
  - 25. The method of claim 23, wherein the conventional encoding of the macroblock is efficient if the target macroblock is a skip macroblock.
  - 26. The method of claim 23, wherein the conventional encoding of the macroblock is inefficient if the encoding size is larger than a threshold.
  - 27. The method of claim 23, wherein if the conventional encoding of the macroblock is deemed inefficient, Competition Mode encodings for the macroblock are generated to compare their relative compression efficiencies.
  - 28. The method of claim 27, wherein the encoding algorithm for Competition Mode includes:
    - subtracting the prediction from the macroblock to generate a residual signal;
      
      transforming the residual signal using an approximation of a 2-D block-based DCT; and
      
      encoding transform coefficients using an entropy encoder.
  - 29. The method of claim 23 wherein the encoder being analyzed by generating several predictions includes generating a composite prediction that sums a primary prediction and a weighted version of a secondary prediction.

30. A method for processing video data, comprising:
- modeling data at multiple fidelities for a model-based compression, the multiple fidelities including at least one of a macroblock level, a macroblock as feature level, a feature level, and an object level,wherein the macroblock level uses a block-based motion estimation and compensation (BBMEC) application to find predictions for each tile from a limited search space in previously decoded reference frames,wherein the macroblock as feature level (i) uses a first BBMEC application identical to the macroblock level to find a first prediction for a target macroblock from a most-recent reference frame, (ii) uses a second BBMEC application to find a second prediction for the first prediction by searching in a second-most-recent frame, and (iii) creates a track for the target macroblock by applying BBMEC applications through progressively older frames,wherein the feature level detects and tracks features independent of the macroblock grid and associates the features with overlapping macroblocks such that feature tracks are used to navigate previously-decoded reference frames to find better matches for the overlapping macroblocks; and
  
  where multiple features overlap a given target macroblock, the feature with greatest overlap is selected to model that target macroblock, andwherein the object level an object encompasses or overlaps multiple macroblocks, a single motion vector can be calculated for all of the macroblocks associated with the object to result in computation and encoding size savings.
- View Dependent Claims (31, 32)
- - 31. The method of claim 30, wherein the multiple fidelities are examined sequentially.
  - 32. The method of claim 30, wherein the multiple fidelities are examined in competition mode.

Specification

Resources

Litigation Campaign Assessment

Current Assignee
Euclid Discoveries LLC
Original Assignee
Euclid Discoveries LLC
Inventors
DeForest, Darin, Lee, Nigel, Pizzorni, Renato, Pace, Charles P.

Granted Patent

US 9,578,345 B2
Time in Patent Office

Days
Field of Search
US Class Current

375/240.08
CPC Class Codes

H04N 19/20 using video object coding

H04N 19/503 involving temporal predicti...

Context Based Video Encoding and Decoding

First Claim

3 Assignments

0 Petitions

Accused Products

Abstract

38 Citations

32 Claims

Specification

Solutions

Use Cases

Quick Links

Context Based Video Encoding and Decoding

First Claim

3 Assignments

Subscription Required

Subscription Required

0 Petitions

Subscription Required

Accused Products

Subscription Required

Abstract

38 Citations

32 Claims

Specification

Subscription Required

Solutions

Use Cases

Quick Links