Object-based video compression process employing arbitrarily-shaped features

US 5,933,535 A
Filed: 06/04/1996
Issued: 08/03/1999
Est. Priority Date: 10/05/1995
Status: Expired due to Term

Abstract

Video encoding and decoding processes provide compression and decompression of digitized video signals representing display motion in video sequences of multiple image frames. The encoder process utilizes object- or feature-based video compression to improve the accuracy and versatility of encoding interframe motion and intraframe image features. Video information is compressed relative to objects or features of arbitrary configurations, rather than fixed, regular arrays of pixels as in conventional video compression methods. This reduces the error components and thereby improves the compression efficiency and accuracy. The decoder process decompresses the encoded video information to reconstruct the objects or features of arbitrary configurations.

8 Claims

1. A method for encoding a sequence of video image frames, each frame including at least one arbitrarily shaped video object, the method comprising:
- encoding video objects in each frame separately, where at least one of the objects is segmented from the frames in the video sequence and includes a mask for each of the frames defining the shape of the object in each frame, a composite bitmap formed from a combination of pixels of the object in the frames such that the composite bitmap includes portions of the object that are not visible in some of the frames, and trajectories for each frame describing a motion transform of the object for each frame used to transform the composite bitmap to a position in corresponding frames of the video sequence;
  
  computing error signals for the object, including;
  
  a) dividing the object into blocks of pixel locations, where at least some of the blocks overlap a boundary of the object;
  
  b) for each block, computing motion parameters that estimate the motion between a current frame in the sequence and a previously reconstructed object from a previous frame, where the motion parameters are computed separately from the trajectories,
  
  c) computing a predicted object for the current frame by applying the motion parameters for each block to the previously reconstructed object;
  
  d) transforming the mask associated with the object for the previous frame to the current frame using the trajectories associated with the current frame;
  
  e) intersecting the transformed mask with the mask for the current frame to identify at least a first portion of the current mask that is outside the transformed mask, the pixels in the first portion being represented by the composite bitmap;
  
  f) computing a difference between an original object for the current frame and the predicted object to compute error signals for the object;
  
  g) compressing the error signals for the object for the current frame; and
  
  h) repeating steps a-g to compute error signals associated with the object for frames in the video sequence;
  
  wherein a compressed version of the object for the video sequence includes a single composite bitmap for the sequence, trajectories for the frames in the sequence, error signals for the frames in the sequence, and motion parameters for each block of the object for the frames in the sequence.
- Dependent Claims (2, 3, 4, 5)
  - 2. The method of claim 1 wherein the motion parameters comprise affine transform coefficients for each block derived by:
    - computing a motion vector for each pixel in the object that falls within the block;
      
      selecting motion vectors with an error below a predetermined threshold; and
      
      from the selected motion vectors, deriving the affine transform coefficients.
  - 3. The method of claim 1 wherein the previously reconstructed object is a quantized object and further including:
    - transform coding the error signals for each block in the object using a lossy, block based transform coding method;
      
      performing an inverse transform coding of the transform coded error signals for each block to compute quantized error signals for each block;
      
      adding the quantized error signals for each block with the predicted object to compute the quantized object, where the quantized object is then used as the previously reconstructed object for the next frame in the video sequence.
  - 4. The method of claim 3 wherein error signals for the blocks that overlap the boundary of the object are extrapolated such that error signals for the block have a rectangular configuration before performing the transform coding step.
  - 5. A computer readable medium having instructions for performing the steps of claim 1.
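
The closed prediction loop of claims 1 and 3 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the patented implementation: pixels are held in a flat list of integers, uniform quantization stands in for the lossy block-based transform coding and its inverse, and the default `predict` simply copies the previous reconstruction in place of per-block motion compensation. The names `lossy_round_trip`, `encode_object`, `predict`, and `step` are hypothetical.

```python
from typing import Callable, List, Tuple

Pixels = List[int]  # assumption: one integer sample per pixel, flattened


def lossy_round_trip(error: Pixels, step: int) -> Pixels:
    """Stand-in for transform coding followed by inverse transform coding.

    Uniform quantization is used here only to make the loss visible.
    """
    return [step * round(e / step) for e in error]


def encode_object(
    frames: List[Pixels],
    first_reconstruction: Pixels,
    predict: Callable[[Pixels], Pixels] = lambda prev: list(prev),
    step: int = 4,
) -> Tuple[List[Pixels], Pixels]:
    """Return the quantized error signal per frame and the last reconstruction."""
    previous = first_reconstruction
    coded_errors: List[Pixels] = []
    for original in frames:
        # Predict from the previously reconstructed (quantized) object.
        predicted = predict(previous)
        # Error signal: original object minus predicted object.
        error = [o - p for o, p in zip(original, predicted)]
        # Lossy coding and its inverse give the quantized error signal.
        quantized_error = lossy_round_trip(error, step)
        coded_errors.append(quantized_error)
        # Quantized object, used as the reference for the next frame.
        previous = [p + q for p, q in zip(predicted, quantized_error)]
    return coded_errors, previous


coded, final = encode_object([[9, 20, 31], [13, 20, 25]], [0, 0, 0])
print(coded, final)
```

Because each frame is predicted from the quantized reconstruction rather than from the original, the encoder works from the same reference a decoder would hold, so quantization error does not accumulate from frame to frame.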

6. A method for decoding a sequence of video image frames, each frame including at least one arbitrarily shaped video object, the method comprising:
- decoding video objects in each frame separately, where at least one of the objects is segmented from each of the frames in the video sequence and includes a mask for each of the frames defining the shape of the object in each frame, a composite bitmap formed from a combination of pixels of the object in the frames such that the composite bit map includes portions of the object that are not visible in some of the frames, and trajectories for each frame describing a motion transform of the object for each frame used to transform the composite bitmap to a position in corresponding frames of the video sequence;
  
  decoding error signals for the object for a current frame, including;
  
  a) for each block, decoding motion parameters that estimate the motion between a current frame in the sequence and a previously reconstructed object from a previous frame, where the motion parameters are computed separately from the trajectories,
  
  b) computing a predicted object for the current frame by applying the motion parameters for each block to the previously reconstructed object;
  
  c) transforming the mask associated with the object for the previous frame to the current frame using the trajectories associated with the current frame;
  
  e) intersecting the transformed mask with the mask for the current frame to identify at least a first portion of the current mask that is outside the transformed mask, the pixels in the first portion being represented by the composite bitmap;
  
  f) decompressing the error signals for the object for the current frame;
  
  g) adding the decompressed error signals for the object for the current frame to the predicted object to compute a reconstructed object for the current frame; and
  
  h) repeating steps a-g to reconstruct the object for frames in the video sequence
  
  wherein a compressed version of the object for the video sequence includes a single composite bitmap for the sequence, trajectories for the frames in the sequence, error signals for the frames in the sequence, and motion parameters for each block of the object for the frames in the sequence.
- Dependent Claims (7)
  - 7. A computer readable medium having instructions for performing the steps of claim 6.
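
The mask steps shared by the encoding and decoding claims (transform the previous mask, then intersect it with the current mask) can be illustrated with a short Python sketch. It assumes that a mask is a set of `(x, y)` pixel coordinates and that the trajectory is a pure integer translation `(dx, dy)`; the claims speak only of a "motion transform", so both are simplifications, and the function names are hypothetical.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]  # (x, y) coordinate of a pixel inside the object


def transform_mask(mask: Set[Pixel], dx: int, dy: int) -> Set[Pixel]:
    """Move the previous frame's mask into the current frame.

    Assumption: the trajectory is a translation by (dx, dy).
    """
    return {(x + dx, y + dy) for x, y in mask}


def split_current_mask(
    previous_mask: Set[Pixel],
    current_mask: Set[Pixel],
    dx: int,
    dy: int,
) -> Tuple[Set[Pixel], Set[Pixel]]:
    """Split the current mask by whether the transformed mask covers it."""
    transformed = transform_mask(previous_mask, dx, dy)
    covered = current_mask & transformed    # inside the transformed mask
    uncovered = current_mask - transformed  # outside it: taken from the composite bitmap
    return covered, uncovered


previous = {(0, 0), (1, 0)}
current = {(1, 0), (2, 0), (3, 0)}
covered, uncovered = split_current_mask(previous, current, dx=1, dy=0)
print(sorted(covered), sorted(uncovered))
```

In this example the pixel at `(3, 0)` belongs to the current mask but not to the transformed previous mask, so it is the portion that the claims describe as being represented by the composite bitmap.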

8. A computer readable medium having a data structure representing a compressed sequence of video frames comprising:
- separately encoded video objects, where at least one of the objects is segmented from each of the frames in the video sequence and includes a mask for each of the frames defining the shape of the object in each frame, a composite bitmap formed from a combination of pixels of the object in each frame such that the composite bitmap includes portions of the object that are not visible in some of the frames, and trajectories for each frame describing a motion transform of the object for each frame used to transform the composite bitmap to a position in corresponding frames of the video sequence;
  
  encoded error signals for the object in each of the frames, where the error signals are arranged in an array of blocks of pixel locations that overlap the object in the corresponding frame, the encoded error signals including;
  
  for each block, motion parameters that estimate the motion between a current frame in the sequence and a previously reconstructed object from a previous frame, where the motion parameters are computed separately from the trajectories;
  
  for each block, error signals determined by;
  
  computing a predicted object for a frame by applying the motion parameters for each block to the previously reconstructed object;
  
  computing a difference between an original object for the current frame and the predicted object to compute error signals for the object;
  
  compressing the error signals for each block by using a lossy, transform coding method;
  
  wherein a compressed version of the object for the video sequence includes a single composite bitmap for the sequence, trajectories for each of the frames in the sequence, masks for each of the frames, compressed blocks of error signals for the frames in the sequence, and motion parameters for each block of the object for the frames in the sequence; and
  
  wherein the masks and corresponding trajectories are used to indicate which portion of the object is to be reconstructed from the composite bitmap for a selected frame by transforming a mask of a previously reconstructed frame and intersecting the transformed mask with a mask for the selected frame to identify whether a portion of the mask for the selected frame is outside the transformed mask, the pixels in the portion outside the transformed mask being represented by the composite bitmap.
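
One way to picture the data structure of claim 8 is the Python sketch below. The claim names the components (a single composite bitmap, per-frame masks and trajectories, and per-block motion parameters and compressed error signals) but not a storage layout, so the class names, field names, and field types here are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BlockData:
    """Per-block data for one frame (hypothetical layout)."""
    motion_parameters: Tuple[float, ...]  # e.g. affine coefficients for the block
    compressed_error: bytes               # lossy transform-coded error signals


@dataclass
class FrameData:
    """Per-frame data for one object (hypothetical layout)."""
    mask: bytes                     # shape of the object in this frame
    trajectory: Tuple[float, ...]   # motion transform for this frame
    blocks: List[BlockData] = field(default_factory=list)


@dataclass
class CompressedObject:
    """One separately encoded video object for the whole sequence."""
    composite_bitmap: bytes  # a single bitmap shared by every frame
    frames: List[FrameData] = field(default_factory=list)


obj = CompressedObject(composite_bitmap=b"\x00")
obj.frames.append(FrameData(mask=b"\x01", trajectory=(1.0, 0.0, 0.0, 0.0, 1.0, 0.0)))
obj.frames[0].blocks.append(BlockData(motion_parameters=(0.0, 0.0), compressed_error=b""))
print(len(obj.frames), len(obj.frames[0].blocks))
```

Holding the composite bitmap once at the object level, with everything else nested per frame and per block, mirrors the claim's distinction between the single bitmap for the sequence and the data carried for each frame.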

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Lee, Ming-Chieh; Powell, William Chambers III
Primary Examiner(s): Au, Amelia
Assistant Examiner(s): Johnson, Timothy M.
Application Number: US08/657,272
Time in Patent Office: 1,155 Days
Field of Search: 382/236, 382/239, 382/243, 382/241, 382/107, 348/407, 348/699, 348/700, 348/413, 348/416, 348/431, 345/436, 345/474
US Class Current: 382/243
CPC Class Codes

G06F 17/153   Multidimensional correlatio...

G06T 2207/10016   Video; Image sequence

G06T 7/223   using block-matching

G06T 9/20   Contour coding, e.g. using ...

G06V 10/7515   Shifting the patterns to ac...

H04N 19/00   Methods or arrangements for...

H04N 19/186   the unit being a colour or ...

H04N 19/20   using video object coding

H04N 19/23   with coding of regions that...

H04N 19/51   Motion estimation or motion...

H04N 19/517   by encoding

H04N 19/537   Motion estimation other tha...

H04N 19/54   using feature points or meshes

H04N 19/543   using regions

H04N 19/563   Motion estimation with padd...

H04N 19/61   in combination with predict...

H04N 19/63   using sub-band based transf...

H04N 19/649   the transform being applied...
