Image compression and affine transformation for image motion compensation

US 5,970,173 A
Filed: 06/04/1996
Issued: 10/19/1999
Est. Priority Date: 10/05/1995
Status: Expired due to Term

Abstract

A transformation method provides a multi-dimensional affine transformation for representing motion between corresponding image components of successive video image frames. The multi-dimensional affine transformations can represent complex motion that includes any or all of translation, rotation, magnification, and shear. The transformation method of this invention includes determining motion transformations between corresponding pixels in the image components of the first and second video image frames. From the motion transformations between corresponding pixels, multi-dimensional affine motion transformations between the corresponding image components are determined. This transformation method increases the accuracy with which complex motion is represented and results in fewer compression or encoding errors in comparison to conventional methods, thereby increasing compression efficiency.
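The four motion types named in the abstract can be combined into one set of six affine coefficients. The following is a minimal sketch, not taken from the patent text: the function names, the parameter names, and the composition order (shear, then magnification, then rotation, then translation) are assumptions chosen for illustration.

```python
import math

def affine_from_motion(tx, ty, angle, zoom, shear):
    """Build coefficients (a, b, c, d, e, f) such that
    x' = a*x + b*y + c  and  y' = d*x + e*y + f.

    Assumed composition order: shear, then magnification (zoom),
    then rotation by `angle` radians, then translation (tx, ty).
    """
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    # Linear part = zoom * R(angle) * [[1, shear], [0, 1]]
    a = zoom * cos_t
    b = zoom * (shear * cos_t - sin_t)
    d = zoom * sin_t
    e = zoom * (shear * sin_t + cos_t)
    return (a, b, tx, d, e, ty)

def apply_affine(coeffs, point):
    """Map one pixel coordinate through the affine transformation."""
    a, b, c, d, e, f = coeffs
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)

if __name__ == "__main__":
    # Pure translation moves every pixel by the same offset.
    print(apply_affine(affine_from_motion(3, -2, 0.0, 1.0, 0.0), (5, 5)))
    # Rotation by 90 degrees with 2x magnification and a shift of (1, 0).
    print(apply_affine(affine_from_motion(1, 0, math.pi / 2, 2.0, 0.0), (1, 0)))
```

A single translation vector per block corresponds to the special case where the linear part is the identity matrix; the other three motion types change the linear part.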

443 Citations

32 Claims

1. In a method of encoding, in a compressed format, information within a video image frame sequence having first and second video image frames, a method of determining quantized multi-dimensional motion transformations between corresponding image components of the first and second video image frames, comprising:
- determining multi-dimensional affine motion transformations between representations of the corresponding image components on the first and second video image frames; and
  
  quantizing the multi-dimensional affine motion transformations between the corresponding image components, wherein the quantizing step includes for each of the components;
  
  selecting reference pixel coordinates within each component in the second video image frame, including selecting a number of reference pixel coordinates to encode motion for an image component depending on complexity of motion of pixels in the image component, wherein two reference pixel coordinates are encoded for rotation and magnification; and
  
  three reference pixel coordinates are encoded for shear;
  
  applying a multi-dimensional affine motion transformation to the selected reference coordinates within each component to find corresponding pixel coordinates in the first video image frame; and
  
  encoding for transmission or storage the reference pixel coordinates and the relative positions of the corresponding pixel coordinates so that a motion transformation can be derived from the reference pixel coordinates and the corresponding pixel coordinates during decoding operations;
  
  wherein the encoding step includes independently encoding image components using the selected number of reference pixels such that the number of reference pixels encoded per image component vary depending on the complexity of the motion of the pixels within the image component.
- Dependent claims (2, 3, 4, 5)
  - 2. The method of claim 1 in which the multi-dimensional affine motion transformations between the corresponding image components are represented as affine transformation coefficients and quantizing the multi-dimensional affine motion transformations includes representing the affine transformation coefficients by selected pairs of corresponding pixels in the first and second video image frames.
  - 3. The method of claim 1 in which the corresponding image components include pixel arrays of 32×32 pixels.
  - 4. The method of claim 1 which further includes generating an error signal, said error signal representing data in the second video image frame not represented by the first video image frame and said quantized affine motion transformations.
  - 5. The method of claim 1 which includes determining multi-dimensional affine motion transformations, and quantizing same, for each of plural different multi-pixel image components of an object identified within the first and second video image frames.
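The quantizing steps of claim 1 can be pictured as follows: reference pixels in the second frame are mapped through the affine transformation, the results are rounded to integer pixel positions, and only the reference coordinates plus integer offsets are stored. A decoder then recovers the six coefficients from three such pairs. This is a rough sketch under stated assumptions: the function names, the tuple layout `(x, y, dx, dy)`, the use of rounding to the nearest integer, and the use of Cramer's rule for the decoder are all illustrative choices, not details confirmed by the claims.

```python
def apply_affine(coeffs, point):
    """x' = a*x + b*y + c, y' = d*x + e*y + f."""
    a, b, c, d, e, f = coeffs
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)

def encode_block(coeffs, ref_points):
    """Encode a block's motion as integer (x, y, dx, dy) tuples.

    ref_points are reference pixel coordinates in the second frame;
    (x + dx, y + dy) is the rounded corresponding pixel in the first frame.
    """
    encoded = []
    for x, y in ref_points:
        xp, yp = apply_affine(coeffs, (x, y))
        encoded.append((x, y, round(xp) - x, round(yp) - y))
    return encoded

def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def decode_block(encoded):
    """Recover (a, b, c, d, e, f) from exactly three encoded pixel pairs."""
    if len(encoded) != 3:
        raise ValueError("this sketch handles the three-pixel case only")
    rows = [[float(x), float(y), 1.0] for x, y, _, _ in encoded]
    det = _det3(rows)
    if abs(det) < 1e-12:
        raise ValueError("reference pixels are collinear")

    def solve(rhs):
        # Cramer's rule: replace one column at a time with the right-hand side.
        out = []
        for col in range(3):
            m = [row[:] for row in rows]
            for i in range(3):
                m[i][col] = rhs[i]
            out.append(_det3(m) / det)
        return out

    xs = [x + dx for x, _, dx, _ in encoded]
    ys = [y + dy for _, y, _, dy in encoded]
    return tuple(solve(xs) + solve(ys))
```

Because the decoder works from integer pixel positions, the precision of the recovered coefficients is set by the spacing of the reference pixels rather than by a fixed number of bits per coefficient.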

6. In a method of encoding, in a compressed format, information within a video sequence having first and second video image frames, each frame including an arbitrarily-shaped object therein, an improvement comprising:
- (a) identifying a plurality of multi-pixel image components in the first and second frames which encompass the arbitrarily-shaped object;
  
  (b) performing dense motion estimation processes to generate a plurality of dense motion vectors for each of said plural multi-pixel image components in the second frame, said plurality of dense motion vectors representing motion of individual pixels in said multi-pixel image components between the first and second video image frames; and
  
  (c) from said dense motion vectors, determining multi-dimensional motion transformations between the first and second video image frames for each of said plural multi-pixel image components in the second frame;
  
  (d) selecting reference pixel coordinates for the multi-pixel image components in the second frame, including selecting a number of reference pixels to encode motion for an image component depending on complexity of motion of pixels in the image component, wherein two reference pixels are encoded for rotation and magnification; and
  
  three reference pixels are encoded for shear;
  
  (e) applying a multi-dimensional motion transformation to the selected reference coordinates to find corresponding pixel coordinates in the first frame; and
  
  (f) encoding for transmission or storage the reference pixel coordinates and relative positions of the corresponding pixel coordinates so that transform coefficients can be derived from the reference pixel coordinates and the corresponding pixel coordinates during decoding operations;
  
  wherein the encoding step includes independently encoding image components using the selected number of reference pixels such that the number of reference pixels encoded per image component vary depending on the complexity of the motion of the pixels within the image component.
- Dependent claims (7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22)
  - 7. The method of claim 6 in which a plurality of said image components are square in shape, and comprise exactly 32 by 32 pixels.
  - 8. The method of claim 6 in which step (c) includes performing a singular value decomposition process.
  - 9. The method of claim 6 in which step (a) includes performing segmentation operations.
  - 10. The method of claim 6 in which step (b) includes performing a block match process.
  - 11. The method of claim 6 in which step (b) includes generating motion vectors for at least four pixels within each of said plural corresponding image components.
  - 12. The method of claim 6 which includes performing step (c) only for motion vectors having relatively high confidence.
  - 13. The method of claim 6 which includes representing said affine transformations by specifying locations of at least two pixels in the first image frame, and at least two corresponding pixels in the second image frame, wherein errors associated with truncation of affine transformation coefficients are avoided.
  - 14. The method of claim 13 which includes representing translation, rotation and zoom of said image component by specifying locations of exactly two pixels in the first image frame, and exactly two corresponding pixels in the second image frame.
  - 15. The method of claim 13 which includes representing translation, rotation, zoom, and shear of said image component by specifying locations of exactly three pixels in the first image frame, and exactly three corresponding pixels in the second image frame.
  - 16. The method of claim 15 in which said three pixels define an equilateral triangle in one of said image frames.
  - 17. The method of claim 16 in which one of said three pixels is at a center point of said image component, and the others of said pixels are at corners thereof.
  - 18. The method of claim 13 which includes specifying locations (x,y) of pixels in one of said frames, and specifying offsets (Δx,Δy) to the corresponding pixels in the other of said frames.
  - 19. The method of claim 6 in which the dense motion estimation process includes determining a motion vector for each pair of known-to-correspond pixels in said first and second frames.
  - 20. The method of claim 19 which includes identifying known-to-correspond pixel pairs by reference to a confidence measure.
  - 21. A computer-readable medium having stored thereon a data structure including multi-dimensional motion transformation information produced by the method of claim 6.
  - 22. A computer-readable medium storing computer-executable programming for performing the method of claim 6.
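Step (c) of claim 6 derives one transformation per image component from many per-pixel motion vectors, and claim 12 restricts the fit to vectors with relatively high confidence. The sketch below illustrates the idea with a weighted least-squares fit. Note the substitution: claim 8 names a singular value decomposition process, whereas this example solves the 3×3 normal equations with Cramer's rule so that it needs only the standard library. The function name, the sample layout `(x, y, dx, dy, weight)`, and the use of a numeric weight as the confidence measure are assumptions.

```python
def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_affine(samples):
    """Weighted least-squares affine fit to dense motion vectors.

    samples: iterable of (x, y, dx, dy, weight), where (x, y) is a pixel in
    the second frame, (x + dx, y + dy) is its match in the first frame, and
    weight >= 0 is a hypothetical confidence (0 drops the vector entirely).
    Returns (a, b, c, d, e, f) with x' = a*x + b*y + c, y' = d*x + e*y + f.
    """
    normal = [[0.0] * 3 for _ in range(3)]   # A^T W A
    rhs_x = [0.0] * 3                        # A^T W x'
    rhs_y = [0.0] * 3                        # A^T W y'
    for x, y, dx, dy, w in samples:
        row = (float(x), float(y), 1.0)
        for i in range(3):
            for j in range(3):
                normal[i][j] += w * row[i] * row[j]
            rhs_x[i] += w * row[i] * (x + dx)
            rhs_y[i] += w * row[i] * (y + dy)
    det = _det3(normal)
    if abs(det) < 1e-9:
        raise ValueError("need at least three non-collinear weighted pixels")

    def solve(rhs):
        out = []
        for col in range(3):
            m = [r[:] for r in normal]
            for i in range(3):
                m[i][col] = rhs[i]
            out.append(_det3(m) / det)
        return out

    return tuple(solve(rhs_x) + solve(rhs_y))
```

Giving a vector a weight of zero has the same effect as leaving it out, which is one simple way to honor a confidence threshold.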

23. In a method of encoding affine transformation data relating to motion of pixels between first and second image frames, an improvement comprising:
- deriving affine coefficients that approximate motion of pixels within a transformation block for each block in a group of transformation blocks of the second image frame;
  
  selecting reference pixel coordinates in each of the transformation blocks of the second image frame, including selecting a number of reference pixel coordinates to encode motion for an image component depending on complexity of motion of pixels in the image component;
  
  transforming the selected reference pixel coordinates for each of the transformation blocks with the derived affine coefficients for the block to find corresponding pixel coordinates in the first video image frame for each of the blocks; and
  
  representing the affine coefficients of the transformation blocks of the second image frame by encoding coordinate data of the reference pixels in each of the transformation blocks and encoding coordinate data of corresponding pixel coordinates of each of the transformation blocks, including converting the coordinate data to integer format, wherein truncation errors associated with representation of the affine coefficients are avoided and the affine coefficients are quantized by encoding the coefficients with the converted pixel coordinate data from which the coefficients can be derived during decoding operations;
  
  wherein the encoding step includes independently encoding image components using the selected number of reference pixels such that the number of reference pixels encoded per image component vary depending on the complexity of the motion of the pixels within the image component.
- Dependent claims (24, 25, 26, 27, 28, 29)
  - 24. The method of claim 23 in which said coordinate data for the corresponding pixels in one of said frames is represented as offsets from coordinates of the reference pixels in the other of said frames.
  - 25. The method of claim 23 in which said reference pixel coordinates comprise exactly two reference pixel locations, wherein the group of motions consisting of: translation, rotation, and magnification can be represented.
  - 26. The method of claim 23 in which said reference pixel coordinates comprise exactly three reference pixel locations, wherein the group of motions consisting of: translation, rotation, magnification, and shear, can be represented.
  - 27. The method of claim 23 in which at least one of said transformation blocks is square in shape, and at least one of said reference pixel coordinates is located at an edge of said transformation block.
  - 28. The method of claim 23 in which at least one of said transformation blocks is square in shape, and at least one of said reference pixel coordinates is located at a corner of said transformation block.
  - 29. The method of claim 23 in which at least one of said transformation blocks is square in shape, and said reference pixel coordinates define an equilateral triangle therein.
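Claim 25 states that two reference pixel locations suffice for translation, rotation, and magnification. One way to see why: treating each pixel as a complex number z = x + iy, that family of motions is z' = a·z + b, which has two complex unknowns and is therefore fixed by two point pairs. The sketch below is illustrative only; the function names and the complex-number formulation are assumptions, not the patent's stated procedure.

```python
import cmath

def similarity_from_two_points(ref, cor):
    """Solve z' = a*z + b from two reference/corresponding pixel pairs.

    ref: two (x, y) reference pixels in the second frame.
    cor: the two corresponding (x, y) pixels in the first frame.
    Returns complex (a, b); abs(a) is the magnification and
    cmath.phase(a) is the rotation angle in radians.
    """
    z0, z1 = complex(*ref[0]), complex(*ref[1])
    w0, w1 = complex(*cor[0]), complex(*cor[1])
    if z0 == z1:
        raise ValueError("reference pixels must be distinct")
    a = (w1 - w0) / (z1 - z0)
    b = w0 - a * z0
    return a, b

def to_affine(a, b):
    """Expand z' = a*z + b into (a, b, c, d, e, f) affine coefficients."""
    # With a = p + iq: x' = p*x - q*y + Re(b), y' = q*x + p*y + Im(b)
    return (a.real, -a.imag, b.real, a.imag, a.real, b.imag)

if __name__ == "__main__":
    a, b = similarity_from_two_points([(0, 0), (16, 0)], [(5, 5), (5, 37)])
    print(abs(a), cmath.phase(a), to_affine(a, b))
```

Shear breaks the constraint that the linear part is a scaled rotation, which is why a third reference pixel is needed for it.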

30. In a method of encoding, in a compressed format, information within a video image frame sequence having first and second video image frames, a method of determining quantized multi-dimensional motion transformations between corresponding image components of the first and second video image frames, comprising:
- determining multi-dimensional affine motion transformations between representations of the corresponding image components in the first and second video image frames;
  
  quantizing the multi-dimensional affine motion transformations between the corresponding image components, including;
  
  a) selecting reference pixel coordinates in the second video image frame;
  
  b) applying a multi-dimensional affine motion transformation to the selected reference coordinates to find corresponding pixel coordinates in the first video image frame; and
  
  c) encoding for transmission or storage the reference pixel coordinates and relative positions of the corresponding pixel coordinates so that transform coefficients can be derived from the reference pixel coordinates and the corresponding pixel coordinates during decoding operations;
  
  and the method further including;
  
  quantizing the multi-dimensional affine motion transformations between the corresponding image components selectively according to the dimensions of the multi-dimensional affine motion transformations;
  
  selecting a number of reference pixels to encode motion for an image component depending on complexity of motion of pixels in the image component, wherein only one reference pixel is encoded for translation, two reference pixels are encoded for rotation and magnification; and
  
  three reference pixels are encoded for shear; and
  
  independently encoding image components using the selected number of reference pixels such that the number of reference pixels encoded per image component vary depending on the complexity of the motion of the pixels within the image component.
- Dependent claims (31)
  - 31. The method of claim 30 in which the multi-dimensional affine motion transformations between the corresponding image components are represented as affine transformation coefficients and quantizing the multi-dimensional affine motion transformations includes encoding the affine-transformation coefficients for transmission or storage of the video image frame sequence by encoding selected pairs of corresponding pixels in the first and second video image frames such that the affine transformation coefficients can be derived from the selected pairs of corresponding pixels during decoding operations performed on the compressed format of the video image frame sequence.
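Claim 30 ties the number of encoded reference pixels to the complexity of the motion: one for translation, two for rotation and magnification, three for shear. A possible way to make that choice from fitted coefficients is sketched below. The claims do not say how complexity is measured, so the classification test and the tolerance `tol` are hypothetical.

```python
def reference_pixel_count(coeffs, tol=1e-3):
    """Pick how many reference pixels to encode for one image component.

    coeffs is (a, b, c, d, e, f) with x' = a*x + b*y + c, y' = d*x + e*y + f.
    tol is a hypothetical tolerance for treating a coefficient as exact.
    """
    a, b, _, d, e, _ = coeffs
    # Identity linear part: the motion is a pure translation.
    if (abs(a - 1.0) <= tol and abs(e - 1.0) <= tol
            and abs(b) <= tol and abs(d) <= tol):
        return 1
    # Scaled rotation (a == e and b == -d): rotation and/or magnification.
    if abs(a - e) <= tol and abs(b + d) <= tol:
        return 2
    # Anything else needs the full six coefficients, e.g. shear.
    return 3

if __name__ == "__main__":
    print(reference_pixel_count((1.0, 0.0, 3.0, 0.0, 1.0, -2.0)))   # translation
    print(reference_pixel_count((0.0, -2.0, 5.0, 2.0, 0.0, 5.0)))   # rotation + zoom
    print(reference_pixel_count((1.0, 0.25, 0.0, 0.0, 1.0, 0.0)))   # shear
```

Because each image component is classified on its own, blocks with simple motion cost fewer encoded coordinates than blocks with shear.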

32. In a method of encoding, in a compressed format, information within a video sequence having first and second video image frames, each frame including an arbitrarily-shaped object therein, an improvement comprising:
- (a) identifying a plurality of multi-pixel image components in the first and second frames which encompass the arbitrarily-shaped object;
  
  (b) performing dense motion estimation processes to generate a plurality of dense motion vectors for each of said plural multi-pixel image components in the second frame, said plurality of dense motion vectors representing motion of individual pixels in said multi-pixel image components between the first and second video image frames;
  
  (c) from said dense motion vectors, determining multi-dimensional motion transformations between the first and second video image frames for each of said plural multi-pixel image components in the second frame;
  
  (d) selecting reference pixel coordinates for the multi-pixel image components in the second frame;
  
  (e) applying a multi-dimensional motion transformation to the selected reference coordinates to find corresponding pixel coordinates in the first frame; and
  
  (f) encoding for transmission or storage the reference pixel coordinates and relative positions of the corresponding pixel coordinates so that transform coefficients can be derived from the reference pixel coordinates and the corresponding pixel coordinates during decoding operations;
  
  (g) representing the motion transformations by specifying locations of at least two pixels in the first image frame, and at least two corresponding pixels in the second image frame, wherein errors associated with truncation of transformation coefficients are avoided;
  
  (h) representing said motion transformations selectively according to the dimensions of the multi-dimensional motion transformations;
  
  (i) selecting a number of reference pixels to encode motion for an image component depending on complexity of motion of pixels in the image component, wherein two reference pixels are encoded for rotation and magnification; and
  
  three reference pixels are encoded for shear; and
  
  (j) independently encoding image components using the selected number of reference pixels such that the number of reference pixels encoded per image component vary depending on the complexity of the motion of the pixels within the image component.

Current Assignee
Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee
Microsoft Corporation
Inventors
Lee, Ming-Chieh; Chen, Wei-ge
Primary Examiner(s)
Au, Amelia
Assistant Examiner(s)
Johnson, Timothy M.

Application Number

US08/658,094
Time in Patent Office

1,232 Days
Field of Search

341/51, 348/402, 348/405, 348/407, 348/416, 348/431, 348/699, 358/430, 382/236, 382/239, 382/251, 382/107, 382/243, 345/436
US Class Current

382/236
CPC Class Codes

G06F 17/153   Multidimensional correlatio...

G06T 2207/10016   Video; Image sequence

G06T 7/223   using block-matching

G06T 9/20   Contour coding, e.g. using ...

G06V 10/7515   Shifting the patterns to ac...

H04N 19/00   Methods or arrangements for...

H04N 19/186   the unit being a colour or ...

H04N 19/20   using video object coding

H04N 19/23   with coding of regions that...

H04N 19/51   Motion estimation or motion...

H04N 19/517   by encoding

H04N 19/537   Motion estimation other tha...

H04N 19/54   using feature points or meshes

H04N 19/543   using regions

H04N 19/563   Motion estimation with padd...

H04N 19/61   in combination with predict...

H04N 19/63   using sub-band based transf...

H04N 19/649   the transform being applied...
