2-D transforms for image and video coding
First Claim
1. A method of encoding media data, comprising:
- for a two dimensional block of the media data, performing a forward transform of the block to convert the block into a transform domain, quantizing the transform-domain block;
dequantizing the transform-domain block; and
performing an inverse transform of the transform-domain block to produce a reconstructed block, the inverse transform being implemented as a sequence of matrix multiplications by a transform matrix composed of integer numbers conforming within a predetermined tolerance to certain constraints, the constraints comprising a scaled integer constraint, a perfect reconstruction constraint, a DCT-like basis constraint, and an integer range limitation constraint, wherein the constraints also comprise a constraint that basis functions of the transform are close in norm, and a constraint that there be sufficient headroom.
2 Assignments
0 Petitions
Abstract
A set of one and two-dimensional transforms is constructed subject to certain range limited constraints to provide a computationally efficient transform implementation, such as for use in image and video coding. The constraints can include that the transform has a scaled integer implementation, provides perfect or near perfect reconstruction, has a DCT-like basis, is limited to coefficients within a range for representation in n bits (e.g., n is 16 bits), has basis functions that are close in norm, and provides sufficient headroom for overflow of the range. A set of transforms is constructed with this procedure having an implementation within a 16-bit integer range for efficient computation using integer matrix multiplication operations.
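The constraints named in the abstract can be checked numerically for a candidate matrix. The following is a minimal sketch, not the procedure of the patent: the matrix `T4`, the function `check_constraints`, and the 9-bit signed input range are all assumptions made for illustration, since the text above lists no matrix entries or input ranges.

```python
import numpy as np

# Illustrative 4-point integer matrix with a DCT-like basis (an assumption
# for this sketch; the text above does not list any matrix entries).
T4 = np.array([
    [17,  17,  17,  17],
    [22,  10, -10, -22],
    [17, -17, -17,  17],
    [10, -22,  22, -10],
], dtype=np.int64)

def check_constraints(T, n_bits=16, input_bits=9):
    """Report simple checks that mirror the constraints listed in the abstract."""
    gram = T @ T.T                        # rows of T are the basis functions
    norms_sq = np.diag(gram)              # squared norm of each basis function
    off_diag = gram - np.diag(norms_sq)   # zero everywhere if rows are orthogonal
    # Worst-case magnitude of one 1-D pass for signed inputs of input_bits bits
    # (input_bits is a hypothetical parameter, not taken from the source).
    max_in = 2 ** (input_bits - 1)
    worst_case = int(np.abs(T).sum(axis=1).max()) * max_in
    limit = 2 ** (n_bits - 1)
    return {
        "orthogonal": bool(np.all(off_diag == 0)),
        "norm_spread": float(norms_sq.max() / norms_sq.min()),
        "fits_range": worst_case < limit,
        "headroom_bits": float(np.log2(limit / worst_case)),
    }

report = check_constraints(T4)
print(report)
```

In this sketch, orthogonal rows stand in for the reconstruction constraint, a `norm_spread` close to 1 stands in for "close in norm", and `headroom_bits` is the margin left below the n-bit limit after one 1-D pass.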
76 Citations
34 Claims
-
1. A method of encoding media data, comprising:
-
for a two dimensional block of the media data, performing a forward transform of the block to convert the block into a transform domain, quantizing the transform-domain block; dequantizing the transform-domain block; and performing an inverse transform of the transform-domain block to produce a reconstructed block, the inverse transform being implemented as a sequence of matrix multiplications by a transform matrix composed of integer numbers conforming within a predetermined tolerance to certain constraints, the constraints comprising a scaled integer constraint, a perfect reconstruction constraint, a DCT-like basis constraint, and an integer range limitation constraint, wherein the constraints also comprise a constraint that basis functions of the transform are close in norm, and a constraint that there be sufficient headroom. - View Dependent Claims (2, 3, 4)
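The sequence of steps in claim 1 (forward transform, quantize, dequantize, integer inverse transform) can be sketched as follows. This is an illustration under stated assumptions: the matrix `T`, the fixed-point scale `SHIFT`, the quantizer step `STEP`, and the floating-point forward normalization are all hypothetical choices, none of which come from the claim text.

```python
import numpy as np

# Illustrative 4-point integer transform matrix (assumed; not given in the claim).
T = np.array([[17,  17,  17,  17],
              [22,  10, -10, -22],
              [17, -17, -17,  17],
              [10, -22,  22, -10]], dtype=np.int64)
NORMS_SQ = np.diag(T @ T.T).astype(np.float64)   # squared norm of each basis row
SHIFT = 10   # hypothetical fixed-point scale of the dequantized coefficients
STEP = 4     # hypothetical uniform quantizer step

def forward(block):
    """Forward 2-D transform, normalized so that T.T @ coeffs @ T inverts it."""
    raw = T @ block @ T.T
    return raw / np.outer(NORMS_SQ, NORMS_SQ)

def quantize(coeffs):
    """Map coefficients to integer levels at a fixed-point scale of 2**SHIFT."""
    return np.rint(coeffs * (1 << SHIFT) / STEP).astype(np.int64)

def dequantize(levels):
    """Recover integer coefficients (still scaled by 2**SHIFT)."""
    return levels * STEP

def inverse(dequantized):
    """Inverse 2-D transform using integer matrix multiplications and one shift."""
    acc = T.T @ dequantized @ T
    return (acc + (1 << (SHIFT - 1))) >> SHIFT   # rounding right shift

rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(4, 4), dtype=np.int64)
recon = inverse(dequantize(quantize(forward(block))))
max_err = int(np.abs(recon - block).max())
print(max_err)
```

With these assumed parameters the reconstruction differs from the input only by quantization and rounding error; a smaller `STEP` tightens the error at the cost of larger levels.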
-
-
5. A media system providing transform coding of a media data, comprising:
-
a forward transform stage operating, for a two dimensional block of the media data, to perform a forward transform of the block to convert the block into a transform domain, a quantization stage operating to quantize the transform-domain block; a dequantization stage operating to dequantize the transform-domain block; and an inverse transform stage for performing an inverse transform of the transform-domain block to produce a reconstructed block, the inverse transform being implemented as a sequence of matrix multiplications by a transform matrix composed of integer numbers conforming within a predetermined tolerance to certain constraints, the constraints comprising a scaled integer constraint, a perfect reconstruction constraint, a DCT-like basis constraint, and an integer range limitation constraint, wherein the constraints also comprise a constraint that basis functions of the transform are close in norm, and a constraint that there be sufficient headroom.
-
-
6. A computer readable storage medium having computer-executable program instructions stored thereon operative upon execution on a computer system to perform a method of encoding media data, comprising:
-
for a two dimensional block of the media data, performing a forward transform of the block to convert the block into a transform domain, quantizing the transform-domain block; dequantizing the transform-domain block; and performing an inverse transform of the transform-domain block to produce a reconstructed block, the inverse transform being implemented as a sequence of matrix multiplications by a transform matrix composed of integer numbers conforming within a predetermined tolerance to certain constraints, the constraints comprising a scaled integer constraint, a perfect reconstruction constraint, a DCT-like basis constraint, and an integer range limitation constraint, wherein the constraints also comprise a constraint that basis functions of the transform are close in norm, and a constraint that there be sufficient headroom.
-
-
7. A method of decoding media data encoded as a block of quantized, transform domain values comprising:
-
dequantizing the transform-domain block; and performing an inverse transform of the transform-domain block to produce a reconstructed block, the inverse transform being implemented as a sequence of matrix multiplications by a transform matrix composed of integer numbers conforming within a predetermined tolerance to certain constraints, the constraints comprising a scaled integer constraint, a perfect reconstruction constraint, a DCT-like basis constraint, and an integer range limitation constraint, wherein the constraints also comprise a close norms constraint and a sufficient headroom constraint. - View Dependent Claims (8, 9, 10)
-
-
11. A media decoder for decoding data encoded as a block of quantized, transform domain values, comprising:
-
a dequantization stage for dequantizing the transform-domain block; and an inverse transform stage for performing an inverse transform of the transform-domain block to produce a reconstructed block, the inverse transform being implemented as a sequence of matrix multiplications by a transform matrix composed of integer numbers conforming within a predetermined tolerance to certain constraints, the constraints comprising a scaled integer constraint, a perfect reconstruction constraint, a DCT-like basis constraint, and an integer range limitation constraint, wherein the constraints also comprise a close norms constraint and a sufficient headroom constraint.
-
-
12. A computer readable storage medium having computer-executable program instructions stored thereon operative upon execution on a computer system to perform a method of decoding media data encoded as a block of quantized, transform domain values, comprising:
-
dequantizing the transform-domain block; and performing an inverse transform of the transform-domain block to produce a reconstructed block, the inverse transform being implemented as a sequence of matrix multiplications by a transform matrix composed of integer numbers conforming within a predetermined tolerance to certain constraints, the constraints comprising a scaled integer constraint, a perfect reconstruction constraint, a DCT-like basis constraint, and an integer range limitation constraint, wherein the constraints also comprise a close norms constraint and a sufficient headroom constraint.
-
-
13. A method of converting a two-dimensional block of image data between spatial and transform domain representations, where at least one dimension of the block is 8 points, comprising:
-
performing at least one matrix multiplication of the image data block with a transform matrix composed of integer transform coefficients in the form, scaling the resulting matrix product to remain within a bit-range limit; wherein the image data block is an 8×8 block and the performing at least one matrix multiplication comprises performing row-wise and column-wise matrix multiplications of the image data block with the transform matrix, and wherein the scaling comprises an entry-wise shift operation after each of the row-wise and column-wise matrix multiplications to effect division by a power of two. - View Dependent Claims (14, 15)
-
-
16. A computer readable storage medium having computer executable program instructions stored thereon for execution on a computer to perform a method of converting a two-dimensional block of image data between spatial and transform domain representations, where at least one dimension of the block is 8 points, comprising:
-
performing at least one matrix multiplication of the image data block with a transform matrix composed of integer transform coefficients in the form, scaling the resulting matrix product to remain within a bit-range limit; wherein the image data block is an 8×8 block and the performing at least one matrix multiplication comprises performing row-wise and column-wise matrix multiplications of the image data block with the transform matrix, and wherein the scaling comprises an entry-wise shift operation after each of the row-wise and column-wise matrix multiplications to effect division by a power of two.
-
-
17. A method of converting a two-dimensional block of image data between spatial and transform domain representations, where at least one dimension of the block is 4 points, comprising:
-
performing at least one matrix multiplication of the image data block with a transform matrix composed of integer transform coefficients in the form, scaling the resulting matrix product to remain within a bit-range limit; wherein the image data block is a 4×4 block and the performing at least one matrix multiplication comprises performing row-wise and column-wise matrix multiplications of the image data block with the transform matrix, and wherein the scaling comprises an entry-wise shift operation after each of the row-wise and column-wise matrix multiplications to effect division by a power of two. - View Dependent Claims (18, 19)
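The row-wise and column-wise passes with an entry-wise shift after each can be sketched for the 4×4 case as follows. The claim's transform matrix is not reproduced in this text, so `T4` below is an illustrative orthogonal integer matrix, and the shift of 6 bits per pass and the 9-bit signed input range are hypothetical values chosen so that every intermediate in this sketch stays within a signed 16-bit range.

```python
import numpy as np

# Illustrative 4-point integer matrix (assumed; the claim's matrix is elided).
T4 = np.array([[17,  17,  17,  17],
               [22,  10, -10, -22],
               [17, -17, -17,  17],
               [10, -22,  22, -10]], dtype=np.int64)

def shift_round(values, shift):
    """Entry-wise division by 2**shift with rounding, done as a right shift."""
    return (values + (1 << (shift - 1))) >> shift

def forward_4x4(block, shift=6):
    """Row-wise pass, shift, column-wise pass, shift (shift value is hypothetical)."""
    rows = shift_round(block @ T4.T, shift)    # transforms each row of the block
    coeffs = shift_round(T4 @ rows, shift)     # transforms each column of the result
    return rows, coeffs

rng = np.random.default_rng(1)
block = rng.integers(-256, 256, size=(4, 4), dtype=np.int64)  # assumed 9-bit input
rows, coeffs = forward_4x4(block)
print(coeffs)
```

Shifting after each pass, rather than once at the end, is what keeps the second multiplication's operands small; the price is one extra rounding step, which in this sketch moves each output by about one unit at most relative to the unrounded product.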
-
-
20. A computer readable storage medium having computer-executable program instructions stored thereon for execution on computer system to perform a method of converting a two-dimensional block of image data between spatial and transform domain representations, where at least one dimension of the block is 4 points, comprising:
-
performing at least one matrix multiplication of the image data block with a transform matrix composed of integer transform coefficients in the form, scaling the resulting matrix product to remain within a bit-range limit; wherein the image data block is a 4×4 block and the performing at least one matrix multiplication comprises performing row-wise and column-wise matrix multiplications of the image data block with the transform matrix, and wherein the scaling comprises an entry-wise shift operation after each of the row-wise and column-wise matrix multiplications to effect division by a power of two.
-
-
21. A computer readable storage medium having computer-executable program instructions stored thereon for execution on computer system to perform a method of converting a two-dimensional block of image data between spatial and transform domain representations, where dimensions of the block are 4 and 8 points, the method comprising:
-
performing row-wise and column-wise matrix multiplications of the image data block with transform matrices composed of integer transform coefficients in the form, scaling the resulting matrix product to remain within a bit-range limit; wherein the data block has dimensions of 4×8 points, and the act of performing the matrix multiplications is performed according to the relation, $Y = (T_8 \cdot X \cdot T_4')$, where X represents the data block and Y is the resulting matrix product.
-
-
22. A method of converting a two-dimensional block of image data between spatial and transform domain representations, where dimensions of the block are 4 and 8 points, the method comprising:
-
performing row-wise and column-wise matrix multiplications of the image data block with transform matrices composed of integer transform coefficients in the form, scaling the resulting matrix product to remain within a bit-range limit; wherein the data block has dimensions of 4×8 points, and the act of performing the matrix multiplications is performed according to the relation, $Y = (T_8 \cdot X \cdot T_4')$, where X represents the data block and Y is the resulting matrix product.
-
-
23. A method of converting a two-dimensional block of image data between spatial and transform domain representations, where dimensions of the block are 4 and 8 points, the method comprising:
-
performing row-wise and column-wise matrix multiplications of the image data block with transform matrices composed of integer transform coefficients in the form, scaling the resulting matrix product to remain within a bit-range limit; wherein the data block has dimensions of 8×4 points, and the act of performing the matrix multiplications is performed according to the relation, $Y = (T_4 \cdot X \cdot T_8')$, where X represents the data block and Y is the resulting matrix product.
-
-
24. A computer readable storage medium having computer-executable program instructions stored thereon for execution on computer system to perform a method of converting a two-dimensional block of image data between spatial and transform domain representations, where dimensions of the block are 4 and 8 points, the method comprising:
-
performing row-wise and column-wise matrix multiplications of the image data block with transform matrices composed of integer transform coefficients in the form, scaling the resulting matrix product to remain within a bit-range limit; wherein the data block has dimensions of 8×4 points, and the act of performing the matrix multiplications is performed according to the relation, $Y = (T_4 \cdot X \cdot T_8')$, where X represents the data block and Y is the resulting matrix product.
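The two rectangular relations in the claims above, one using the 8-point matrix on the left and the 4-point matrix on the right and the other the reverse, can be sketched as follows. For the products to be defined, X must have 8 rows and 4 columns in the first case and 4 rows and 8 columns in the second, so this sketch reads "4×8" as width × height; that reading is an inference, not something the text states. The matrices `T4` and `T8` are illustrative orthogonal integer matrices (the claims' matrices are elided), and the single final shift of 12 bits is a hypothetical simplification of the scaling step.

```python
import numpy as np

# Illustrative integer matrices (assumed; the claims' matrices are not shown).
T4 = np.array([[17,  17,  17,  17],
               [22,  10, -10, -22],
               [17, -17, -17,  17],
               [10, -22,  22, -10]], dtype=np.int64)
T8 = np.array([[12,  12,  12,  12,  12,  12,  12,  12],
               [16,  15,   9,   4,  -4,  -9, -15, -16],
               [16,   6,  -6, -16, -16,  -6,   6,  16],
               [15,  -4, -16,  -9,   9,  16,   4, -15],
               [12, -12, -12,  12,  12, -12, -12,  12],
               [ 9, -16,   4,  15, -15,  -4,  16,  -9],
               [ 6, -16,  16,  -6,  -6,  16, -16,   6],
               [ 4,  -9,  15, -16,  16, -15,   9,  -4]], dtype=np.int64)

def shift_round(values, shift):
    """Entry-wise division by 2**shift with rounding, done as a right shift."""
    return (values + (1 << (shift - 1))) >> shift

def transform_4x8(X, shift=12):
    """Y = T8 . X . T4' for a block 4 wide and 8 tall (array shape (8, 4))."""
    assert X.shape == (8, 4)
    return shift_round(T8 @ X @ T4.T, shift)

def transform_8x4(X, shift=12):
    """Y = T4 . X . T8' for a block 8 wide and 4 tall (array shape (4, 8))."""
    assert X.shape == (4, 8)
    return shift_round(T4 @ X @ T8.T, shift)

rng = np.random.default_rng(2)
X = rng.integers(-256, 256, size=(8, 4), dtype=np.int64)
Y = transform_4x8(X)
print(Y.shape)
```

Transposing the input swaps the two cases: `transform_8x4(X.T)` equals `transform_4x8(X).T`, because the two relations are transposes of each other.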
-
-
25. A method of transform coding a data block representing media content, comprising:
-
applying a transform to the data block to produce a transform domain data block representing the media content, the transform comprising a set of transform basis functions having mismatching, yet approximately equal norms; and scaling values in the transform domain data block according to scaling factors for the respective transform basis functions to compensate for the mismatching norms; wherein applying the transform comprises a matrix multiplication using a matrix, wherein scaling comprises performing a component-wise product using a matrix, - View Dependent Claims (26)
-
-
27. A method of transform coding a data block representing media content, comprising:
-
applying a transform to the data block to produce a transform domain data block representing the media content, the transform comprising a set of transform basis functions having mismatching, yet approximately equal norms; and scaling values in the transform domain data block according to scaling factors for the respective transform basis functions to compensate for the mismatching norms; wherein applying the transform comprises a matrix multiplication using a matrix, wherein scaling comprises calculating a component-wise product using a matrix, - View Dependent Claims (28)
-
-
29. A method of transform coding a data block representing media content, comprising:
-
applying a transform to the data block to produce a transform domain data block representing the media content, the transform comprising a set of transform basis functions having mismatching, yet approximately equal norms; and scaling values in the transform domain data block according to scaling factors for the respective transform basis functions to compensate for the mismatching norms; wherein applying the transform comprises matrix multiplications using matrices, wherein scaling comprises calculating a component-wise product using a matrix,
-
-
30. A computer-readable storage medium having computer-executable program instructions stored thereon for executing on a computer system to perform a method of transform coding a data block representing media content, comprising:
-
applying a transform to the data block to produce a transform domain data block representing the media content, the transform comprising a set of transform basis functions having mismatching, yet approximately equal norms; and scaling values in the transform domain data block according to scaling factors for the respective transform basis functions to compensate for the mismatching norms; wherein applying the transform comprises a matrix multiplication using a matrix, wherein scaling comprises performing a component-wise product using a matrix,
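Compensating for mismatched basis-function norms with a component-wise product, as in the group of claims above, can be sketched as follows. The matrices named in these claims are not reproduced in this text, so the transform `T` and the derived scaling matrix `SCALE` below are assumptions for illustration; floating point is used for clarity, whereas an integer implementation would presumably fold these factors into fixed-point constants.

```python
import numpy as np

# Illustrative integer transform whose rows have slightly different norms
# (squared norms 1156 and 1168); assumed, not taken from the claims.
T = np.array([[17,  17,  17,  17],
              [22,  10, -10, -22],
              [17, -17, -17,  17],
              [10, -22,  22, -10]], dtype=np.int64)

norms = np.sqrt(np.diag(T @ T.T).astype(np.float64))   # norm of each basis row
SCALE = 1.0 / np.outer(norms, norms)                    # one factor per coefficient

def transform_and_scale(block):
    """Matrix multiplications followed by a component-wise scaling product."""
    raw = T @ block @ T.T      # unnormalized transform-domain block
    return raw * SCALE         # entry (i, j) is divided by norms[i] * norms[j]

rng = np.random.default_rng(3)
block = rng.integers(0, 256, size=(4, 4), dtype=np.int64)
scaled = transform_and_scale(block)
print(np.round(scaled, 2))
```

After scaling, the coefficients match those of the orthonormal version of the same basis, so signal energy is preserved across the transform even though the integer rows themselves have unequal norms.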
-
-
31. A method of transform coding a data block, D, representing media content, comprising:
-
calculating a transform of the data block for converting between spatial and transform domain representations of the media block, wherein a result R of the transform is related to the data block D as where T is a matrix of transform basis functions, the calculating comprising: performing matrix multiplications of the data block with each of first and second transform sub-component matrices, wherein the transform sub-component matrices ($T_a$ and $T_b$) are related to the transform basis function matrix as $T = 2^x \cdot T_a + T_b$; shifting a product of the data block and second sub-component matrix by x bit positions; summing a product of the data block and first sub-component matrix with the shifted product of the data block and second sub-component matrix; and shifting a sum of the products by y bit positions to produce the result R; whereby the headroom of the transform is extended. - View Dependent Claims (32, 33)
wherein the transform sub-component matrices are
-
-
33. The method of claim 31 wherein the transform basis function matrix is
-
and wherein the transform sub-component matrices are
-
-
34. A computer-readable storage medium having computer-executable program instructions stored thereon for execution on a computer system to perform a method of transform coding a data block, D, representing media content, comprising:
-
calculating a transform of the data block for converting between spatial and transform domain representations of the media block, wherein a result R of the transform is related to the data block D as where T is a matrix of transform basis functions, the calculating comprising: performing matrix multiplications of the data block with each of first and second transform sub-component matrices, wherein the transform sub-component matrices ($T_a$ and $T_b$) are related to the transform basis function matrix as $T = 2^x \cdot T_a + T_b$; shifting a product of the data block and second sub-component matrix by x bit positions; summing a product of the data block and first sub-component matrix with the shifted product of the data block and second sub-component matrix; and shifting a sum of the products by y bit positions to produce the result R; whereby the headroom of the transform is extended.
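The sub-component decomposition described in the claim above can be sketched as follows. The equation relating R to D is missing from this text, so the sketch follows the listed steps literally and compares the outcome against a direct computation of the full product shifted by x + y bits. The matrix `T`, the split into `Ta` and `Tb` by a floor shift, and the values x = 1 and y = 6 are all assumptions made for illustration.

```python
import numpy as np

# Illustrative integer transform matrix (assumed; the claim's matrices are elided).
T = np.array([[17,  17,  17,  17],
              [22,  10, -10, -22],
              [17, -17, -17,  17],
              [10, -22,  22, -10]], dtype=np.int64)

X_BITS = 1                      # hypothetical x
Ta = T >> X_BITS                # first sub-component (floor division by 2**x)
Tb = T - (Ta << X_BITS)         # second sub-component, so that T = 2**x * Ta + Tb

def transform_split(D, x=X_BITS, y=6):
    """Follow the claimed steps: two products, shift one, sum, shift the sum."""
    part_a = Ta @ D                 # product with the first sub-component
    part_b = (Tb @ D) >> x          # product with the second, shifted by x bits
    return (part_a + part_b) >> y   # sum of the products, shifted by y bits

def transform_direct(D, x=X_BITS, y=6):
    """Reference: full-precision product with T, then a single shift."""
    return (T @ D) >> (x + y)

rng = np.random.default_rng(4)
D = rng.integers(-256, 256, size=(4, 4), dtype=np.int64)
R = transform_split(D)

# Worst-case growth per input unit: largest row sum of absolute entries.
growth_full = int(np.abs(T).sum(axis=1).max())
growth_split = int(np.abs(Ta).sum(axis=1).max())
print(growth_full, growth_split)
```

Because the product with `Ta` is an integer, adding the floored product with `Tb` reproduces the floor of the full product divided by 2**x, so in this sketch the split computation matches the direct one while its largest intermediate grows about half as fast, which is one way to read "the headroom of the transform is extended".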
-
Specification