2-D transforms for image and video coding
Abstract
A set of one- and two-dimensional transforms is constructed subject to certain range-limited constraints to provide a computationally efficient transform implementation, such as for use in image and video coding. The constraints can include that the transform has a scaled integer implementation, provides perfect or near-perfect reconstruction, has a DCT-like basis, is limited to coefficients within a range for representation in n bits (e.g., n = 16), has basis functions that are close in norm, and provides sufficient headroom against overflow of the range. A set of transforms constructed with this procedure has an implementation within a 16-bit integer range for efficient computation using integer matrix multiplication operations.
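The constraints listed in the abstract can be checked mechanically for any candidate matrix. The following is a minimal sketch, not the construction procedure of the patent: the matrix `T4`, the function `check_constraints`, the 9-bit input assumption, and the 2% norm tolerance are all hypothetical choices made for illustration. It tests orthogonality (needed for reconstruction), closeness of the basis-function norms, the n-bit coefficient range, and worst-case headroom. The DCT-like basis constraint is not covered.

```python
# Hypothetical 4-point integer transform matrix, chosen only for illustration.
T4 = [
    [17,  17,  17,  17],
    [22,  10, -10, -22],
    [17, -17, -17,  17],
    [10, -22,  22, -10],
]


def dot(u, v):
    """Integer dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def check_constraints(T, n_bits=16, input_bits=9, norm_tolerance=0.02):
    """Run simple checks on an integer matrix T whose rows are basis functions.

    n_bits, input_bits and norm_tolerance are assumed values, not from the source.
    """
    size = len(T)
    norms = [dot(row, row) for row in T]          # squared norms of the rows
    limit = 2 ** (n_bits - 1)                     # signed n-bit magnitude limit
    worst_gain = max(sum(abs(c) for c in row) for row in T)
    worst_output = worst_gain * 2 ** (input_bits - 1)
    return {
        "norms": norms,
        # Mutually orthogonal rows make the transform exactly invertible.
        "orthogonal": all(
            dot(T[i], T[j]) == 0
            for i in range(size) for j in range(i + 1, size)
        ),
        # Basis functions should be close in norm.
        "norms_close": (max(norms) - min(norms)) / min(norms) <= norm_tolerance,
        # Every coefficient must be representable in n_bits.
        "in_range": all(-limit <= c < limit for row in T for c in row),
        # One transform stage on worst-case input must not overflow n_bits.
        "headroom": worst_output < limit,
    }


report = check_constraints(T4)
```

For this example matrix the squared norms are 1156 and 1168, which differ by about 1%, so the rows are "close in norm" without being identical.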
50 Claims
-
1. A method of encoding media data, comprising:
-
for a two-dimensional block of the media data, performing a forward transform of the block to convert the block into a transform domain;
quantizing the transform-domain block;
dequantizing the transform-domain block; and
performing an inverse transform of the transform-domain block to produce a reconstructed block, the inverse transform being implemented as a sequence of matrix multiplications by a transform matrix composed of integer numbers conforming within a predetermined tolerance to certain constraints, the constraints comprising a scaled integer constraint, a perfect reconstruction constraint, a DCT-like basis constraint, and an integer range limitation constraint. - View Dependent Claims (2, 3, 4, 5, 6)
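The four steps of claim 1 (forward transform, quantize, dequantize, inverse transform) can be sketched end to end. This is an illustration under stated assumptions, not the claimed implementation: the matrix `T` is a hypothetical example, the quantizer is a plain uniform quantizer with step `qstep`, and exact rational arithmetic (`fractions.Fraction`) stands in for the scaled integer arithmetic a real codec would use. Because the rows of `T` are orthogonal, dividing coefficient (i, j) by the product of the squared norms of rows i and j makes the inverse exact.

```python
from fractions import Fraction

# Hypothetical 4-point integer transform matrix, chosen only for illustration.
T = [
    [17,  17,  17,  17],
    [22,  10, -10, -22],
    [17, -17, -17,  17],
    [10, -22,  22, -10],
]
NORMS = [sum(c * c for c in row) for row in T]  # squared row norms


def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def forward(block):
    """Forward 2-D transform: T * block * T^T (integer result)."""
    return matmul(matmul(T, block), transpose(T))


def quantize(coeffs, qstep):
    """Uniform quantizer (assumed); round() on a Fraction returns an int."""
    return [[round(Fraction(c, qstep)) for c in row] for row in coeffs]


def dequantize(levels, qstep):
    return [[level * qstep for level in row] for row in levels]


def inverse(coeffs):
    """Inverse 2-D transform: scale by 1/(norm_i * norm_j), then T^T * C * T."""
    scaled = [
        [Fraction(c, NORMS[i] * NORMS[j]) for j, c in enumerate(row)]
        for i, row in enumerate(coeffs)
    ]
    rec = matmul(matmul(transpose(T), scaled), T)
    return [[round(v) for v in row] for row in rec]


def roundtrip(block, qstep):
    return inverse(dequantize(quantize(forward(block), qstep), qstep))
```

With `qstep=1` the round trip returns the input block unchanged; larger steps trade reconstruction accuracy for fewer distinct levels.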
-
-
7. A media system providing transform coding of media data, comprising:
-
a forward transform stage operating, for a two-dimensional block of the media data, to perform a forward transform of the block to convert the block into a transform domain;
a quantization stage operating to quantize the transform-domain block;
a dequantization stage operating to dequantize the transform-domain block; and
an inverse transform stage for performing an inverse transform of the transform-domain block to produce a reconstructed block, the inverse transform being implemented as a sequence of matrix multiplications by a transform matrix composed of integer numbers conforming within a predetermined tolerance to certain constraints, the constraints comprising a scaled integer constraint, a perfect reconstruction constraint, a DCT-like basis constraint, and an integer range limitation constraint.
-
-
8. A computer readable storage medium having computer-executable program instructions stored thereon operative upon execution on a computer system to perform a method of encoding media data, comprising:
-
for a two-dimensional block of the media data, performing a forward transform of the block to convert the block into a transform domain;
quantizing the transform-domain block;
dequantizing the transform-domain block; and
performing an inverse transform of the transform-domain block to produce a reconstructed block, the inverse transform being implemented as a sequence of matrix multiplications by a transform matrix composed of integer numbers conforming within a predetermined tolerance to certain constraints, the constraints comprising a scaled integer constraint, a perfect reconstruction constraint, a DCT-like basis constraint, and an integer range limitation constraint.
-
-
9. A method of decoding media data encoded as a block of quantized, transform domain values, comprising:
-
dequantizing the transform-domain block; and
performing an inverse transform of the transform-domain block to produce a reconstructed block, the inverse transform being implemented as a sequence of matrix multiplications by a transform matrix composed of integer numbers conforming within a predetermined tolerance to certain constraints, the constraints comprising a scaled integer constraint, a perfect reconstruction constraint, a DCT-like basis constraint, and an integer range limitation constraint. - View Dependent Claims (10, 11, 12, 13)
-
-
14. A media decoder for decoding data encoded as a block of quantized, transform domain values, comprising:
-
a dequantization stage for dequantizing the transform-domain block; and
an inverse transform stage for performing an inverse transform of the transform-domain block to produce a reconstructed block, the inverse transform being implemented as a sequence of matrix multiplications by a transform matrix composed of integer numbers conforming within a predetermined tolerance to certain constraints, the constraints comprising a scaled integer constraint, a perfect reconstruction constraint, a DCT-like basis constraint, and an integer range limitation constraint.
-
-
15. A computer readable storage medium having computer-executable program instructions stored thereon operative upon execution on a computer system to perform a method of decoding media data encoded as a block of quantized, transform domain values, comprising:
-
dequantizing the transform-domain block; and
performing an inverse transform of the transform-domain block to produce a reconstructed block, the inverse transform being implemented as a sequence of matrix multiplications by a transform matrix composed of integer numbers conforming within a predetermined tolerance to certain constraints, the constraints comprising a scaled integer constraint, a perfect reconstruction constraint, a DCT-like basis constraint, and an integer range limitation constraint.
-
-
16. A method of converting a two-dimensional block of image data between spatial and transform domain representations, where at least one dimension of the block is 8 points, comprising:
-
performing at least one matrix multiplication of the image data block with a transform matrix composed of integer transform coefficients in the form [matrix not reproduced]; and
scaling the resulting matrix product to remain within a bit-range limit. - View Dependent Claims (17, 18, 19, 20)
-
-
21. A computer readable storage medium having computer executable program instructions stored thereon for execution on a computer to perform a method of converting a two-dimensional block of image data between spatial and transform domain representations, where at least one dimension of the block is 8 points, comprising:
-
performing at least one matrix multiplication of the image data block with a transform matrix composed of integer transform coefficients in the form [matrix not reproduced]; and
scaling the resulting matrix product to remain within a bit-range limit.
-
-
22. A method of converting a two-dimensional block of image data between spatial and transform domain representations, where at least one dimension of the block is 4 points, comprising:
-
performing at least one matrix multiplication of the image data block with a transform matrix composed of integer transform coefficients in the form [matrix not reproduced]; and
scaling the resulting matrix product to remain within a bit-range limit. - View Dependent Claims (23, 24, 25, 26)
-
-
27. A computer readable storage medium having computer-executable program instructions stored thereon for execution on a computer system to perform a method of converting a two-dimensional block of image data between spatial and transform domain representations, where at least one dimension of the block is 4 points, comprising:
-
performing at least one matrix multiplication of the image data block with a transform matrix composed of integer transform coefficients in the form [matrix not reproduced]; and
scaling the resulting matrix product to remain within a bit-range limit.
-
-
28. A method of converting a two-dimensional block of image data between spatial and transform domain representations, where dimensions of the block are 4 and 8 points, the method comprising:
-
performing row-wise and column-wise matrix multiplications of the image data block with transform matrices composed of integer transform coefficients in the form [matrices not reproduced]; and
scaling the resulting matrix product to remain within a bit-range limit. - View Dependent Claims (29, 30)
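The row-wise and column-wise multiplications with intermediate scaling can be sketched for a block of 4 rows by 8 columns. Everything specific here is an assumption made for illustration: the matrices `T4` and `T8` are hypothetical examples (the text above does not reproduce its matrices), the input is assumed to lie in the range -255..255, and the shift amounts were picked only so that every intermediate product stays within a signed 16-bit range.

```python
# Hypothetical integer transform matrices, chosen only for illustration.
T4 = [
    [17,  17,  17,  17],
    [22,  10, -10, -22],
    [17, -17, -17,  17],
    [10, -22,  22, -10],
]
T8 = [
    [12,  12,  12,  12,  12,  12,  12,  12],
    [16,  15,   9,   4,  -4,  -9, -15, -16],
    [16,   6,  -6, -16, -16,  -6,   6,  16],
    [15,  -4, -16,  -9,   9,  16,   4, -15],
    [12, -12, -12,  12,  12, -12, -12,  12],
    [ 9, -16,   4,  15, -15,  -4,  16,  -9],
    [ 6, -16,  16,  -6,  -6,  16, -16,   6],
    [ 4,  -9,  15, -16,  16, -15,   9,  -4],
]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def shift_round(v, s):
    """Arithmetic right shift by s bits with rounding to nearest."""
    return (v + (1 << (s - 1))) >> s


def forward_4x8(block, row_shift=6, col_shift=4):
    """Transform a block of 4 rows x 8 columns; shift values are assumptions.

    Returns the scaled coefficients and the largest intermediate magnitude.
    """
    # Row-wise stage: each 8-sample row is multiplied by the 8-point matrix.
    stage1 = matmul(block, transpose(T8))
    peak = max(abs(v) for row in stage1 for v in row)
    stage1 = [[shift_round(v, row_shift) for v in row] for row in stage1]
    # Column-wise stage: each 4-sample column is multiplied by the 4-point matrix.
    stage2 = matmul(T4, stage1)
    peak = max(peak, max(abs(v) for row in stage2 for v in row))
    coeffs = [[shift_round(v, col_shift) for v in row] for row in stage2]
    return coeffs, peak
```

For a flat block of 255s, the row stage peaks at 96 x 255 = 24480 and the column stage at 68 x 383 = 26044, both below 32768. Without the intermediate shift the second product would exceed the 16-bit range.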
-
-
31. A computer readable storage medium having computer-executable program instructions stored thereon for execution on a computer system to perform a method of converting a two-dimensional block of image data between spatial and transform domain representations, where dimensions of the block are 4 and 8 points, the method comprising:
-
performing row-wise and column-wise matrix multiplications of the image data block with transform matrices composed of integer transform coefficients in the form [matrices not reproduced]; and
scaling the resulting matrix product to remain within a bit-range limit.
-
-
32. A transform coder for 2-dimensional media data utilizing a transform implemented as a matrix multiplication of a data block of integer values with a transform matrix composed of integer coefficients that approximate a second transform having non-integer coefficient basis functions, the data block having dimensions n and m, the transform matrix being constructed according to a construction process comprising:
-
determining a constant multiplier of n- and m-point DC basis functions subject within a tolerance to constraints that norms of the n- and m-point DC basis functions match, and basis functions produce transform domain data within a limited range of integer values;
determining a set of constant multipliers of odd basis functions of a larger of n- or m-points subject within the tolerance to constraints that such odd basis functions are orthogonal, correlate well to corresponding basis function constant multipliers of the second transform, and match in norms with the DC basis functions;
determining a set of constant multipliers of even basis functions of the larger of n- or m-points subject within the tolerance to constraints that such even basis functions are orthogonal, correlate well to corresponding basis function constant multipliers of the second transform, and match in norms with the DC and odd basis functions; and
determining a set of multipliers of basis functions of a smaller of n- or m-points also subject within the tolerance to constraints that such basis functions are orthogonal, correlate well to corresponding basis function constant multipliers of the second transform, and match in norms with the larger of n- and m-point basis functions. - View Dependent Claims (33, 34, 35, 36)
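The first step of the construction process, choosing DC multipliers whose norms match within a tolerance, can be illustrated with a small exhaustive search. This is a sketch under assumptions, not the procedure of the patent: the function name `dc_multiplier_candidates`, the cap `max_multiplier` (a stand-in for the range constraint), and the 1% tolerance on squared norms are all hypothetical. An n-point DC basis function with constant multiplier d has squared norm n·d², so for n = 4 and m = 8 the search looks for integers with 4·d4² ≈ 8·d8².

```python
def dc_multiplier_candidates(n=4, m=8, max_multiplier=20, tolerance=0.01):
    """List (d_n, d_m, mismatch) with n*d_n^2 close to m*d_m^2.

    max_multiplier stands in for the range constraint; both it and
    tolerance are assumed values for illustration.
    """
    candidates = []
    for d_n in range(1, max_multiplier + 1):
        for d_m in range(1, max_multiplier + 1):
            norm_n = n * d_n * d_n   # squared norm of the n-point DC basis
            norm_m = m * d_m * d_m   # squared norm of the m-point DC basis
            mismatch = abs(norm_n - norm_m) / min(norm_n, norm_m)
            if mismatch <= tolerance:
                candidates.append((d_n, d_m, mismatch))
    # Best match first.
    return sorted(candidates, key=lambda c: c[2])


best = dc_multiplier_candidates()[0]
```

With these settings the only pair within tolerance is 17 for the 4-point basis and 12 for the 8-point basis, giving squared norms of 1156 and 1152. The later steps of the process (odd, even, and smaller-size basis functions) would add orthogonality and DCT-correlation tests to the same kind of search.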
-
-
37. A transform decoder for 2-dimensional media data utilizing a transform implemented as a matrix multiplication of a data block of integer values with a transform matrix composed of integer coefficients, the data block having dimensions n and m, the transform matrix being constructed according to a construction process comprising:
-
determining a constant multiplier of n- and m-point DC basis functions subject within a tolerance to constraints that norms of the n- and m-point DC basis functions match, and basis functions produce transform domain data within a limited range of integer values;
determining a set of constant multipliers of odd basis functions of a larger of n- or m-points subject within the tolerance to constraints that such odd basis functions are orthogonal, correlate well to corresponding DCT basis function constant multipliers, and match in norms with the DC basis functions;
determining a set of constant multipliers of even basis functions of the larger of n- or m-points subject within the tolerance to constraints that such even basis functions are orthogonal, correlate well to corresponding DCT basis function constant multipliers, and match in norms with the DC and odd basis functions; and
determining a set of multipliers of basis functions of a smaller of n- or m-points also subject within the tolerance to constraints that such basis functions are orthogonal, correlate well to corresponding DCT basis function constant multipliers, and match in norms with the larger of n- and m-point basis functions. - View Dependent Claims (38, 39)
-
-
40. A method of transform coding a data block representing media content, comprising:
-
applying a transform to the data block to produce a transform domain data block representing the media content, the transform comprising a set of transform basis functions having mismatching, yet approximately equal norms;
scaling values in the transform domain data block according to scaling factors for the respective transform basis functions to compensate for the mismatching norms. - View Dependent Claims (41, 42, 43, 44, 45)
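Compensating for mismatched norms with per-basis-function scaling factors can be sketched in one dimension. The specifics are assumptions for illustration: the matrix `T4` is a hypothetical example, and the fixed-point parameters `DECODER_SHIFT` and `PRECISION` are arbitrary choices, not values from the source. The encoder multiplies each coefficient by an integer factor approximating 2^DECODER_SHIFT divided by the squared norm of its basis function, so the decoder needs only a matrix multiplication and one uniform shift.

```python
# Hypothetical 4-point integer transform matrix, chosen only for illustration.
T4 = [
    [17,  17,  17,  17],
    [22,  10, -10, -22],
    [17, -17, -17,  17],
    [10, -22,  22, -10],
]
NORMS = [sum(c * c for c in row) for row in T4]   # squared norms: 1156 / 1168

DECODER_SHIFT = 10   # assumed uniform shift applied by the decoder
PRECISION = 16       # assumed fixed-point precision of the scaling factors

# One integer scaling factor per basis function, compensating for its norm.
SCALE = [round(2 ** (DECODER_SHIFT + PRECISION) / n) for n in NORMS]


def encode(x):
    """Forward transform followed by per-coefficient norm compensation."""
    y = [sum(t * v for t, v in zip(row, x)) for row in T4]
    half = 1 << (PRECISION - 1)
    return [(c * s + half) >> PRECISION for c, s in zip(y, SCALE)]


def decode(coeffs):
    """Inverse transform: multiply by T4 transposed, then one uniform shift."""
    half = 1 << (DECODER_SHIFT - 1)
    return [
        (sum(T4[i][k] * coeffs[i] for i in range(4)) + half) >> DECODER_SHIFT
        for k in range(4)
    ]
```

Two distinct factors result because the basis functions have two slightly different norms. For inputs in the range -255..255 the accumulated rounding error stays well below half a unit, so `decode(encode(x))` returns `x`.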
-
-
46. A computer-readable storage medium having computer-executable program instructions stored thereon for executing on a computer system to perform a method of transform coding a data block representing media content, comprising:
-
applying a transform to the data block to produce a transform domain data block representing the media content, the transform comprising a set of transform basis functions having mismatching, yet approximately equal norms;
scaling values in the transform domain data block according to scaling factors for the respective transform basis functions to compensate for the mismatching norms.
-
-
47. A method of transform coding a data block, D, representing media content, comprising:
-
calculating a transform of the data block for converting between spatial and transform domain representations of the media block, wherein a result R of the transform is related to the data block D as [equation not reproduced], where T is a matrix of transform basis functions, the calculating comprising:
performing matrix multiplications of the data block with each of first and second transform sub-component matrices, wherein the transform sub-component matrices (Ta and Tb) are related to the transform basis function matrix as T = 2^x·Ta + Tb;
shifting a product of the data block and second sub-component matrix by x bit positions;
summing a product of the data block and first sub-component matrix with the shifted product of the data block and second sub-component matrix; and
shifting a sum of the products by y bit positions to produce the result R;
whereby the headroom of the transform is extended. - View Dependent Claims (48, 49)
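The headroom gain from splitting the transform matrix can be sketched for a single data column. The assumptions are stated plainly: the matrix `T4` is a hypothetical example, both shifts are taken to be arithmetic right shifts, and the split uses `Ta = T >> x` with `Tb` as the remainder, which is one of several decompositions satisfying T = 2^x·Ta + Tb. With floor shifts, (Ta·D + (Tb·D >> x)) >> y equals (T·D) >> (x + y) exactly, while the intermediate values are roughly 2^x times smaller.

```python
# Hypothetical 4-point integer transform matrix, chosen only for illustration.
T4 = [
    [17,  17,  17,  17],
    [22,  10, -10, -22],
    [17, -17, -17,  17],
    [10, -22,  22, -10],
]


def split_matrix(T, x):
    """Split T into Ta and Tb with T = 2^x * Ta + Tb (entries of Tb in 0..2^x - 1)."""
    Ta = [[c >> x for c in row] for row in T]
    Tb = [[c - (a << x) for c, a in zip(row, a_row)] for row, a_row in zip(T, Ta)]
    return Ta, Tb


def matvec(M, d):
    return [sum(m * v for m, v in zip(row, d)) for row in M]


def transform_split(T, d, x, y):
    """Compute (Ta*d + (Tb*d >> x)) >> y; also return the peak intermediate."""
    Ta, Tb = split_matrix(T, x)
    prod_a = matvec(Ta, d)
    prod_b = matvec(Tb, d)
    sums = [a + (b >> x) for a, b in zip(prod_a, prod_b)]
    peak = max(abs(v) for v in prod_a + prod_b + sums)
    return [s >> y for s in sums], peak


def transform_direct(T, d, x, y):
    """Reference computation: (T*d) >> (x + y), with its peak intermediate."""
    prod = matvec(T, d)
    return [v >> (x + y) for v in prod], max(abs(v) for v in prod)


d = [700, 700, 700, 700]
r_split, peak_split = transform_split(T4, d, x=1, y=3)
r_direct, peak_direct = transform_direct(T4, d, x=1, y=3)
```

For this input the direct product peaks at 47600, beyond a signed 16-bit range, while the split computation peaks at 23800 and produces the same result.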
-
-
50. A computer-readable storage medium having computer-executable program instructions stored thereon for execution on a computer system to perform a method of transform coding a data block, D, representing media content, comprising:
-
calculating a transform of the data block for converting between spatial and transform domain representations of the media block, wherein a result R of the transform is related to the data block D as [equation not reproduced], where T is a matrix of transform basis functions, the calculating comprising:
performing matrix multiplications of the data block with each of first and second transform sub-component matrices, wherein the transform sub-component matrices (Ta and Tb) are related to the transform basis function matrix as T = 2^x·Ta + Tb;
shifting a product of the data block and second sub-component matrix by x bit positions;
summing a product of the data block and first sub-component matrix with the shifted product of the data block and second sub-component matrix; and
shifting a sum of the products by y bit positions to produce the result R;
whereby the headroom of the transform is extended.
-
Specification