Method and apparatus for performing fast discrete cosine transforms and fast inverse discrete cosine transforms using look-up tables
First Claim
1. A method of producing in a computer including a processing unit and a memory an output signal including a plurality of digitized signal samples by performing an Inverse Discrete Cosine Transform of an input signal including groups of N input coefficients, each input coefficient classifiable into one of a plurality of symmetry classes, the method comprising the steps of:
- precomputing for each of N look up tables, a plurality of table values equal in number to a number of possible input coefficient amplitude values times N divided by a number of symmetry classes, each table value corresponding to multiplication of an input coefficient value by a transform basis vector element so as to produce a term in one of the plurality of output signal samples;
storing the precomputed plurality of table values in the memory of the computer at addresses corresponding to the possible input coefficient amplitude values, in the N look up tables containing a plurality of entries;
receiving the plurality of input coefficients into a corresponding plurality of buffers in the memory of the computer;
operating the processing unit to look up in the at least one look up table, entries corresponding to each of the plurality of received input coefficients; and
operating the processing unit to accumulate a plurality of sums of entries looked up corresponding to each output signal sample, the step of accumulating further including the steps ofsumming results segregated by the symmetry classes into which the input coefficients are classified, andperforming at least one vector butterfly operation corresponding to the plurality of symmetry classes, so as to combine segregated results into the output signal;
whereinthe step of precomputing is performed prior to all occurrences of the step of operating the processing unit to look up.
3 Assignments
0 Petitions
Accused Products
Abstract
A method is provided for performing a fast Discrete Cosine Transform (DCT) and a fast Inverse Discrete Cosine Transform (IDCT) in a software implementation. The method provided exploits symmetries found in both the DCT and IDCT. As a result of the symmetries found in the DCT and IDCT, both transforms may be performed using a combination of look-up tables and butterfly operations, thus employing only a small number of additions and subtractions and no multiplications. Furthermore, there is provided an aspect of the present invention which exploits the excess precision available in current central processing units (CPUs) relative to the precision required by the DCT and IDCT calculations.
29 Citations
31 Claims
-
1. A method of producing in a computer including a processing unit and a memory an output signal including a plurality of digitized signal samples by performing an Inverse Discrete Cosine Transform of an input signal including groups of N input coefficients, each input coefficient classifiable into one of a plurality of symmetry classes, the method comprising the steps of:
-
precomputing for each of N look up tables, a plurality of table values equal in number to a number of possible input coefficient amplitude values times N divided by a number of symmetry classes, each table value corresponding to multiplication of an input coefficient value by a transform basis vector element so as to produce a term in one of the plurality of output signal samples; storing the precomputed plurality of table values in the memory of the computer at addresses corresponding to the possible input coefficient amplitude values, in the N look up tables containing a plurality of entries; receiving the plurality of input coefficients into a corresponding plurality of buffers in the memory of the computer; operating the processing unit to look up in the at least one look up table, entries corresponding to each of the plurality of received input coefficients; and operating the processing unit to accumulate a plurality of sums of entries looked up corresponding to each output signal sample, the step of accumulating further including the steps of summing results segregated by the symmetry classes into which the input coefficients are classified, and performing at least one vector butterfly operation corresponding to the plurality of symmetry classes, so as to combine segregated results into the output signal;
whereinthe step of precomputing is performed prior to all occurrences of the step of operating the processing unit to look up. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
-
-
11. A digital signal processing apparatus for performing an inverse discrete cosine transform, comprising:
-
means for receiving a digital input signal including a sequence of input transform coefficient values; a memory containing look up tables having a number of entries equal to a number of possible input transform coefficient values, and corresponding to values of Inverse Discrete Cosine Transform basis functions stored at addresses corresponding to all possible input transform coefficient values; means for outputting a digital output signal including a sequence of output values; a central processing unit operatively connected to the means for receiving, to the memory and to the means for outputting; and a control store of central processing unit instructions connected so as to provide the instructions to the central processing unit, the control store containing instructions to address the memory containing look up tables as a function of input transform coefficient values received by the means for receiving, instructions to sum values obtained by addressing the memory so as to form output values, and instructions to provide the output values to the means for outputting, the instructions to sum values including instructions to sum values to obtain parts of the output values corresponding to a plurality of symmetry classes; and instructions to sum values using at least one vector butterfly to obtain the output values.
-
-
12. A method of producing in a computer including a processing unit, a memory and an input/output system an output signal including a plurality of output coefficients by performing a forward Discrete Cosine Transform of a digitized input signal including groups of N samples, S where N is even, the method comprising the steps of:
-
receiving the groups of N samples through the input/output system into a corresponding plurality of buffers; executing instructions on the processing unit to decompose the received plurality of input signal samples into an odd portion including a plurality of odd amplitude values and an even portion including a plurality of even amplitude values; precomputing for each of N/2 look up tables, a plurality of table values equal in number to a number of possible odd amplitude values, each table value corresponding to multiplication of an odd amplitude value by an intermediate transform basis vector element so as to produce a term in one of the plurality of output coefficients; storing the precomputed plurality of table values in the memory of the computer at addresses corresponding to the possible odd amplitude values, in the N/2 look up tables addressable by the plurality of odd amplitude values; executing instructions on the processing unit to look up in the N/2 look up tables, entries corresponding to each of the N/2 odd values; executing instructions on the processing unit to accumulate odd output coefficients as a plurality of sums of entries looked up corresponding to each of the odd output coefficients; executing instructions on the processing unit to compute even output coefficients; and outputting as the output signal a sequence of accumulated odd and computed even output coefficients;
whereinthe step of precomputing is performed prior to all occurrences of the step of operating the processing unit to look up, wherein the step of operating the processing unit to accumulate further comprises the step of; loading a first plurality of values corresponding to a plurality of output coefficients into distinct locations within a single register having greater precision than the precision of each output coefficient; and performing in a single operation an accumulation of a second plurality of values with the first plurality of values within the single register;
wherebya plurality of output values are accumulated in the single register, simultaneously. - View Dependent Claims (13, 14, 15)
-
-
16. A method of producing in a computer including a processing unit, a memory and an input/output system an output signal including a plurality of output coefficients by performing a forward Discrete Cosine Transform of a digitized input signal including groups of N samples, where N is even, wherein the digital input sample is divisible into a most significant portion and a least significant portion, the method comprising the steps of:
-
receiving the groups of N samples through the input/output system into a corresponding plurality of buffers; executing instructions on the processing unit to decompose the received plurality of input signal samples into an odd portion including a plurality of odd amplitude values and an even portion including a plurality of even amplitude values; precomputing for each of N/2 look up tables, a plurality of table values equal in number to a number of possible odd amplitude values, each table value corresponding to multiplication of an odd amplitude value by an intermediate transform basis vector element so as to produce a term in one of the plurality of output coefficients; storing the precomputed plurality of table values in the memory of the computer at addresses corresponding to the possible odd amplitude values, in the N/2 look up tables addressable by the plurality of odd amplitude values; executing instructions on the processing unit to look up in the N/2 look up tables, entries corresponding to each of the N/2 odd values; executing instructions on the processing unit to accumulate odd output coefficients as a plurality of sums of entries looked up corresponding to each of the odd output coefficients; executing instructions on the processing unit to compute even output coefficients; and outputting as the output signal a sequence of accumulated odd and computed even output coefficients;
whereinthe step of precomputing is performed prior to all occurrences of the step of operating the processing unit to look up, and wherein the step of precomputing further comprises the steps of; precomputing at least one table addressable by the most significant portion of the digital input sample, having entries corresponding to a contribution by the most significant portion of the digital input sample to each output coefficient; and precomputing at least one table addressable by the least significant portion of the digital input sample, having entries corresponding to a contribution by the least significant portion of the digital input sample to each output coefficient; and
whereinthe step of operating the processing unit to look up further comprises the steps of; adding a bias value to the amplitude values when the input sample is signed; and addressing the at least one table addressable by the most significant portion of the input samples only when the input sample exceeds a predetermined threshold.
-
-
17. A method of producing in a computer including a processing unit, a memory and an input/output system an output signal including a plurality of output coefficients by performing a forward Discrete Cosine Transform of a digitized input signal including groups of N samples, where N is even, the method comprising the steps of:
-
receiving the groups of N samples through the input/output system into a corresponding plurality of buffers; executing instructions on the processing unit to decompose the received plurality of input signal samples into an odd portion including a plurality of odd amplitude values and an even portion including a plurality of even amplitude values; precomputing for each of N/2 look up tables, a plurality of table values equal in number to a number of possible odd amplitude values, each table value corresponding to multiplication of an odd amplitude value by an intermediate transform basis vector element so as to produce a term in one of the plurality of output coefficients; storing the precomputed plurality of table values in the memory of the computer at addresses corresponding to the possible odd amplitude values, in the N/2 look up tables addressable by the plurality of odd amplitude values; executing instructions on the processing unit to look up in the N/2 look up tables, entries corresponding to each of the N/2 odd values; executing instructions on the processing unit to accumulate odd output coefficients as a plurality of sums of entries looked up corresponding to each of the odd output coefficients; executing instructions on the processing unit to compute even output coefficients; and outputting as the output signal a sequence of accumulated odd and computed even output coefficients;
whereinthe step of precomputing is performed prior to all occurrences of the step of operating the processing unit to look up, wherein the step of executing instructions on the processing unit to compute further comprises the steps of; looking up in the at least one look-up table, entries corresponding to the plurality of even amplitude values; and accumulating the even output coefficients as a plurality of sums of corresponding table values, wherein the step of operating the processing unit to accumulate further comprises the step of; loading a first plurality of values corresponding to a plurality of output coefficients into distinct locations within a single register having greater precision than the precision of each output coefficient; and performing in a single operation an accumulation of a second plurality of values with the first plurality of values within the single register;
wherebya plurality of output values are accumulated in the single register, simultaneously. - View Dependent Claims (18, 19, 20)
-
-
21. A method of producing in a computer including a processing unit, a memory and an input/output system an output signal including a plurality of output coefficients by performing a forward Discrete Cosine Transform of a digitized input signal including groups of N samples, where N is even, wherein the digital input sample is divisible into a most significant portion and a least significant portion, the method comprising the steps of:
-
receiving the groups of N samples through the input/output system into a corresponding plurality of buffers; executing instructions on the processing unit to decompose the received plurality of input signal samples into an odd portion including a plurality of odd amplitude values and an even portion including a plurality of even amplitude values; precomputing for each of N/2 look up tables, a plurality of table values equal in number to a number of possible odd amplitude values, each table value corresponding to multiplication of an odd amplitude value by an intermediate transform basis vector element so as to produce a term in one of the plurality of output coefficients; storing the precomputed plurality of table values in the memory of the computer at addresses corresponding to the possible odd amplitude values, in the N/2 look up tables addressable by the plurality of odd amplitude values; executing instructions on the processing unit to look up in the N/2 look up tables, entries corresponding to each of the N/2 odd values; executing instructions on the processing unit to accumulate odd output coefficients as a plurality of sums of entries looked up corresponding to each of the odd output coefficients; executing instructions on the processing unit to compute even output coefficients; and outputting as the output signal a sequence of accumulated odd and computed even output coefficients;
whereinthe step of precomputing is performed prior to all occurrences of the step of operating the processing unit to look up, wherein the step of executing instructions on the processing unit to compute further comprises the steps of; looking up in the at least one look-up table, entries corresponding to the plurality of even amplitude values; and accumulating the even output coefficients as a plurality of sums of corresponding table values, and wherein the step of precomputing further comprises the steps of; precomputing at least one table addressable by the most significant portion of the digital input sample, having entries corresponding to a contribution by the most significant portion of the digital input sample to each output coefficient; and precomputing at least one table addressable by the least significant portion of the digital input sample, having entries corresponding to a contribution by the least significant portion of the digital input sample to each output coefficient; and
whereinthe step of operating the processing unit to look up further comprises the steps of; adding a bias value to the amplitude values when the input sample is signed; and addressing the at least one table addressable by the most significant portion of the input samples only when the input sample exceeds a predetermined threshold.
-
-
22. A method of producing in a computer including a processing unit, a memory and an input/output system an output signal including a plurality of output coefficients by performing a forward Discrete Cosine Transform of a digitized input signal including groups of N samples, where N is even, the method comprising the steps of:
-
receiving the groups of N samples through the input/output system into a corresponding plurality of buffers; executing instructions on the processing unit to decompose the received plurality of input signal samples into an odd portion including a plurality of odd amplitude values and an even portion including a plurality of even amplitude values; precomputing for each of N/2 look up tables, a plurality of table values equal in number to a number of possible odd amplitude values, each table value corresponding to multiplication of an odd amplitude value by an intermediate transform basis vector element so as to produce a term in one of the plurality of output coefficients; storing the precomputed plurality of table values in the memory of the computer at addresses corresponding to the possible odd amplitude values, in the N/2 look up tables addressable by the plurality of odd amplitude values; executing instructions on the processing unit to look up in the N/2 look up tables, entries corresponding to each of the N/2 odd values; executing instructions on the processing unit to accumulate odd output coefficients as a plurality of sums of entries looked up corresponding to each of the odd output coefficients; executing instructions on the processing unit to compute even output coefficients; and outputting as the output signal a sequence of accumulated odd and computed even output coefficients;
whereinthe step of precomputing is performed prior to all occurrences of the step of operating the processing unit to look up, wherein N is a multiple of 8, and wherein the step of executing instructions on the processing unit to compute further comprises the steps of; further decomposing the plurality of even values into N/4 even-even values and N/4 even-odd values; performing a butterfly operation on the plurality of N/4 even-even values; and obtaining from N/4 of the look-up tables, entries corresponding to each of the N/4 even-odd values; and forming an even-odd portion of the plurality of output coefficients as a plurality of sums of corresponding table values, wherein the step of operating the processing unit to accumulate further comprises the step of; loading a first plurality of values corresponding to a plurality of output coefficients into distinct locations within a single register having greater precision than the precision of each output coefficient; and performing in a single operation an accumulation of a second plurality of values with the first plurality of values within the single register;
wherebya plurality of output values are accumulated in the single register, simultaneously. - View Dependent Claims (23, 24, 25)
-
-
26. A method of producing in a computer including a processing unit, a memory and an input/output system an output signal including a plurality of output coefficients by performing a forward Discrete Cosine Transform of a digitized input signal including groups of N samples, where N is even, wherein the digital input sample is divisible into a most significant portion and a least significant portion, the method comprising the steps of:
-
receiving the groups of N samples through the input/output system into a corresponding plurality of buffers; executing instructions on the processing unit to decompose the received plurality of input signal samples into an odd portion including a plurality of odd amplitude values and an even portion including a plurality of even amplitude values; precomputing for each of N/2 look up tables, a plurality of table values equal in number to a number of possible odd amplitude values, each table value corresponding to multiplication of an odd amplitude value by an intermediate transform basis vector element so as to produce a term in one of the plurality of output coefficients; storing the precomputed plurality of table values in the memory of the computer at addresses corresponding to the possible odd amplitude values, in the N/2 look up tables addressable by the plurality of odd amplitude values; executing instructions on the processing unit to look up in the N/2 look up tables, entries corresponding to each of the N/2 odd values; executing instructions on the processing unit to accumulate odd output coefficients as a plurality of sums of entries looked up corresponding to each of the odd output coefficients; executing instructions on the processing unit to compute even output coefficients; and outputting as the output signal a sequence of accumulated odd and computed even output coefficients;
whereinthe step of precomputing is performed prior to all occurrences of the step of operating the processing unit to look up, wherein N is a multiple of 8, and wherein the step of executing instructions on the processing unit to compute further comprises the steps of; further decomposing the plurality of even values into N/4 even-even values and N/4 even-odd values; performing a butterfly operation on the plurality of N/4 even-even values; and obtaining from N/4 of the look-up tables, entries corresponding to each of the N/4 even-odd values; and forming an even-odd portion of the plurality of output coefficients as a plurality of sums of corresponding table values, and wherein the step of precomputing further comprises the steps of; precomputing at least one table addressable by the most significant portion of the digital input sample, having entries corresponding to a contribution by the most significant portion of the digital input sample to each output coefficient; and precomputing at least one table addressable by the least significant portion of the digital input sample, having entries corresponding to a contribution by the least significant portion of the digital input sample to each output coefficient; and
whereinthe step of operating the processing unit to look up further comprises the steps of; adding a bias value to the amplitude values when the input sample is signed; and addressing the at least one table addressable by the most significant portion of the input samples only when the input sample exceeds a predetermined threshold.
-
-
27. A method of producing in a computer including a processing unit, a memory and an input/output system an output signal including a plurality of output coefficients by performing a forward Discrete Cosine Transform of a digitized input signal including groups of N samples, where N is even, the method comprising the steps of:
-
receiving the groups of N samples through the input/output system into a corresponding plurality of buffers; executing instructions on the processing unit to decompose the received plurality of input signal samples into an odd portion including a plurality of odd amplitude values and an even portion including a plurality of even amplitude values; precomputing for each of N/2 look up tables, a plurality of table values equal in number to a number of possible odd amplitude values, each table value corresponding to multiplication of an odd amplitude value by an intermediate transform basis vector element so as to produce a term in one of the plurality of output coefficients; storing the precomputed plurality of table values in the memory of the computer at addresses corresponding to the possible odd amplitude values, in the N/2 look up tables addressable by the plurality of odd amplitude values; executing instructions on the processing unit to look up in the N/2 look up tables, entries corresponding to each of the N/2 odd values; executing instructions on the processing unit to accumulate odd output coefficients as a plurality of sums of entries looked up corresponding to each of the odd output coefficients; executing instructions on the processing unit to compute even output coefficients; and outputting as the output signal a sequence of accumulated odd and computed even output coefficients;
whereinthe step of precomputing is performed prior to all occurrences of the step of operating the processing unit to look up, wherein the step of executing instructions on the processing unit to decompose further comprises performing a butterfly operation on the plurality of input signal samples, wherein the step of operating the processing unit to accumulate further comprises the step of; loading a first plurality of values corresponding to a plurality of output coefficients into distinct locations within a single register having greater precision than the precision of each output coefficient; and performing in a single operation an accumulation of a second plurality of values with the first plurality of values within the single register;
wherebya plurality of output values are accumulated in the single register, simultaneously. - View Dependent Claims (28, 29, 30)
-
-
31. A method of producing in a computer including a processing unit, a memory and an input/output system an output signal including a plurality of output coefficients by performing a forward Discrete Cosine Transform of a digitized input signal including groups of N samples, where N is even, wherein the digital input sample is divisible into a most significant portion and a least significant portion, the method comprising the steps of:
-
receiving the groups of N samples through the input/output system into a corresponding plurality of buffers; executing instructions on the processing unit to decompose the received plurality of input signal samples into an odd portion including a plurality of odd amplitude values and an even portion including a plurality of even amplitude values; precomputing for each of N/2 look up tables, a plurality of table values equal in number to a number of possible odd amplitude values, each table value corresponding to multiplication of an odd amplitude value by an intermediate transform basis vector element so as to produce a term in one of the plurality of output coefficients; storing the precomputed plurality of table values in the memory of the computer at addresses corresponding to the possible odd amplitude values, in the N/2 look up tables addressable by the plurality of odd amplitude values; executing instructions on the processing unit to look up in the N/2 look up tables, entries corresponding to each of the N/2 odd values; executing instructions on the processing unit to accumulate odd output coefficients as a plurality of sums of entries looked up corresponding to each of the odd output coefficients; executing instructions on the processing unit to compute even output coefficients; and outputting as the output signal a sequence of accumulated odd and computed even output coefficients;
whereinthe step of precomputing is performed prior to all occurrences of the step of operating the processing unit to look up, wherein the step of executing instructions on the processing unit to decompose further comprises performing a butterfly operation on the plurality of input signal samples, and wherein the step of precomputing further comprises the steps of; precomputing at least one table addressable by the most significant portion of the digital input sample, having entries corresponding to a contribution by the most significant portion of the digital input sample to each output coefficient; and precomputing at least one table addressable by the least significant portion of the digital input sample, having entries corresponding to a contribution by the least significant portion of the digital input sample to each output coefficient; and
whereinthe step of operating the processing unit to look up further comprises the steps of; adding a bias value to the amplitude values when the input sample is signed; and addressing the at least one table addressable by the most significant portion of the input samples only when the input sample exceeds a predetermined threshold.
-
Specification