Feedforward prediction of scalefactors based on allowable distortion for noise shaping in psychoacoustic-based compression
Abstract
A method of encoding a digital signal, particularly an audio signal, which predicts favorable scalefactors for different frequency subbands of the signal. Distortion thresholds which are associated with each of the frequency subbands of the signal are used, along with transform coefficients, to calculate total scaling values, one for each of the frequency subbands, such that the product of a transform coefficient for a given subband with its respective total scaling value is less than a corresponding one of the distortion thresholds. In an audio encoding application, the distortion thresholds are based on psychoacoustic masking. The invention may use a novel approximation for calculating the total scaling values, which obtains a first term based on a corresponding distortion threshold, and obtains a second term based on a sum of the transform coefficients. Both of these terms may be obtained using lookup tables. The total scaling values can be normalized to yield scalefactors by identifying one of the total scaling values as a minimum nonzero value, and using that minimum nonzero value to carry out normalization. Encoding of the signal further includes the steps of setting a global gain factor to this minimum nonzero value, and quantizing the transform coefficients using the global gain factor and the scalefactors.
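The lookup-table idea in the abstract can be illustrated with a minimal sketch. It assumes that the "first term" is the threshold factor (1/M)^(2/3) and the "second term" is the cube root of the coefficient sum, matching the factors of the claimed equation. The table resolution, the table range, the log2-based indexing, and all function names are hypothetical choices made for illustration and are not taken from the patent.

```python
import math

STEPS_PER_OCTAVE = 8          # hypothetical table resolution
MIN_EXP, MAX_EXP = -40, 40    # hypothetical range: tables cover 2**-40 .. 2**40


def _build_table(power):
    """Tabulate v**power at log-spaced points v = 2**(MIN_EXP + k/STEPS_PER_OCTAVE)."""
    n = (MAX_EXP - MIN_EXP) * STEPS_PER_OCTAVE + 1
    return [2.0 ** (power * (MIN_EXP + k / STEPS_PER_OCTAVE)) for k in range(n)]


# Assumed first term: (1/M)**(2/3), indexed by the distortion threshold M.
THRESHOLD_TABLE = _build_table(-2.0 / 3.0)
# Assumed second term: S**(1/3), indexed by the coefficient sum S.
SUM_TABLE = _build_table(1.0 / 3.0)


def _lookup(table, value):
    """Return the table entry nearest to `value` on the log2 grid (clamped to the table)."""
    if value <= 0:
        raise ValueError("value must be positive")
    idx = round((math.log2(value) - MIN_EXP) * STEPS_PER_OCTAVE)
    idx = max(0, min(len(table) - 1, idx))
    return table[idx]


def approx_total_scaling_value(coeff_sum, bandwidth, threshold):
    """Approximate the total scaling value from two table lookups and a per-band constant."""
    band_term = 2.0 * (4.0 / (9.0 * bandwidth)) ** (2.0 / 3.0)
    return band_term * _lookup(THRESHOLD_TABLE, threshold) * _lookup(SUM_TABLE, coeff_sum)
```

With eight entries per octave, nearest-entry lookup keeps the result within about 5% of the directly evaluated expression for inputs inside the table range. A fixed-point encoder would more likely derive the index from the position of the leading bit than from a call to `math.log2`.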
30 Claims
1. A method of determining scalefactors used to encode a signal, comprising the steps of:
associating a plurality of distortion thresholds, respectively, with a plurality of frequency scalefactor bands of the signal;
transforming the signal to yield a plurality of sets of transform coefficients, one set for each of the frequency scalefactor bands; and
calculating a plurality of total scaling values, one for each of the frequency scalefactor bands, such that an anticipated distortion based on the product of a transform coefficient for a given scalefactor band with its respective total scaling value is less than a corresponding one of the distortion thresholds; and
wherein a given total scaling value $A_{sfb}$ for a particular frequency scalefactor band is calculated according to the equation:
$$A_{sfb} = 2\left[\frac{4}{9\,BW_{sfb}}\right]^{2/3} \left(\frac{1}{M_{sfb}}\right)^{2/3} \left(\sum_i x_i\right)^{1/3},$$
where $BW_{sfb}$ is the bandwidth of the particular frequency scalefactor band, $M_{sfb}$ is the corresponding distortion threshold, and $\sum_i x_i$ is the sum of all of the transform coefficients for the particular scalefactor band.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
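As a rough illustration of the equation recited in claim 1, the following sketch evaluates the total scaling value for a single scalefactor band. The function name and argument names are hypothetical. The claim speaks only of "the sum of all of the transform coefficients"; summing magnitudes here is an assumption made so that the cube root stays real for signed coefficients.

```python
def total_scaling_value(coeffs, bandwidth, threshold):
    """Evaluate A = 2 * (4 / (9 * BW))**(2/3) * (1 / M)**(2/3) * (sum x_i)**(1/3)."""
    if bandwidth <= 0 or threshold <= 0:
        raise ValueError("bandwidth and threshold must be positive")
    # Assumption: coefficient magnitudes are summed (not stated in the claim text).
    coeff_sum = sum(abs(x) for x in coeffs)
    band_term = (4.0 / (9.0 * bandwidth)) ** (2.0 / 3.0)
    threshold_term = (1.0 / threshold) ** (2.0 / 3.0)
    return 2.0 * band_term * threshold_term * coeff_sum ** (1.0 / 3.0)
```

For example, with a bandwidth of 12, a threshold of 8, and coefficients `[9.0, -9.0, 9.0]`, the three factors are 1/9, 1/4, and 3, so the result is approximately 1/6. A larger threshold (more allowable distortion) yields a smaller scaling value, and so coarser quantization, under this reading.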
10. A method of encoding an audio signal, comprising the steps of:
identifying a plurality of frequency scalefactor bands of the audio signal;
associating a plurality of distortion thresholds, respectively, with the plurality of frequency scalefactor bands of the audio signal, the distortion levels being based on a psychoacoustic mask;
transforming the audio signal to yield a plurality of transform coefficients, one for each of the frequency scalefactor bands;
calculating a plurality of total scaling values, one for each of the frequency scalefactor bands, based on the distortion thresholds and the transform coefficients;
normalizing at least one of the total scaling values using a minimum nonzero one of the total scaling values, to yield a respective plurality of scalefactors, one for each scalefactor band;
setting a global gain factor to the minimum nonzero total scaling value;
quantizing the transform coefficients using the global gain factor and the scalefactors, to yield an output bit stream;
computing a number of bits required from said quantizing step;
comparing the number of required bits to a predetermined number of available bits; and
packing the output bit stream into a frame; and
wherein a given total scaling value $A_{sfb}$ for a particular frequency scalefactor band is calculated according to the equation:
$$A_{sfb} = 2\left[\frac{4}{9\,BW_{sfb}}\right]^{2/3} \left(\frac{1}{M_{sfb}}\right)^{2/3} \left(\sum_i x_i\right)^{1/3},$$
where $BW_{sfb}$ is the bandwidth of the particular frequency scalefactor band, $M_{sfb}$ is the corresponding distortion threshold, and $\sum_i x_i$ is the sum of all of the transform coefficients for the particular scalefactor band.
Dependent claims: 11, 12
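The normalization, global-gain, quantization, and bit-budget steps of claim 10 can be sketched as follows. This is a minimal sketch under stated assumptions: normalization is taken to be division by the minimum nonzero total scaling value, the quantizer is a plain rounding of the scaled coefficient, and the bit count is a crude sign-plus-magnitude estimate. None of these specifics come from the claim text, all function names are hypothetical, and frame packing is omitted.

```python
def normalize_scaling_values(total_scaling):
    """Return (global_gain, scalefactors) from per-band total scaling values."""
    nonzero = [a for a in total_scaling if a > 0]
    if not nonzero:
        raise ValueError("at least one total scaling value must be nonzero")
    global_gain = min(nonzero)  # the minimum nonzero total scaling value
    # Assumption: normalizing means dividing each value by that minimum.
    scalefactors = [a / global_gain for a in total_scaling]
    return global_gain, scalefactors


def quantize_bands(bands, global_gain, scalefactors):
    """Hypothetical quantizer: round each coefficient scaled by gain * scalefactor."""
    return [[round(x * global_gain * sf) for x in band]
            for band, sf in zip(bands, scalefactors)]


def estimate_bits(quantized):
    """Hypothetical cost model: one sign bit plus the magnitude bits of each value."""
    return sum(1 + abs(q).bit_length() for band in quantized for q in band)


def fits_budget(quantized, available_bits):
    """Compare the estimated number of required bits with the available bits."""
    return estimate_bits(quantized) <= available_bits
```

Under the division assumption, the product of the global gain and a band's scalefactor reproduces that band's total scaling value, so the quantizer sees the same overall scaling that the distortion thresholds called for.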
13. A device for encoding a signal, comprising:
means for associating a plurality of distortion thresholds, respectively, with a plurality of frequency scalefactor bands of the signal;
means for transforming the signal to yield a plurality of transform coefficients, one for each of the frequency scalefactor bands; and
means for calculating a plurality of total scaling values, one for each of the frequency scalefactor bands, such that an anticipated distortion based on the product of a transform coefficient for a given scalefactor band with its respective total scaling value is less than a corresponding one of the distortion thresholds; and
wherein a given total scaling value $A_{sfb}$ for a particular frequency scalefactor band is calculated according to the equation:
$$A_{sfb} = 2\left[\frac{4}{9\,BW_{sfb}}\right]^{2/3} \left(\frac{1}{M_{sfb}}\right)^{2/3} \left(\sum_i x_i\right)^{1/3},$$
where $BW_{sfb}$ is the bandwidth of the particular frequency scalefactor band, $M_{sfb}$ is the corresponding distortion threshold, and $\sum_i x_i$ is the sum of all of the transform coefficients for the particular scalefactor band.
Dependent claims: 14
15. An audio encoder comprising:
an input for receiving an audio signal;
a psychoacoustic mask providing a plurality of distortion thresholds, respectively, for a plurality of frequency scalefactor bands of the audio signal;
a frequency transform which operates on the audio signal to yield a plurality of transform coefficients, one for each of the frequency scalefactor bands; and
a quantizer which calculates a plurality of total scaling values, one for each of the frequency scalefactor bands, such that an anticipated distortion based on the product of a transform coefficient for a given scalefactor band with its respective total scaling value is less than a corresponding one of the distortion thresholds; and
wherein a given total scaling value $A_{sfb}$ for a particular frequency scalefactor band is calculated according to the equation:
$$A_{sfb} = 2\left[\frac{4}{9\,BW_{sfb}}\right]^{2/3} \left(\frac{1}{M_{sfb}}\right)^{2/3} \left(\sum_i x_i\right)^{1/3},$$
where $BW_{sfb}$ is the bandwidth of the particular frequency scalefactor band, $M_{sfb}$ is the corresponding distortion threshold, and $\sum_i x_i$ is the sum of all of the transform coefficients for the particular scalefactor band.
Dependent claims: 16, 17, 18, 19, 20, 21
22. A computer program product comprising:
a computer-readable storage medium; and
program instructions stored on said storage medium for calculating a plurality of total scaling values associated with different frequency scalefactor bands of a signal, using transform coefficients of the signal and distortion thresholds for each frequency scalefactor band, such that the product of a transform coefficient for a given scalefactor band with its respective total scaling value is less than a corresponding one of the distortion thresholds; and
wherein said program instructions calculate a given total scaling value $A_{sfb}$ for a particular frequency scalefactor band according to the equation:
$$A_{sfb} = 2\left[\frac{4}{9\,BW_{sfb}}\right]^{2/3} \left(\frac{1}{M_{sfb}}\right)^{2/3} \left(\sum_i x_i\right)^{1/3},$$
where $BW_{sfb}$ is the bandwidth of the particular frequency scalefactor band, $M_{sfb}$ is the corresponding distortion threshold, and $\sum_i x_i$ is the sum of all of the transform coefficients for the particular scalefactor band.
Dependent claims: 23, 24, 25, 26, 27, 28, 29, 30
Specification