Low time-delay transform coder, decoder, and encoder/decoder for high-quality audio
Abstract
A low bit-rate (192 kbits per second) transform encoder/decoder system (44.1 kHz or 48 kHz sampling rate) for high-quality music applications employs short time-domain sample blocks (128 samples/block) so that the system signal propagation delay is short enough for real-time aural feedback to a human operator. Carefully designed pairs of analysis/synthesis windows are used to achieve sufficient transform frequency selectivity despite the use of short sample blocks. A synthesis window in the decoder has characteristics such that the product of its response and that of an analysis window in the encoder produces a composite response which sums to unity for two adjacent overlapped sample blocks. Adjacent time-domain signal sample blocks are overlapped and added to cancel the effects of the analysis and synthesis windows. A technique is provided for deriving suitable analysis/synthesis window pairs. In the encoder, a discrete transform having a function equivalent to the alternate application of a modified Discrete Cosine Transform and a modified Discrete Sine Transform according to the Time Domain Aliasing Cancellation technique or, alternatively, a Discrete Fourier Transform is used to generate frequency-domain transform coefficients. The transform coefficients are nonuniformly quantized by assigning a fixed number of bits and a variable number of bits determined adaptively based on psychoacoustic masking. A technique is described for assigning the fixed bit and adaptive bit allocations. The transmission of side information regarding adaptively allocated bits is not required. Error codes and protected data may be scattered throughout formatted frame outputs from the encoder in order to reduce sensitivity to noise bursts.
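As a rough illustration of why 128-sample blocks are considered short, the arithmetic below converts the block length and the two sampling rates stated in the abstract into a block duration. The function name is hypothetical, and the result is only the duration of one block; the total system propagation delay also depends on buffering, overlap, and processing time, which are not modeled here.

```python
BLOCK_LEN = 128  # samples per block, as stated in the abstract


def block_duration_ms(sample_rate_hz: float, block_len: int = BLOCK_LEN) -> float:
    """Duration of one sample block in milliseconds (hypothetical helper)."""
    return 1000.0 * block_len / sample_rate_hz


# The two sampling rates named in the abstract.
for rate in (44_100, 48_000):
    print(f"{rate} Hz: {block_duration_ms(rate):.3f} ms per block")
```

Under these figures a single block spans a little under 3 ms at either sampling rate.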
158 Claims
-
1. An encoder for the encoding of audio information comprising signal samples, said encoder comprising
means for receiving said signal samples, subband means, including adaptive bit allocation means, for defining subbands and for generating subband information in response to said signal samples, said subband information for each of said subbands including one or more digital words, each of said digital words comprising an adaptive portion and a non-adaptive portion, wherein coding accuracy of said adaptive portion is established by said adaptive bit allocation means, and formatting means for assembling digital information including said subband information into a digital output having a format suitable for transmission or storage.
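The idea of a digital word with a non-adaptive portion and an adaptive portion can be sketched as follows. This is a minimal illustration, not the patented quantizer: it assumes a uniform quantizer over [-1, 1) and assumes the non-adaptive portion occupies the most significant bits of the word, with the adaptively allocated bits appended as extra low-order precision. All function names are hypothetical.

```python
def quantize(value: float, total_bits: int) -> int:
    """Uniformly quantize a value in [-1, 1) to an unsigned code (assumed scheme)."""
    levels = 1 << total_bits
    code = int((value + 1.0) / 2.0 * levels)
    return min(max(code, 0), levels - 1)


def split_word(code: int, adaptive_bits: int) -> tuple:
    """Split a code into (non-adaptive, adaptive) portions.

    Assumption: the non-adaptive portion is the high-order bits and the
    adaptive portion is the low-order bits.
    """
    non_adaptive = code >> adaptive_bits
    adaptive = code & ((1 << adaptive_bits) - 1)
    return non_adaptive, adaptive


def join_word(non_adaptive: int, adaptive: int, adaptive_bits: int) -> int:
    """Rebuild the full code from its two portions."""
    return (non_adaptive << adaptive_bits) | adaptive


fixed_bits, adaptive_bits = 4, 2          # illustrative allocation only
code = quantize(0.3, fixed_bits + adaptive_bits)
head, tail = split_word(code, adaptive_bits)
print(code, head, tail)
```

In this sketch, a larger adaptive allocation for a subband simply lengthens the low-order portion, which is how the adaptive bit allocation establishes the coding accuracy of that portion.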
-
32. An encoder for the encoding of audio information comprising signal samples, said encoder having a short signal propagation delay, comprising
means for receiving and grouping said signal samples into overlapping signal sample blocks, the length of the overlap constituting an overlap interval, said signal sample blocks having a time period resulting in a signal propagation delay short enough so that an encoding/decoding system employing the encoder is usable for real-time aural feedback to a human operator, analysis-window means for weighting each signal sample block by an analysis window, wherein said analysis window constitutes one window of an analysis-synthesis window pair, wherein the product of both windows in said window pair is equal to a product window prederived from an analysis-only window permitting the design of a filter bank in which transform-based digital filters have the ability to trade off steepness of transition band rolloff against depth of stopband rejection in the filter characteristics, and wherein said product window overlapped with itself sums to a constant value across the overlap interval, means for generating transform coefficients by applying a discrete transform function to each of said analysis-window weighted signal sample blocks, means for quantizing each of said transform coefficients, and formatting means for assembling the quantized transform coefficients into a digital output having a format suitable for transmission or storage.
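The requirement that the product window, overlapped with itself, sums to a constant across the overlap interval can be checked numerically. The sketch below substitutes a sine window for both the analysis and synthesis windows purely for illustration; the claim instead refers to a product window prederived from an analysis-only window, and that derivation is not reproduced here. The 128-sample length comes from the abstract, the 50% overlap is an assumption, and the function names are hypothetical.

```python
import math


def sine_window(length: int) -> list:
    """Sine window, used here only as a stand-in analysis/synthesis window."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def product_window(analysis: list, synthesis: list) -> list:
    """Sample-by-sample product of the analysis and synthesis windows."""
    return [a * s for a, s in zip(analysis, synthesis)]


def overlap_sums(product: list, overlap: int) -> list:
    """Tail of one block's product window plus the head of the next block's."""
    start = len(product) - overlap
    return [product[start + n] + product[n] for n in range(overlap)]


N = 128                       # block length from the abstract
window = sine_window(N)
sums = overlap_sums(product_window(window, window), N // 2)  # assumed 50% overlap
print(min(sums), max(sums))   # both should be very close to 1.0
```

Because the product of two sine windows is a squared sine, the overlapped halves add as sin² + cos², which is why the sum is constant for this particular stand-in pair.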
-
40. A decoder for the reproduction of audio information comprising signal samples from a coded signal including digital information, said decoder comprising
deformatting means, including adaptive bit allocation means, for defining subbands and for deriving subband information in response to said coded signal, and for reconstructing digital words using said derived subband information, said digital words comprising an adaptive portion and a non-adaptive portion, wherein coding accuracy of said adaptive portion is established by said adaptive bit allocation means, inverse subband means for generating signal samples in response to said subband information, and means for generating said reproduction of audio information in response to said signal samples.
-
60. A decoder according to claim 59 wherein said deformatting means reconstructs each digital word from bits representing said non-adaptive portion and bits representing said one or more exponents which occupy positions in said subband information block ahead of bits representing said adaptive portion.
-
70. A decoder according to claim 69 wherein said deformatting means reconstructs each digital word from bits representing said non-adaptive portion which occupy positions in said subband information block ahead of bits representing said adaptive portion.
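The ordering described here, with non-adaptive bits positioned ahead of adaptive bits in the subband information block, can be illustrated with a small bit-string packer. This is a sketch under several assumptions: every word has the same number of non-adaptive bits, the non-adaptive portion is the high-order part of each word, and the adaptive allocation is passed in directly rather than re-derived by the decoder. The function names and the block layout are hypothetical.

```python
def pack_block(words: list, fixed_bits: int, adaptive_alloc: list) -> str:
    """Pack words so all non-adaptive portions precede all adaptive portions."""
    head = "".join(
        format(w >> a, f"0{fixed_bits}b") for w, a in zip(words, adaptive_alloc)
    )
    tail = "".join(
        format(w & ((1 << a) - 1), f"0{a}b")
        for w, a in zip(words, adaptive_alloc)
        if a > 0  # words with no adaptive bits contribute nothing to the tail
    )
    return head + tail


def unpack_block(bits: str, fixed_bits: int, adaptive_alloc: list) -> list:
    """Rebuild each word: read the non-adaptive bits first, then the adaptive bits."""
    count = len(adaptive_alloc)
    heads = [
        int(bits[i * fixed_bits:(i + 1) * fixed_bits], 2) for i in range(count)
    ]
    pos = count * fixed_bits
    words = []
    for head, a in zip(heads, adaptive_alloc):
        tail = int(bits[pos:pos + a], 2) if a else 0
        pos += a
        words.append((head << a) | tail)
    return words


alloc = [2, 0, 4]                 # illustrative adaptive bits per word
packed = pack_block([41, 5, 200], fixed_bits=4, adaptive_alloc=alloc)
print(packed, unpack_block(packed, 4, alloc))
```

With this layout the reader can consume every non-adaptive portion before it needs to know how many adaptive bits belong to each word.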
-
71. A decoder for the reproduction of audio information comprising signal samples from a coded signal generated by an encoder that groups said signal samples into overlapping signal sample blocks, the length of the overlap constituting an overlap interval, weights each sample block with an analysis window, generates transform coefficients by applying a discrete transform to the analysis-window weighted signal sample blocks, quantizes each transform coefficient and assembles the quantized transform coefficients into a digital output having a format suitable for transmission or storage, said decoder comprising
means for receiving said digital output for deriving said quantized transform coefficients therefrom, means for reconstructing decoded transform coefficients from the deformatted quantized transform coefficients, means for generating signal sample blocks by applying an inverse discrete transform function to said decoded transform coefficients, said inverse discrete transform having characteristics inverse to those of said discrete transform in the encoder, said signal sample blocks having a time period resulting in a signal propagation delay short enough so that an encoding/decoding system employing the decoder is usable for real-time aural feedback to a human operator, synthesis window means for weighting the signal sample blocks by a synthesis window, wherein a product window equal to the product of said synthesis window and said analysis window is prederived from an analysis-only window permitting the design of a filter bank in which transform-based digital filters have the ability to trade off steepness of transition band rolloff against depth of stopband rejection in the filter characteristics, and wherein said product window overlapped with itself sums to a constant value across the overlap interval, and means for cancelling the weighting effects of the analysis window means and the synthesis window means to recover said signal samples by adding overlapped signal sample blocks across said overlap interval.
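The final step, cancelling the window weighting by adding overlapped blocks, can be sketched end to end if the transform, quantization, and inverse transform are replaced by an identity stand-in. That substitution is a deliberate simplification: it isolates the windowing and overlap-add behaviour and says nothing about aliasing cancellation or quantization error. The sine window, the 50% overlap, and the function names are assumptions for illustration.

```python
import math


def sine_window(length: int) -> list:
    """Stand-in window used for both analysis and synthesis (assumption)."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def overlap_add_roundtrip(signal: list, block_len: int = 128) -> list:
    """Window each block twice (analysis, then synthesis) and overlap-add."""
    hop = block_len // 2  # assumed 50% overlap
    window = sine_window(block_len)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - block_len + 1, hop):
        block = signal[start:start + block_len]
        analysed = [s * w for s, w in zip(block, window)]
        # Transform, quantization, and inverse transform would go here;
        # this sketch passes the block through unchanged.
        synthesised = [a * w for a, w in zip(analysed, window)]
        for i, value in enumerate(synthesised):
            out[start + i] += value
    return out


signal = [math.sin(0.1 * i) for i in range(512)]
recovered = overlap_add_roundtrip(signal)
# Samples covered by two overlapping blocks are recovered; the first and
# last half-blocks are covered only once and remain window-weighted.
error = max(abs(a - b) for a, b in zip(signal[64:448], recovered[64:448]))
print(error)
```

Only the fully overlapped region is compared, since the edges of a finite signal lack a neighbouring block to complete the sum.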
-
79. An encoding method for the encoding of audio information comprising signal samples, said encoding method comprising
receiving said signal samples, defining subbands and generating subband information in response to said signal samples, said subband information for each of said subbands including one or more digital words, each of said digital words comprising an adaptive portion and a non-adaptive portion, wherein coding accuracy of said adaptive portion is established by adaptive bit allocating, and assembling digital information including said subband information into a digital output having a format suitable for transmission or storage.
-
110. An encoding method for the encoding of audio information comprising signal samples, said encoding method having a short signal propagation delay, comprising
receiving and grouping said signal samples into overlapping signal sample blocks, the length of the overlap constituting an overlap interval, said signal sample blocks having a time period resulting in a signal propagation delay short enough so that an encoding/decoding method employing the encoding method is usable for real-time aural feedback to a human operator, weighting each signal sample block by an analysis window, wherein said analysis window constitutes one window of an analysis-synthesis window pair, wherein the product of both windows in said window pair is equal to a product window prederived from an analysis-only window permitting the design of a filter bank in which transform-based digital filters have the ability to trade off steepness of transition band rolloff against depth of stopband rejection in the filter characteristics, and wherein said product window overlapped with itself sums to a constant value across the overlap interval, generating transform coefficients by applying a discrete transform function to each of said analysis-window weighted signal sample blocks, quantizing each of said transform coefficients, and assembling the quantized transform coefficients into a digital output having a format suitable for transmission or storage.
-
118. A decoding method for the reproduction of audio information comprising signal samples from a coded signal including digital information, said decoding method comprising
defining subbands and deriving subband information in response to said coded signal, and reconstructing digital words using said derived subband information, said digital words comprising an adaptive portion and a non-adaptive portion, wherein coding accuracy of said adaptive portion is established by adaptive bit allocating, generating signal samples in response to said subband information, and generating said reproduction of audio information in response to said signal samples.
-
138. A decoding method according to claim 137 wherein said reconstructing digital words reconstructs each digital word from bits representing said non-adaptive portion and bits representing said one or more exponents which occupy positions in said subband information block ahead of bits representing said adaptive portion.
-
148. A decoding method according to claim 147 wherein said reconstructing digital words reconstructs each digital word from bits representing said non-adaptive portion which occupy positions in said subband information block ahead of bits representing said adaptive portion.
-
149. A decoding method for the reproduction of audio information comprising signal samples from a coded signal generated by an encoding method that groups said signal samples into overlapping signal sample blocks, the length of the overlap constituting an overlap interval, weights each sample block with an analysis window, generates transform coefficients by applying a discrete transform to the analysis-window weighted signal sample blocks, quantizes each transform coefficient and assembles the quantized transform coefficients into a digital output having a format suitable for transmission or storage, said decoding method comprising
receiving said digital output for deriving said quantized transform coefficients therefrom, reconstructing decoded transform coefficients from the deformatted quantized transform coefficients, generating signal sample blocks by applying an inverse discrete transform function to said decoded transform coefficients, said inverse discrete transform having characteristics inverse to those of said discrete transform in the encoding method, said signal sample blocks having a time period resulting in a signal propagation delay short enough so that an encoding/decoding method employing the decoding method is usable for real-time aural feedback to a human operator, weighting the signal sample blocks by a synthesis window, wherein a product window equal to the product of said synthesis window and said analysis window is prederived from an analysis-only window permitting the design of a filter bank in which transform-based digital filters have the ability to trade off steepness of transition band rolloff against depth of stopband rejection in the filter characteristics, and wherein said product window overlapped with itself sums to a constant value across the overlap interval, and cancelling the weighting effects of the analysis window and the synthesis window to recover said signal samples by adding overlapped signal sample blocks across said overlap interval.
-
157. A method for defining coding information which defines the coding accuracy of digital words representing spectral information in a plurality of frequency subbands, said digital words generated in response to an input signal by a split-band encoder comprising a filter bank, wherein said coding information comprises a nonadaptive coding accuracy, said method comprising
(1) obtaining a predicted quantizing noise spectrum of said split-band encoder for a frequency subband based upon a representative frequency response of said filter bank for said frequency subband, (2) generating a subband value equal to the number of bits required to quantize spectral energy within said frequency subband such that said predicted quantizing noise spectrum does not exceed a representative psychoacoustic masking threshold for spectral energy within said frequency subband, (3) setting said nonadaptive coding accuracy for said frequency subband equal to or less than said subband value, and (4) reiterating the previous steps for each of said plurality of frequency subbands.
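Steps (2) through (4) can be sketched with a very coarse noise model. The sketch assumes that each additional bit lowers quantizing noise by about 6.02 dB relative to the subband's peak level, and it takes the masking threshold for each subband as a given number; step (1), predicting the noise spectrum from the filter bank's frequency response, is not modeled. The function names, the cap on fixed bits, and the example levels are all hypothetical.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit (assumed noise model)


def required_bits(peak_level_db: float, masking_threshold_db: float) -> int:
    """Step (2): bits needed so the assumed noise does not exceed the threshold."""
    margin_db = peak_level_db - masking_threshold_db
    return max(0, math.ceil(margin_db / DB_PER_BIT))


def nonadaptive_allocation(subbands: list, max_fixed_bits: int) -> list:
    """Steps (3) and (4): per subband, a value equal to or less than step (2)."""
    return [
        min(required_bits(level, threshold), max_fixed_bits)
        for level, threshold in subbands
    ]


# Illustrative (peak level dB, masking threshold dB) pairs, not measured data.
example = [(60.0, 30.0), (20.0, 30.0), (90.0, 30.0)]
print(nonadaptive_allocation(example, max_fixed_bits=6))
```

Capping the result with `min` is one way to satisfy "equal to or less than said subband value"; any shortfall would then have to be covered by adaptively allocated bits.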
Specification