System and method for non-destructively normalizing loudness of audio signals within portable devices
Abstract
Many portable playback devices cannot decode and play back encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A playback device may also adaptively apply gain and limiting to the playback audio. Implementations in encoders, in transcoders and in decoders are disclosed.
24 Claims
1. A method for decoding an encoded input signal to generate an audio output signal, wherein the method comprises:
receiving the encoded input signal that includes encoded audio information and associated metadata including one or more decoding-control parameters and one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile and optionally including one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more first parameters have values that were set according to an encoding process that generated the encoded audio information to represent aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level, and wherein the one or more second parameters have values that were set according to the encoding process that generated the encoded audio information to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level that is higher than the first reference reproduction level;
applying a decoding process to the encoded audio information to obtain subband signals representing spectral content of the aural stimuli, wherein the decoding process is adapted in response to the one or more decoding-control parameters;
modifying the subband signals to obtain modified subband signals with changed dynamic range characteristics, wherein the modifying is adapted in response to the one or more second parameters if the metadata includes the one or more second parameters or is adapted in response to the one or more first parameters if the metadata does not include the one or more second parameters;
applying a synthesis filter bank to the modified subband signals to obtain a time-domain audio signal; and
if the metadata does not include the one or more second parameters, applying a gain and a limiter to the time-domain audio signal in response to the metadata, wherein the application of the gain modifies the time-domain audio signal to obtain the audio output signal with amplitudes for playback at the second reference reproduction level, and wherein the application of the limiter prevents the amplitudes of the audio output signal from exceeding the clipping level.
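The decoding steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Metadata` layout (one gain in dB per subband), the reference levels of -31 and -11, the sample-wise sum standing in for the synthesis filter bank, and the hard clamp standing in for the limiter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

CLIP_LEVEL = 1.0  # assumption: full scale is +/-1.0 in linear amplitude


@dataclass
class Metadata:
    # Hypothetical layout: one DRC gain (dB) per subband for each profile.
    first_drc_db: List[float]
    second_drc_db: Optional[List[float]] = None  # second profile may be absent
    first_ref_db: float = -31.0   # assumed first reference reproduction level
    second_ref_db: float = -11.0  # assumed (higher) second reference level


def db_to_lin(db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def apply_drc(subbands: List[List[float]], gains_db: List[float]) -> List[List[float]]:
    """Scale each subband by its DRC gain."""
    return [[s * db_to_lin(g) for s in band] for band, g in zip(subbands, gains_db)]


def synthesize(subbands: List[List[float]]) -> List[float]:
    """Stand-in for a synthesis filter bank: sum the subbands sample by sample."""
    return [sum(samples) for samples in zip(*subbands)]


def decode(subbands: List[List[float]], meta: Metadata) -> List[float]:
    has_second = meta.second_drc_db is not None
    # Use the second profile when present, otherwise fall back to the first.
    gains = meta.second_drc_db if has_second else meta.first_drc_db
    signal = synthesize(apply_drc(subbands, gains))
    if not has_second:
        # Raise the level to the second reference level, then clamp so that
        # no sample exceeds the clipping level (a crude stand-in for a limiter).
        boost = db_to_lin(meta.second_ref_db - meta.first_ref_db)
        signal = [max(-CLIP_LEVEL, min(CLIP_LEVEL, x * boost)) for x in signal]
    return signal
```

With the assumed default levels, a stream that carries only first-profile parameters receives a 20 dB boost and is then clamped, whereas a stream that carries second-profile parameters passes through without the extra gain and limiting stage. A production limiter would normally apply smoothed gain reduction rather than a hard clamp.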
3. A method for encoding an audio input signal representing aural stimuli, wherein the method comprises:
receiving the audio input signal;
applying an analysis filter bank to the audio input signal to generate subband signals representing spectral content of the audio input signal;
analyzing one or more signals derived from the audio input signal to calculate metadata including one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile and one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more first parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level, and wherein the one or more second parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level;
applying an encoding process to the subband signals to obtain encoded audio information; and
assembling the encoded audio information and the metadata into an encoded output signal having a format suitable for transmission or storage, wherein the one or more second parameters represent differences between corresponding parameters for the first dynamic range compression profile and the second dynamic range compression profile.
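The final limitation of claim 3, in which the second parameters are carried as differences from the corresponding first parameters, can be sketched as follows. The function names and the choice of one gain value in dB per parameter are hypothetical; the claim does not fix a parameter format.

```python
from typing import List


def encode_second_as_differences(first: List[float], second: List[float]) -> List[float]:
    """Represent the second profile as per-parameter differences from the first."""
    if len(first) != len(second):
        raise ValueError("profiles must have the same number of parameters")
    return [s - f for f, s in zip(first, second)]


def reconstruct_second(first: List[float], diffs: List[float]) -> List[float]:
    """Decoder side: recover the second profile from the first plus the differences."""
    return [f + d for f, d in zip(first, diffs)]


# Hypothetical profile values (gains in dB), chosen only for illustration.
first_profile = [0.0, -3.0, -6.0]
second_profile = [0.0, -4.5, -9.0]
diffs = encode_second_as_differences(first_profile, second_profile)
```

When the two profiles are similar, the differences tend to be small values, which is one plausible reason to transmit them in place of absolute values.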
5. A method for transcoding an encoded input signal to generate an encoded output signal, wherein the method comprises:
receiving the encoded input signal that includes first encoded audio information and associated metadata including one or more decoding-control parameters and one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile, wherein the one or more first parameters have values that were set according to a first encoding process that generated the first encoded audio information to represent aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level;
applying a decoding process to the first encoded audio information to obtain subband signals representing spectral content of the aural stimuli, wherein the decoding process is adapted in response to the one or more decoding-control parameters;
analyzing one or more signals obtained from the subband signals to calculate one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more second parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level; and
assembling second encoded audio information, the one or more first parameters and the one or more second parameters into an encoded output signal having a format suitable for transmission or storage, wherein the second encoded audio information is an encoded representation of the subband signals.
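The transcoding flow of claim 5 can be sketched in Python under strong simplifying assumptions: the second profile is reduced to a single gain in dB, the peak is estimated from a sample-wise sum of the decoded subband signals, and the "re-encoding" is a plain copy. The names `second_profile_gain_db` and `transcode`, and the dictionary layout of the output, are hypothetical.

```python
import math
from typing import Dict, List


def second_profile_gain_db(subbands: List[List[float]], first_ref_db: float,
                           second_ref_db: float, clip_level: float = 1.0) -> float:
    """Gain (dB) that keeps the peak at or below clip_level after the level
    is raised from the first to the second reference reproduction level."""
    # Crude peak estimate: sum the subbands sample by sample.
    peak = max((abs(sum(samples)) for samples in zip(*subbands)), default=0.0)
    if peak == 0.0:
        return 0.0
    boost_db = second_ref_db - first_ref_db
    headroom_db = 20.0 * math.log10(clip_level / peak)
    # Attenuate only as much as needed; never amplify.
    return min(0.0, headroom_db - boost_db)


def transcode(subbands: List[List[float]], first_params: List[float],
              first_ref_db: float, second_ref_db: float) -> Dict[str, list]:
    """Assemble an output that keeps the first parameters and adds second ones."""
    return {
        "audio": [list(band) for band in subbands],  # placeholder for re-encoding
        "first_params": list(first_params),          # carried over unchanged
        "second_params": [second_profile_gain_db(subbands, first_ref_db, second_ref_db)],
    }
```

In this sketch the first parameters pass through untouched, so a decoder that only understands the first profile can still use the transcoded output.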
9. An apparatus for decoding an encoded input signal to generate an audio output signal, wherein the apparatus comprises:
means for receiving the encoded input signal that includes encoded audio information and associated metadata including one or more decoding-control parameters and one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile and optionally including one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more first parameters have values that were set according to an encoding process that generated the encoded audio information to represent aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level, and wherein the one or more second parameters have values that were set according to the encoding process that generated the encoded audio information to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level that is higher than the first reference reproduction level;
means for applying a decoding process to the encoded audio information to obtain subband signals representing spectral content of the aural stimuli, wherein the decoding process is adapted in response to the one or more decoding-control parameters;
means for modifying the subband signals to obtain modified subband signals with changed dynamic range characteristics, wherein the modifying is adapted in response to the one or more second parameters if the metadata includes the one or more second parameters or is adapted in response to the one or more first parameters if the metadata does not include the one or more second parameters;
means for applying a synthesis filter bank to the modified subband signals to obtain a time-domain audio signal; and
means for applying a gain and a limiter to the time-domain audio signal in response to the metadata if the metadata does not include the one or more second parameters, wherein the application of the gain modifies the time-domain audio signal to obtain the audio output signal with amplitudes for playback at the second reference reproduction level, and wherein the application of the limiter prevents the amplitudes of the audio output signal from exceeding the clipping level.
11. An apparatus for encoding an audio input signal representing aural stimuli, wherein the apparatus comprises:
means for receiving the audio input signal;
means for applying an analysis filter bank to the audio input signal to generate subband signals representing spectral content of the audio input signal;
means for analyzing one or more signals derived from the audio input signal to calculate metadata including one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile and one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more first parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level, and wherein the one or more second parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level;
means for applying an encoding process to the subband signals to obtain encoded audio information; and
means for assembling the encoded audio information and the metadata into an encoded output signal having a format suitable for transmission or storage, wherein the one or more second parameters represent differences between corresponding parameters for the first dynamic range compression profile and the second dynamic range compression profile.
13. An apparatus for transcoding an encoded input signal to generate an encoded output signal, wherein the apparatus comprises:
means for receiving the encoded input signal that includes first encoded audio information and associated metadata including one or more decoding-control parameters and one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile, wherein the one or more first parameters have values that were set according to a first encoding process that generated the first encoded audio information to represent aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level;
means for applying a decoding process to the first encoded audio information to obtain subband signals representing spectral content of the aural stimuli, wherein the decoding process is adapted in response to the one or more decoding-control parameters;
means for analyzing one or more signals obtained from the subband signals to calculate one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more second parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level; and
means for assembling second encoded audio information, the one or more first parameters and the one or more second parameters into an encoded output signal having a format suitable for transmission or storage, wherein the second encoded audio information is an encoded representation of the subband signals.
17. A non-transitory medium recording a program of instructions that is executable by a device to perform a method for decoding an encoded input signal to generate an audio output signal, wherein the method comprises:
receiving the encoded input signal that includes encoded audio information and associated metadata including one or more decoding-control parameters and one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile and optionally including one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more first parameters have values that were set according to an encoding process that generated the encoded audio information to represent aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level, and wherein the one or more second parameters have values that were set according to the encoding process that generated the encoded audio information to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level that is higher than the first reference reproduction level;
applying a decoding process to the encoded audio information to obtain subband signals representing spectral content of the aural stimuli, wherein the decoding process is adapted in response to the one or more decoding-control parameters;
modifying the subband signals to obtain modified subband signals with changed dynamic range characteristics, wherein the modifying is adapted in response to the one or more second parameters if the metadata includes the one or more second parameters or is adapted in response to the one or more first parameters if the metadata does not include the one or more second parameters;
applying a synthesis filter bank to the modified subband signals to obtain a time-domain audio signal; and
if the metadata does not include the one or more second parameters, applying a gain and a limiter to the time-domain audio signal in response to the metadata, wherein the application of the gain modifies the time-domain audio signal to obtain the audio output signal with amplitudes for playback at the second reference reproduction level, and wherein the application of the limiter prevents the amplitudes of the audio output signal from exceeding the clipping level.
19. A non-transitory medium recording a program of instructions that is executable by a device to perform a method for encoding an audio input signal representing aural stimuli, wherein the method comprises:
receiving the audio input signal;
applying an analysis filter bank to the audio input signal to generate subband signals representing spectral content of the audio input signal;
analyzing one or more signals derived from the audio input signal to calculate metadata including one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile and one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more first parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level, and wherein the one or more second parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level;
applying an encoding process to the subband signals to obtain encoded audio information; and
assembling the encoded audio information and the metadata into an encoded output signal having a format suitable for transmission or storage, wherein the one or more second parameters represent differences between corresponding parameters for the first dynamic range compression profile and the second dynamic range compression profile.
21. A non-transitory medium recording a program of instructions that is executable by a device to perform a method for transcoding an encoded input signal to generate an encoded output signal, wherein the method comprises:
receiving the encoded input signal that includes first encoded audio information and associated metadata including one or more decoding-control parameters and one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile, wherein the one or more first parameters have values that were set according to a first encoding process that generated the first encoded audio information to represent aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level;
applying a decoding process to the first encoded audio information to obtain subband signals representing spectral content of the aural stimuli, wherein the decoding process is adapted in response to the one or more decoding-control parameters;
analyzing one or more signals obtained from the subband signals to calculate one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more second parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level; and
assembling second encoded audio information, the one or more first parameters and the one or more second parameters into an encoded output signal having a format suitable for transmission or storage, wherein the second encoded audio information is an encoded representation of the subband signals.
Specification