System and method for non-destructively normalizing loudness of audio signals within portable devices

US 8,903,729 B2
Filed: 02/03/2011
Issued: 12/02/2014
Est. Priority Date: 02/11/2010
Status: Active Grant

Abstract

Many portable playback devices cannot decode and play back encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A playback device may also adaptively apply gain and limiting to the playback audio. Implementations in encoders, in transcoders and in decoders are disclosed.

31 Citations

24 Claims

1. A method for decoding an encoded input signal to generate an audio output signal, wherein the method comprises:
- receiving the encoded input signal that includes encoded audio information and associated metadata including one or more decoding-control parameters and one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile and optionally including one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more first parameters have values that were set according to an encoding process that generated the encoded audio information to represent aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level, and wherein the one or more second parameters have values that were set according to the encoding process that generated the encoded audio information to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level that is higher than the first reference reproduction level;
  
  applying a decoding process to the encoded audio information to obtain subband signals representing spectral content of the aural stimuli, wherein the decoding process is adapted in response to the one or more decoding-control parameters;
  
  modifying the subband signals to obtain modified subband signals with changed dynamic range characteristics, wherein the modifying is adapted in response to the one or more second parameters if the metadata includes the one or more second parameters or is adapted in response to the one or more first parameters if the metadata does not include the one or more second parameters;
  
  applying a synthesis filter bank to the modified subband signals to obtain a time-domain audio signal; and
  
  if the metadata does not include the one or more second parameters, applying a gain and a limiter to the time-domain audio signal in response to the metadata, wherein the application of the gain modifies the time-domain audio signal to obtain the audio output signal with amplitudes for playback at the second reference reproduction level, and wherein the application of the limiter prevents the amplitudes of the audio output signal from exceeding the clipping level.
- View Dependent Claims (2)
- - 2. The method of claim 1, wherein the encoded input signal conforms to the ATSC Standard, the MPEG-2 AAC Standard, or the MPEG-4 Audio Standard, the first reference reproduction level corresponds to an amplitude 20 dB below the clipping level, and the second reference reproduction level corresponds to an amplitude 11 dB below the clipping level.
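
The decoder-side selection in claims 1 and 2 can be illustrated with a minimal Python sketch. It is not the patented implementation. The metadata keys `drc_first` and `drc_second`, the per-band gains in dB, the summing `toy_synthesis` stand-in for a synthesis filter bank, and the hard-clipping limiter are all assumptions made for illustration. The makeup gain is assumed to be the 9 dB difference between the 20 dB and 11 dB reference levels of claim 2.

```python
CLIP_LEVEL = 1.0  # assumed full-scale amplitude (hypothetical normalization)

def db_to_linear(db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def toy_synthesis(subbands):
    # Stand-in for a real synthesis filter bank: sums the bands sample by sample.
    return [sum(samples) for samples in zip(*subbands)]

def hard_limit(x, clip=CLIP_LEVEL):
    # Crude limiter (assumption): a practical limiter would smooth its gain changes.
    return max(-clip, min(clip, x))

def decode_tail(subbands, metadata, first_ref_db=-20.0, second_ref_db=-11.0):
    """Apply DRC in the subband domain, synthesize, then gain and limit if needed."""
    second = metadata.get("drc_second")       # hypothetical key, may be absent
    has_second = second is not None
    gains_db = second if has_second else metadata["drc_first"]

    # Modify each subband with the gain of the selected compression profile.
    modified = [[s * db_to_linear(g) for s in band]
                for band, g in zip(subbands, gains_db)]
    pcm = toy_synthesis(modified)

    # Only when the second profile is missing: raise the level and limit.
    if not has_second:
        makeup = db_to_linear(second_ref_db - first_ref_db)  # 9 dB by default
        pcm = [hard_limit(x * makeup) for x in pcm]
    return pcm
```

With only first-profile parameters present, the output is boosted and clamped to the clipping level. When second-profile parameters are present, the gain and limiter stage is skipped.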

3. A method for encoding an audio input signal representing aural stimuli, wherein the method comprises:
- receiving the audio input signal;
  
  applying an analysis filter bank to the audio input signal to generate subband signals representing spectral content of the audio input signal;
  
  analyzing one or more signals derived from the audio input signal to calculate metadata including one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile and one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more first parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level, and wherein the one or more second parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level;
  
  applying an encoding process to the subband signals to obtain encoded audio information; and
  
  assembling the encoded audio information and the metadata into an encoded output signal having a format suitable for transmission or storage, wherein the one or more second parameters represent differences between corresponding parameters for the first dynamic range compression profile and the second dynamic range compression profile.
- View Dependent Claims (4)
- - 4. The method of claim 3, wherein the encoded output signal conforms to the ATSC Standard, the MPEG-2 AAC Standard, or the MPEG-4 Audio Standard, the first reference reproduction level corresponds to an amplitude 20 dB below the clipping level, and the second reference reproduction level corresponds to an amplitude 11 dB below the clipping level.
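
The differential representation at the end of claim 3 can be sketched as follows, under stated assumptions: each profile is a list of gains in dB, and the sign convention is second minus first. The function names are hypothetical and are not taken from the patent or from any codec standard.

```python
def encode_second_profile(first_db, second_db):
    """Express second-profile gains as differences from the first profile.

    Assumed convention: delta = second - first, one value per parameter.
    """
    if len(first_db) != len(second_db):
        raise ValueError("profiles must have the same number of parameters")
    return [s - f for f, s in zip(first_db, second_db)]

def decode_second_profile(first_db, deltas_db):
    """Recover absolute second-profile gains from the first profile and the deltas."""
    if len(first_db) != len(deltas_db):
        raise ValueError("profiles must have the same number of parameters")
    return [f + d for f, d in zip(first_db, deltas_db)]
```

A decoder that reads both the first parameters and the differences can rebuild the second profile, which is why the claims always carry the first parameters alongside the second.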

5. A method for transcoding an encoded input signal to generate an encoded output signal, wherein the method comprises:
- receiving the encoded input signal that includes first encoded audio information and associated metadata including one or more decoding-control parameters and one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile, wherein the one or more first parameters have values that were set according to a first encoding process that generated the first encoded audio information to represent aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level;
  
  applying a decoding process to the first encoded audio information to obtain subband signals representing spectral content of the aural stimuli, wherein the decoding process is adapted in response to the one or more decoding-control parameters;
  
  analyzing one or more signals obtained from the subband signals to calculate one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more second parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level; and
  
  assembling second encoded audio information, the one or more first parameters and the one or more second parameters into an encoded output signal having a format suitable for transmission or storage, wherein the second encoded audio information is an encoded representation of the subband signals.
- View Dependent Claims (6, 7, 8)
- - 6. The method of claim 5, wherein the one or more second parameters represent differences between corresponding parameters for the first dynamic range compression profile and the second dynamic range compression profile.
  - 7. The method of claim 5 that comprises applying a synthesis filter bank to the subband signals to obtain the one or more signals that are analyzed to calculate the one or more second parameters specifying dynamic range compression.
  - 8. The method of claim 5, wherein the encoded input signal conforms to the ATSC Standard, the MPEG-2 AAC Standard, or the MPEG-4 Audio Standard, and the first reference reproduction level corresponds to an amplitude 20 dB below the clipping level, and wherein the encoded output signal conforms to the ATSC Standard, the MPEG-2 AAC Standard, or the MPEG-4 Audio Standard, and the second reference reproduction level corresponds to an amplitude 11 dB below the clipping level.
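
One way to picture the analysis step of claim 5 is a per-frame peak check. This is a simplified sketch, not the analysis the patent discloses. The peak-based rule, the 9 dB `boost_db` default (taken from the 20 dB and 11 dB levels in claim 8), and the metadata keys are assumptions.

```python
import math

def frame_gain_db(frame, boost_db=9.0, clip_level=1.0):
    """Gain (dB, never positive) that keeps a boosted frame at or below clipping."""
    peak = max((abs(x) for x in frame), default=0.0)
    if peak == 0.0:
        return 0.0                      # silent frame needs no attenuation
    peak_db = 20.0 * math.log10(peak / clip_level)
    overshoot_db = peak_db + boost_db   # peak level after raising the reference level
    return -overshoot_db if overshoot_db > 0.0 else 0.0

def transcode_metadata(frames, first_params_db):
    """Keep the first parameters and add newly calculated second parameters."""
    second = [frame_gain_db(f) for f in frames]
    return {"drc_first": list(first_params_db), "drc_second": second}
```

A frame that already peaks at full scale receives 9 dB of attenuation, while a quiet frame is left unchanged. The first parameters pass through untouched, as in the assembling step of claim 5.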

9. An apparatus for decoding an encoded input signal to generate an audio output signal, wherein the apparatus comprises:
- means for receiving the encoded input signal that includes encoded audio information and associated metadata including one or more decoding-control parameters and one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile and optionally including one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more first parameters have values that were set according to an encoding process that generated the encoded audio information to represent aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level, and wherein the one or more second parameters have values that were set according to the encoding process that generated the encoded audio information to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level that is higher than the first reference reproduction level;
  
  means for applying a decoding process to the encoded audio information to obtain subband signals representing spectral content of the aural stimuli, wherein the decoding process is adapted in response to the one or more decoding-control parameters;
  
  means for modifying the subband signals to obtain modified subband signals with changed dynamic range characteristics, wherein the modifying is adapted in response to the one or more second parameters if the metadata includes the one or more second parameters or is adapted in response to the one or more first parameters if the metadata does not include the one or more second parameters;
  
  means for applying a synthesis filter bank to the modified subband signals to obtain a time-domain audio signal; and
  
  means for applying a gain and a limiter to the time-domain audio signal in response to the metadata if the metadata does not include the one or more second parameters, wherein the application of the gain modifies the time-domain audio signal to obtain the audio output signal with amplitudes for playback at the second reference reproduction level, and wherein the application of the limiter prevents the amplitudes of the audio output signal from exceeding the clipping level.
- View Dependent Claims (10)
- - 10. The apparatus of claim 9, wherein the encoded input signal conforms to the ATSC Standard, the MPEG-2 AAC Standard, or the MPEG-4 Audio Standard, the first reference reproduction level corresponds to an amplitude 20 dB below the clipping level, and the second reference reproduction level corresponds to an amplitude 11 dB below the clipping level.

11. An apparatus for encoding an audio input signal representing aural stimuli, wherein the apparatus comprises:
- means for receiving the audio input signal;
  
  means for applying an analysis filter bank to the audio input signal to generate subband signals representing spectral content of the audio input signal;
  
  means for analyzing one or more signals derived from the audio input signal to calculate metadata including one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile and one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more first parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level, and wherein the one or more second parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level;
  
  means for applying an encoding process to the subband signals to obtain encoded audio information; and
  
  means for assembling the encoded audio information and the metadata into an encoded output signal having a format suitable for transmission or storage, wherein the one or more second parameters represent differences between corresponding parameters for the first dynamic range compression profile and the second dynamic range compression profile.
- View Dependent Claims (12)
- - 12. The apparatus of claim 11, wherein the encoded output signal conforms to the ATSC Standard, the MPEG-2 AAC Standard, or the MPEG-4 Audio Standard, the first reference reproduction level corresponds to an amplitude 20 dB below the clipping level, and the second reference reproduction level corresponds to an amplitude 11 dB below the clipping level.

13. An apparatus for transcoding an encoded input signal to generate an encoded output signal, wherein the apparatus comprises:
- means for receiving the encoded input signal that includes first encoded audio information and associated metadata including one or more decoding-control parameters and one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile, wherein the one or more first parameters have values that were set according to a first encoding process that generated the first encoded audio information to represent aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level;
  
  means for applying a decoding process to the first encoded audio information to obtain subband signals representing spectral content of the aural stimuli, wherein the decoding process is adapted in response to the one or more decoding-control parameters;
  
  means for analyzing one or more signals obtained from the subband signals to calculate one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more second parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level; and
  
  means for assembling second encoded audio information, the one or more first parameters and the one or more second parameters into an encoded output signal having a format suitable for transmission or storage, wherein the second encoded audio information is an encoded representation of the subband signals.
- View Dependent Claims (14, 15, 16)
- - 14. The apparatus of claim 13, wherein the one or more second parameters represent differences between corresponding parameters for the first dynamic range compression profile and the second dynamic range compression profile.
  - 15. The apparatus of claim 13 that comprises means for applying a synthesis filter bank to the subband signals to obtain the one or more signals that are analyzed to calculate the one or more second parameters specifying dynamic range compression.
  - 16. The apparatus of claim 13, wherein the encoded input signal conforms to the ATSC Standard, the MPEG-2 AAC Standard, or the MPEG-4 Audio Standard, and the first reference reproduction level corresponds to an amplitude 20 dB below the clipping level, and wherein the encoded output signal conforms to the ATSC Standard, the MPEG-2 AAC Standard, or the MPEG-4 Audio Standard, and the second reference reproduction level corresponds to an amplitude 11 dB below the clipping level.

17. A non-transitory medium recording a program of instructions that is executable by a device to perform a method for decoding an encoded input signal to generate an audio output signal, wherein the method comprises:
- receiving the encoded input signal that includes encoded audio information and associated metadata including one or more decoding-control parameters and one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile and optionally including one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more first parameters have values that were set according to an encoding process that generated the encoded audio information to represent aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level, and wherein the one or more second parameters have values that were set according to the encoding process that generated the encoded audio information to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level that is higher than the first reference reproduction level;
  
  applying a decoding process to the encoded audio information to obtain subband signals representing spectral content of the aural stimuli, wherein the decoding process is adapted in response to the one or more decoding-control parameters;
  
  modifying the subband signals to obtain modified subband signals with changed dynamic range characteristics, wherein the modifying is adapted in response to the one or more second parameters if the metadata includes the one or more second parameters or is adapted in response to the one or more first parameters if the metadata does not include the one or more second parameters;
  
  applying a synthesis filter bank to the modified subband signals to obtain a time-domain audio signal; and
  
  if the metadata does not include the one or more second parameters, applying a gain and a limiter to the time-domain audio signal in response to the metadata, wherein the application of the gain modifies the time-domain audio signal to obtain the audio output signal with amplitudes for playback at the second reference reproduction level, and wherein the application of the limiter prevents the amplitudes of the audio output signal from exceeding the clipping level.
- View Dependent Claims (18)
- - 18. The medium of claim 17, wherein the encoded input signal conforms to the ATSC Standard, the MPEG-2 AAC Standard, or the MPEG-4 Audio Standard, the first reference reproduction level corresponds to an amplitude 20 dB below the clipping level, and the second reference reproduction level corresponds to an amplitude 11 dB below the clipping level.

19. A non-transitory medium recording a program of instructions that is executable by a device to perform a method for encoding an audio input signal representing aural stimuli, wherein the method comprises:
- receiving the audio input signal;
  
  applying an analysis filter bank to the audio input signal to generate subband signals representing spectral content of the audio input signal;
  
  analyzing one or more signals derived from the audio input signal to calculate metadata including one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile and one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more first parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level, and wherein the one or more second parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level;
  
  applying an encoding process to the subband signals to obtain encoded audio information; and
  
  assembling the encoded audio information and the metadata into an encoded output signal having a format suitable for transmission or storage, wherein the one or more second parameters represent differences between corresponding parameters for the first dynamic range compression profile and the second dynamic range compression profile.
- View Dependent Claims (20)
- - 20. The medium of claim 19, wherein the encoded output signal conforms to the ATSC Standard, the MPEG-2 AAC Standard, or the MPEG-4 Audio Standard, the first reference reproduction level corresponds to an amplitude 20 dB below the clipping level, and the second reference reproduction level corresponds to an amplitude 11 dB below the clipping level.

21. A non-transitory medium recording a program of instructions that is executable by a device to perform a method for transcoding an encoded input signal to generate an encoded output signal, wherein the method comprises:
- receiving the encoded input signal that includes first encoded audio information and associated metadata including one or more decoding-control parameters and one or more first parameters specifying dynamic range compression according to a first dynamic range compression profile, wherein the one or more first parameters have values that were set according to a first encoding process that generated the first encoded audio information to represent aural stimuli with amplitudes that do not exceed a clipping level for playback at a first reference reproduction level;
  
  applying a decoding process to the first encoded audio information to obtain subband signals representing spectral content of the aural stimuli, wherein the decoding process is adapted in response to the one or more decoding-control parameters;
  
  analyzing one or more signals obtained from the subband signals to calculate one or more second parameters specifying dynamic range compression according to a second dynamic range compression profile, wherein the one or more second parameters have values that are set to represent the aural stimuli with amplitudes that do not exceed the clipping level for playback at a second reference reproduction level; and
  
  assembling second encoded audio information, the one or more first parameters and the one or more second parameters into an encoded output signal having a format suitable for transmission or storage, wherein the second encoded audio information is an encoded representation of the subband signals.
- View Dependent Claims (22, 23, 24)
- - 22. The medium of claim 21, wherein the one or more second parameters represent differences between corresponding parameters for the first dynamic range compression profile and the second dynamic range compression profile.
  - 23. The medium of claim 21, wherein the method comprises applying a synthesis filter bank to the subband signals to obtain the one or more signals that are analyzed to calculate the one or more second parameters specifying dynamic range compression.
  - 24. The medium of claim 21, wherein the encoded input signal conforms to the ATSC Standard, the MPEG-2 AAC Standard, or the MPEG-4 Audio Standard, and the first reference reproduction level corresponds to an amplitude 20 dB below the clipping level, and wherein the encoded output signal conforms to the ATSC Standard, the MPEG-2 AAC Standard, or the MPEG-4 Audio Standard, and the second reference reproduction level corresponds to an amplitude 11 dB below the clipping level.

Current Assignee
Dolby International AB (Dolby Laboratories Incorporated), Dolby Laboratories Licensing Corporation (Dolby Laboratories Incorporated)
Original Assignee
Dolby International AB (Dolby Laboratories Incorporated), Dolby Laboratories Licensing Corporation (Dolby Laboratories Incorporated)
Inventors
Riedmiller, Jeffrey Charles; Schug, Michael; Wolters, Martin; Mundt, Harald Helge
Primary Examiner(s)
Saint Cyr, Leonard

Application Number
US13/576,386
Publication Number
US 20120310654A1
Time in Patent Office
1,398 Days
Field of Search
704/500-504
US Class Current
704/500
CPC Class Codes

G10L 19/02   using spectral analysis, e....

G10L 19/0208   Subband vocoders

G10L 19/167   Audio streaming, i.e. forma...

G10L 19/22   Mode decision, i.e. based o...

G10L 19/26   Pre-filtering or post-filte...

H03G 3/3089   Control of digital or coded...

H03G 3/32   the control being dependent...

H03G 7/007   of digital or coded signals
