Method for Correcting Metadata Affecting the Playback Loudness of Audio Information

US 20100250258A1
Filed: 09/24/2009
Published: 09/30/2010
Est. Priority Date: 07/01/2004
Status: Active Grant

Abstract

A coded signal conveys encoded audio information and metadata that may be used to control the loudness of the audio information during its playback. If the values for these metadata parameters are set incorrectly, annoying fluctuations in loudness during playback can result. The present invention overcomes this problem by detecting incorrect metadata parameter values in the signal and replacing the incorrect values with corrected values.
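To make the problem concrete, the following is a minimal sketch of how a wrong metadata value can shift playback loudness. It assumes a playback device that applies a gain equal to a fixed target level minus the level indicated in the metadata; the target value, the function names, and the example programs are all hypothetical and are not taken from the patent.

```python
# Hypothetical playback target level in dB; not specified by the source.
TARGET_DB = -31.0


def normalization_gain_db(indicated_level_db, target_db=TARGET_DB):
    """Gain a playback device would apply, assuming it trusts the metadata."""
    return target_db - indicated_level_db


def played_loudness_db(actual_level_db, indicated_level_db, target_db=TARGET_DB):
    """Loudness the listener hears after metadata-driven normalization."""
    return actual_level_db + normalization_gain_db(indicated_level_db, target_db)


# (name, actual loudness, loudness indicated by metadata), all assumed values.
programs = [
    ("program A", -24.0, -24.0),  # metadata matches the audio
    ("program B", -20.0, -31.0),  # metadata understates the loudness
]

for name, actual, indicated in programs:
    print(name, played_loudness_db(actual, indicated))
```

Under these assumptions, program A plays at the target level while program B plays 11 dB above it, which is the kind of jump between programs that corrected metadata is meant to prevent.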

12 Claims

1. A method for correcting playback loudness of audio information, wherein the method comprises steps that:
- receive an input signal that conveys data representing a first loudness normalization level and first encoded audio information, wherein the data conveyed by the input signal was produced by an encoding process that generated the first encoded audio information according to psychoacoustic principles;
  
  obtain segments of decoded audio information from an application of a decoding process to the input signal;
  
  identify which of the segments of decoded audio information are predominantly speech;
  
  obtain a respective measure of loudness for each of the segments of audio information from an analysis of the decoded audio information that accounts for presence or absence of speech and derive a second loudness normalization level for each segment from its respective measure of loudness;
  
  generate an output signal that conveys data representing a third loudness normalization level and segments of third encoded audio information representing the segments of decoded audio information in an encoded form, wherein:
  
  if a difference between the first and second loudness normalization levels does not exceed a threshold, the third loudness level represents the first loudness normalization level, and the third encoded audio information represents the first encoded audio information, and
  
  if the difference between the first and second loudness normalization levels exceeds the threshold, the third loudness level is derived from the second loudness normalization level.
- Dependent claims (2, 3, 4)
  - 2. The method of claim 1 wherein, for each segment of decoded audio information that is predominantly speech, the respective measure of loudness represents loudness of the speech in the segment, and for each segment of decoded audio information that is not predominantly speech, the respective measure of loudness represents an average loudness of the audio information.
  - 3. The method of claim 1 wherein, if the difference between the first and second loudness normalization levels exceeds the threshold, the third encoded audio information is generated by encoding the decoded audio information according to psychoacoustic principles.
  - 4. The method of claim 1 wherein, if the difference between the first and second loudness normalization levels exceeds the threshold, the third encoded audio information represents the first encoded audio information.
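The threshold decision in claim 1, together with the two alternatives in claims 3 and 4, can be illustrated with a short sketch. This is only one possible reading: the `Segment` type, the `reencode` stand-in, the 3 dB default threshold, and the use of an absolute difference are assumptions made for illustration, and "derived from" is reduced here to copying the second level.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    first_level_db: float   # normalization level conveyed by the input signal
    second_level_db: float  # level derived from the measured loudness
    first_encoded: bytes    # encoded audio as received
    decoded: list           # decoded samples for this segment


def reencode(decoded):
    """Hypothetical stand-in for a perceptual encoder (claim 3 path)."""
    return b"reencoded:" + str(len(decoded)).encode("ascii")


def correct_segment(seg, threshold_db=3.0, reencode_on_change=False):
    """Return (third_level_db, third_encoded) for one segment.

    threshold_db is an assumed value; the claims do not specify one.
    """
    if abs(seg.first_level_db - seg.second_level_db) <= threshold_db:
        # Metadata is close enough: pass level and encoded audio through.
        return seg.first_level_db, seg.first_encoded
    # Metadata is off: take the new level from the measurement.
    third_level = seg.second_level_db
    if reencode_on_change:
        third_encoded = reencode(seg.decoded)  # claim 3 alternative
    else:
        third_encoded = seg.first_encoded      # claim 4 alternative
    return third_level, third_encoded


ok = Segment(-27.0, -25.0, b"enc", [0.0] * 4)
bad = Segment(-31.0, -20.0, b"enc", [0.0] * 4)
print(correct_segment(ok))
print(correct_segment(bad))
print(correct_segment(bad, reencode_on_change=True))
```

Keeping the original encoded audio when only the metadata changes (the claim 4 path) avoids a second pass through a lossy encoder, which is presumably why both alternatives are claimed.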

5. An apparatus for correcting playback loudness of audio information, wherein the apparatus comprises:
- means for receiving an input signal that conveys data representing a first loudness normalization level and first encoded audio information, wherein the data conveyed by the input signal was produced by an encoding process that generated the first encoded audio information according to psychoacoustic principles;
  
  means for obtaining segments of decoded audio information from an application of a decoding process to the input signal;
  
  means for identifying which of the segments of decoded audio information are predominantly speech;
  
  means for obtaining a respective measure of loudness for each of the segments of audio information from an analysis of the decoded audio information that accounts for presence or absence of speech and derive a second loudness normalization level for each segment from its respective measure of loudness;
  
  means for generating an output signal that conveys data representing a third loudness normalization level and segments of third encoded audio information representing the segments of decoded audio information in an encoded form, wherein:
  
  if a difference between the first and second loudness normalization levels does not exceed a threshold, the third loudness level represents the first loudness normalization level, and the third encoded audio information represents the first encoded audio information, and
  
  if the difference between the first and second loudness normalization levels exceeds the threshold, the third loudness level is derived from the second loudness normalization level.
- Dependent claims (6, 7, 8)
  - 6. The apparatus of claim 5 wherein, for each segment of decoded audio information that is predominantly speech, the respective measure of loudness represents loudness of the speech in the segment, and for each segment of decoded audio information that is not predominantly speech, the respective measure of loudness represents an average loudness of the audio information.
  - 7. The apparatus of claim 5 wherein, if the difference between the first and second loudness normalization levels exceeds the threshold, the third encoded audio information is generated by encoding the decoded audio information according to psychoacoustic principles.
  - 8. The apparatus of claim 5 wherein, if the difference between the first and second loudness normalization levels exceeds the threshold, the third encoded audio information represents the first encoded audio information.
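The speech-dependent measure described in claim 6 (and in the parallel method and medium claims) might be sketched as follows. The claims do not say how loudness is computed or how "predominantly speech" is decided, so this sketch assumes per-frame mean-square power as a crude loudness proxy and a simple majority vote over per-frame speech flags; both choices, and the function name, are hypothetical.

```python
import math


def segment_loudness_db(frame_powers, speech_flags):
    """Return (loudness_db, predominantly_speech) for one segment.

    frame_powers: mean-square power per frame (linear scale).
    speech_flags: one boolean per frame from some speech classifier.
    """
    if not frame_powers or len(frame_powers) != len(speech_flags):
        raise ValueError("need equal-length, non-empty frame lists")

    speech_powers = [p for p, is_speech in zip(frame_powers, speech_flags) if is_speech]
    # Assumed rule: a segment is "predominantly speech" if most frames are speech.
    predominantly_speech = len(speech_powers) > len(frame_powers) / 2

    # Speech segments: measure only the speech frames.
    # Other segments: average over every frame.
    selected = speech_powers if predominantly_speech else list(frame_powers)
    mean_power = sum(selected) / len(selected)

    # Floor the power to avoid log10(0) on digital silence.
    return 10.0 * math.log10(max(mean_power, 1e-12)), predominantly_speech


# A loud non-speech frame does not raise the measure of a speech segment.
print(segment_loudness_db([0.01, 0.01, 1.0], [True, True, False]))
# A mostly non-speech segment is averaged over all frames.
print(segment_loudness_db([0.1, 0.1, 0.1, 0.1], [True, False, False, False]))
```

Anchoring the measure to speech when speech dominates keeps loud effects or music beds from skewing the level that listeners tend to judge by.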

9. A storage medium recording a program of instructions that is executable by a device to perform a method for correcting playback loudness of audio information, wherein the method comprises steps that:
- receive an input signal that conveys data representing a first loudness normalization level and first encoded audio information, wherein the data conveyed by the input signal was produced by an encoding process that generated the first encoded audio information according to psychoacoustic principles;
  
  obtain segments of decoded audio information from an application of a decoding process to the input signal;
  
  identify which of the segments of decoded audio information are predominantly speech;
  
  obtain a respective measure of loudness for each of the segments of audio information from an analysis of the decoded audio information that accounts for presence or absence of speech and derive a second loudness normalization level for each segment from its respective measure of loudness;
  
  generate an output signal that conveys data representing a third loudness normalization level and segments of third encoded audio information representing the segments of decoded audio information in an encoded form, wherein:
  
  if a difference between the first and second loudness normalization levels does not exceed a threshold, the third loudness level represents the first loudness normalization level, and the third encoded audio information represents the first encoded audio information, and
  
  if the difference between the first and second loudness normalization levels exceeds the threshold, the third loudness level is derived from the second loudness normalization level.
- Dependent claims (10, 11, 12)
  - 10. The medium of claim 9 wherein, for each segment of decoded audio information that is predominantly speech, the respective measure of loudness represents loudness of the speech in the segment, and for each segment of decoded audio information that is not predominantly speech, the respective measure of loudness represents an average loudness of the audio information.
  - 11. The medium of claim 9 wherein, if the difference between the first and second loudness normalization levels exceeds the threshold, the third encoded audio information is generated by encoding the decoded audio information according to psychoacoustic principles.
  - 12. The medium of claim 9 wherein, if the difference between the first and second loudness normalization levels exceeds the threshold, the third encoded audio information represents the first encoded audio information.

Current Assignee: Dolby Laboratories Licensing Corporation (Dolby Laboratories Incorporated)
Original Assignee: Dolby Laboratories Licensing Corporation (Dolby Laboratories Incorporated)
Inventors: Riedmiller, Jeffrey Charles; Smithers, Michael John; Robinson, Charles Quito; Crockett, Brett Graham

Granted Patent: US 8,032,385 B2
US Class Current: 704/500
CPC Class Codes:
G10L 25/00 Speech or voice analysis te...
H03G 9/005 of digital or coded signals
