Method for Correcting Metadata Affecting the Playback Loudness of Audio Information
First Claim
1. A method for correcting playback loudness of audio information, wherein the method comprises steps that:
receive an input signal that conveys data representing a first loudness normalization level and first encoded audio information, wherein the data conveyed by the input signal was produced by an encoding process that generated the first encoded audio information according to psychoacoustic principles;
obtain segments of decoded audio information from an application of a decoding process to the input signal;
identify which of the segments of decoded audio information are predominantly speech;
obtain a respective measure of loudness for each of the segments of audio information from an analysis of the decoded audio information that accounts for presence or absence of speech and derive a second loudness normalization level for each segment from its respective measure of loudness;
generate an output signal that conveys data representing a third loudness normalization level and segments of third encoded audio information representing the segments of decoded audio information in an encoded form, wherein:
if a difference between the first and second loudness normalization levels does not exceed a threshold, the third loudness level represents the first loudness normalization level, and the third encoded audio information represents the first encoded audio information, and
if the difference between the first and second loudness normalization levels exceeds the threshold, the third loudness level is derived from the second loudness normalization level.
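The final two conditions of the claim amount to a threshold test on the two normalization levels. The following is a minimal sketch of that test, not the patented implementation. The names `mean_square_db`, `choose_level` and `Decision` are hypothetical, the mean-square power measure is only a stand-in for whatever loudness analysis is actually used, and taking the third level equal to the second is the simplest possible reading of "derived from".

```python
import math
from dataclasses import dataclass


@dataclass
class Decision:
    level_db: float         # third loudness normalization level
    metadata_changed: bool  # False: first level and first encoded audio pass through


def mean_square_db(samples):
    """Stand-in loudness measure (assumption): mean-square power in dB re full scale."""
    power = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(power, 1e-12))  # floor avoids log10(0) on silence


def choose_level(first_db, second_db, threshold_db):
    """Apply the claim's threshold test to one segment.

    first_db  : level carried in the input signal's metadata
    second_db : level derived from the measured loudness of the decoded segment
    """
    if abs(first_db - second_db) <= threshold_db:
        # Difference does not exceed the threshold: keep the original metadata.
        return Decision(level_db=first_db, metadata_changed=False)
    # Difference exceeds the threshold: derive the output level from the measurement.
    return Decision(level_db=second_db, metadata_changed=True)


if __name__ == "__main__":
    measured = mean_square_db([0.5, -0.5, 0.5, -0.5])
    print(choose_level(first_db=-27.0, second_db=measured, threshold_db=3.0))
```

A difference exactly equal to the threshold keeps the original level, because the claim replaces the metadata only when the difference exceeds the threshold.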
Abstract
A coded signal conveys encoded audio information and metadata that may be used to control the loudness of the audio information during its playback. If the values for these metadata parameters are set incorrectly, annoying fluctuations in loudness during playback can result. The present invention overcomes this problem by detecting incorrect metadata parameter values in the signal and replacing the incorrect values with corrected values.
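To illustrate why an incorrect metadata value produces a loudness jump, the sketch below assumes a simple playback model that the source does not spell out: the decoder applies a gain equal to a fixed target level minus the normalization level carried in the metadata. The function names and the target of -31 dB are assumptions chosen for the example.

```python
def playback_gain_db(normalization_level_db, target_db=-31.0):
    """Gain a decoder would apply under the assumed model (target minus metadata level)."""
    return target_db - normalization_level_db


def playback_loudness_db(true_loudness_db, normalization_level_db, target_db=-31.0):
    """Loudness heard at playback: true program loudness plus the applied gain."""
    return true_loudness_db + playback_gain_db(normalization_level_db, target_db)


if __name__ == "__main__":
    # Two consecutive programs; both carry the same metadata value of -24 dB,
    # but the second program is actually 6 dB louder than its metadata states.
    print(playback_loudness_db(-24.0, -24.0))  # correct metadata
    print(playback_loudness_db(-18.0, -24.0))  # incorrect metadata
    print(playback_loudness_db(-18.0, -18.0))  # after the metadata is corrected
```

Under this model the program with correct metadata plays at the target level, the program with incorrect metadata plays 6 dB above it, and replacing the metadata value with the measured one brings it back to the target.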
55 Citations
12 Claims
1. A method for correcting playback loudness of audio information, wherein the method comprises steps that:
receive an input signal that conveys data representing a first loudness normalization level and first encoded audio information, wherein the data conveyed by the input signal was produced by an encoding process that generated the first encoded audio information according to psychoacoustic principles;
obtain segments of decoded audio information from an application of a decoding process to the input signal;
identify which of the segments of decoded audio information are predominantly speech;
obtain a respective measure of loudness for each of the segments of audio information from an analysis of the decoded audio information that accounts for presence or absence of speech and derive a second loudness normalization level for each segment from its respective measure of loudness;
generate an output signal that conveys data representing a third loudness normalization level and segments of third encoded audio information representing the segments of decoded audio information in an encoded form, wherein:
if a difference between the first and second loudness normalization levels does not exceed a threshold, the third loudness level represents the first loudness normalization level, and the third encoded audio information represents the first encoded audio information, and
if the difference between the first and second loudness normalization levels exceeds the threshold, the third loudness level is derived from the second loudness normalization level.
Dependent claims: 2, 3, 4
5. An apparatus for correcting playback loudness of audio information, wherein the apparatus comprises:
means for receiving an input signal that conveys data representing a first loudness normalization level and first encoded audio information, wherein the data conveyed by the input signal was produced by an encoding process that generated the first encoded audio information according to psychoacoustic principles;
means for obtaining segments of decoded audio information from an application of a decoding process to the input signal;
means for identifying which of the segments of decoded audio information are predominantly speech;
means for obtaining a respective measure of loudness for each of the segments of audio information from an analysis of the decoded audio information that accounts for presence or absence of speech and derive a second loudness normalization level for each segment from its respective measure of loudness;
means for generating an output signal that conveys data representing a third loudness normalization level and segments of third encoded audio information representing the segments of decoded audio information in an encoded form, wherein:
if a difference between the first and second loudness normalization levels does not exceed a threshold, the third loudness level represents the first loudness normalization level, and the third encoded audio information represents the first encoded audio information, and
if the difference between the first and second loudness normalization levels exceeds the threshold, the third loudness level is derived from the second loudness normalization level.
Dependent claims: 6, 7, 8
9. A storage medium recording a program of instructions that is executable by a device to perform a method for correcting playback loudness of audio information, wherein the method comprises steps that:
receive an input signal that conveys data representing a first loudness normalization level and first encoded audio information, wherein the data conveyed by the input signal was produced by an encoding process that generated the first encoded audio information according to psychoacoustic principles;
obtain segments of decoded audio information from an application of a decoding process to the input signal;
identify which of the segments of decoded audio information are predominantly speech;
obtain a respective measure of loudness for each of the segments of audio information from an analysis of the decoded audio information that accounts for presence or absence of speech and derive a second loudness normalization level for each segment from its respective measure of loudness;
generate an output signal that conveys data representing a third loudness normalization level and segments of third encoded audio information representing the segments of decoded audio information in an encoded form, wherein:
if a difference between the first and second loudness normalization levels does not exceed a threshold, the third loudness level represents the first loudness normalization level, and the third encoded audio information represents the first encoded audio information, and
if the difference between the first and second loudness normalization levels exceeds the threshold, the third loudness level is derived from the second loudness normalization level.
Dependent claims: 10, 11, 12
Specification