Controlling loudness of speech in signals that contain speech and other types of audio material
First Claim
1. A method for signal processing that comprises:
receiving an audio signal;
extracting features of the audio signal;
analyzing one or more of the extracted features to perform a speech determination;
classifying segments within an interval of the audio signal as speech segments or non-speech segments based upon the speech determination, wherein each segment has a respective loudness, and the loudness of the speech segments is less than the loudness of one or more loud non-speech segments;
analyzing one or more of the extracted features of the audio signal to obtain an estimated loudness of the speech segments; and
providing an indication of the loudness of the interval of the audio signal by calculating control information from a weighted combination of the estimated loudness of the speech segments and the loudness of the non-speech segments in which the estimated loudness of the speech segments is weighted more heavily.
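The final step of the claim, a weighted combination in which speech loudness counts more heavily, can be illustrated with a minimal sketch. It is not the patented implementation. The `Segment` type, the `interval_loudness` function, the default `speech_weight` of 0.9, and the choice to average per-segment loudness values directly in dB are all assumptions made for illustration; the claim does not specify them.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Segment:
    """Hypothetical container for one classified segment."""
    loudness_db: float  # assumed per-segment loudness estimate, in dB
    is_speech: bool     # outcome of the speech / non-speech classification


def interval_loudness(segments: List[Segment], speech_weight: float = 0.9) -> float:
    """Combine speech and non-speech loudness, weighting speech more heavily.

    Assumption: loudness values are averaged in the dB domain for simplicity.
    """
    if not 0.5 < speech_weight <= 1.0:
        raise ValueError("speech_weight must be in (0.5, 1.0] so speech dominates")
    if not segments:
        raise ValueError("at least one segment is required")

    speech = [s.loudness_db for s in segments if s.is_speech]
    other = [s.loudness_db for s in segments if not s.is_speech]

    # With only one class present there is nothing to weight against.
    if not speech:
        return mean(other)
    if not other:
        return mean(speech)

    return speech_weight * mean(speech) + (1.0 - speech_weight) * mean(other)


# Two quiet speech segments and one loud non-speech segment.
example = [Segment(-24.0, True), Segment(-26.0, True), Segment(-10.0, False)]
indication = interval_loudness(example)
```

With these inputs the indication stays close to the speech level even though the non-speech segment is much louder, which is the behavior the claim's weighting is meant to produce.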
Abstract
Mechanisms are known that allow receivers to control the loudness of speech in broadcast signals, but these mechanisms require that an estimate of speech loudness be inserted into the signal. Disclosed techniques provide improved estimates of loudness. According to one implementation, an indication of the loudness of an audio signal containing speech and other types of audio material is obtained by classifying segments of audio information as either speech or non-speech. The loudness of the speech segments is estimated and this estimate is used to derive the indication of loudness. The indication of loudness may be used to control audio signal levels so that variations in the loudness of speech between different programs are reduced. A preferred method for classifying speech segments is described.
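As a rough illustration of how an indication of loudness might be used to control signal levels, the sketch below derives a gain that moves each program's indicated speech loudness to a common target. The function names, the -24 dB target, and the assumption that a gain of g dB shifts the indicated loudness by g dB are hypothetical and are not taken from the disclosure.

```python
from typing import List


def normalization_gain_db(indicated_loudness_db: float,
                          target_loudness_db: float = -24.0) -> float:
    """Gain in dB that would bring the indicated loudness to the target.

    The default target is a hypothetical value chosen for illustration.
    """
    return target_loudness_db - indicated_loudness_db


def apply_gain(samples: List[float], gain_db: float) -> List[float]:
    """Scale amplitude samples by a gain expressed in dB."""
    scale = 10.0 ** (gain_db / 20.0)  # dB to linear amplitude factor
    return [x * scale for x in samples]


# Two programs whose speech loudness differs by 8 dB.
gain_a = normalization_gain_db(-30.0)  # quiet program is raised
gain_b = normalization_gain_db(-22.0)  # loud program is lowered
```

After each gain is applied, both programs sit at the same indicated speech loudness under this simplified model, so the listener does not need to readjust the volume between them.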
35 Claims
1. A method for signal processing that comprises:
receiving an audio signal;
extracting features of the audio signal;
analyzing one or more of the extracted features to perform a speech determination;
classifying segments within an interval of the audio signal as speech segments or non-speech segments based upon the speech determination, wherein each segment has a respective loudness, and the loudness of the speech segments is less than the loudness of one or more loud non-speech segments;
analyzing one or more of the extracted features of the audio signal to obtain an estimated loudness of the speech segments; and
providing an indication of the loudness of the interval of the audio signal by calculating control information from a weighted combination of the estimated loudness of the speech segments and the loudness of the non-speech segments in which the estimated loudness of the speech segments is weighted more heavily.
Dependent claims: 2-32
33. A method for signal processing that comprises:
receiving an input audio signal;
extracting features of the input audio signal, the extracted features representing an interval of the input audio signal;
analyzing the extracted features to perform a speech determination;
classifying the interval of the audio signal as speech or non-speech based upon the speech determination, wherein each interval has a respective loudness and the loudness of the interval classified as speech is less than the loudness of one or more other segments classified as non-speech;
analyzing the extracted features of the interval classified as speech to obtain an estimated loudness of the interval classified as speech;
calculating a loudness control parameter, the loudness control parameter being proportional to the difference between the estimated loudness of intervals classified as speech; and
adjusting an estimated loudness of intervals classified as non-speech, the adjustment being proportional to the calculated loudness control parameter.
Dependent claims: 34, 35
Specification