Adapting masking thresholds for encoding a low frequency transient signal in audio data
First Claim
1. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, cause the one or more processors to perform:
in response to determining that a first window of audio data does not contain a low frequency transient signal, computing a first group of masking thresholds for a first long block that corresponds to the first window of audio data; and
based on said first group of masking thresholds, encoding said first long block of audio data;
in response to identifying a low frequency transient signal in a second window of audio data, computing a second group of masking thresholds for short blocks corresponding to the second window of audio data;
selecting one or more particular masking thresholds, from the second group of masking thresholds, for use in encoding a second long block of audio data that corresponds to the second window of audio data; and
encoding, based on the one or more particular masking thresholds, the second long block of audio data.
Abstract
An improved audio coding technique encodes audio that contains a low frequency transient signal using a long block, but with a set of adapted masking thresholds. Upon identifying an audio window that contains a low frequency transient signal, masking thresholds for the long block may be calculated as usual. A set of masking thresholds is also calculated for the 8 short blocks corresponding to the long block. The masking thresholds for low frequency critical bands are then adapted based on the thresholds calculated for the short blocks, and the resulting adapted masking thresholds are used to encode the long block of audio data. The result is encoded audio with rich harmonic content and negligible coder noise resulting from the low frequency transient signal.
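The adaptation step described in the abstract can be illustrated with a minimal sketch. It is not taken from the specification. The function name, the `num_low_bands` parameter, the shared critical-band indexing, and the rule of taking the per-band minimum across the short blocks are all assumptions made for illustration only.

```python
from typing import List, Sequence

NUM_SHORT_BLOCKS = 8  # the abstract refers to 8 short blocks per long block


def adapt_long_block_thresholds(
    long_thresholds: Sequence[float],
    short_thresholds: Sequence[Sequence[float]],
    num_low_bands: int,
) -> List[float]:
    """Return long-block thresholds with the low-frequency bands adapted.

    Assumptions (not from the source): thresholds are linear energies indexed
    on a shared critical-band grid, and the adapted value for a low band is
    the minimum across the short blocks, capped at the long-block value.
    """
    if len(short_thresholds) != NUM_SHORT_BLOCKS:
        raise ValueError("expected one threshold list per short block")
    adapted = list(long_thresholds)
    # Only the lowest bands are adapted; higher bands keep the usual values.
    for band in range(min(num_low_bands, len(adapted))):
        strictest = min(block[band] for block in short_thresholds)
        adapted[band] = min(adapted[band], strictest)
    return adapted


# Hypothetical data: the last short block holds the transient and yields a
# much lower threshold in band 0.
long_block = [10.0, 8.0, 6.0, 4.0]
short_blocks = [[9.0, 9.0, 9.0, 9.0]] * 7 + [[2.0, 12.0, 1.0, 1.0]]
print(adapt_long_block_thresholds(long_block, short_blocks, num_low_bands=2))
```

With these made-up values only band 0 is lowered, because the short-block minimum in band 1 is above the long-block threshold and bands 2 and 3 fall outside the adapted range.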
279 Citations
22 Claims
1. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, cause the one or more processors to perform:
in response to determining that a first window of audio data does not contain a low frequency transient signal, computing a first group of masking thresholds for a first long block that corresponds to the first window of audio data; and
based on said first group of masking thresholds, encoding said first long block of audio data;
in response to identifying a low frequency transient signal in a second window of audio data, computing a second group of masking thresholds for short blocks corresponding to the second window of audio data;
selecting one or more particular masking thresholds, from the second group of masking thresholds, for use in encoding a second long block of audio data that corresponds to the second window of audio data; and
encoding, based on the one or more particular masking thresholds, the second long block of audio data.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
11. A computer-implemented method for determining a masking threshold for use in encoding audio data, the method comprising:
in response to determining that a first window of audio data does not contain a low frequency transient signal, computing a first group of masking thresholds for a first long block that corresponds to the first window of audio data; and
based on said first group of masking thresholds, encoding said first long block of audio data;
in response to identifying a low frequency transient signal in a second window of audio data, computing a second group of masking thresholds for short blocks corresponding to the second window of audio data;
selecting one or more particular masking thresholds, from the second group of masking thresholds, for use in encoding a second long block of audio data that corresponds to the second window of audio data;
encoding, based on the one or more particular masking thresholds, the second long block of audio data;
wherein the computer-implemented method is performed by one or more computing devices.
View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)
21. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, cause the one or more processors to perform:
in response to determining that a first window of audio data does not contain a low frequency transient signal, computing a first group of masking thresholds for a first long block that corresponds to the first window of audio data; and
based on said first group of masking thresholds, encoding said first long block of audio data;
in response to identifying a low frequency transient signal in a second window of digital audio samples, computing a second group of masking thresholds for a second long block that corresponds to the second window of audio samples;
computing a third group of masking thresholds for short blocks corresponding to the second window of audio samples;
selecting a final masking threshold that is between (a) one or more particular masking thresholds from the third group of masking thresholds and (b) one or more particular masking thresholds from the second group of masking thresholds; and
based on said final masking threshold, encoding by a coder the second long block that corresponds to the window of audio samples.
22. A computer-implemented method comprising:
in response to determining that a first window of audio data does not contain a low frequency transient signal, computing a first group of masking thresholds for a first long block that corresponds to the first window of audio data; and
based on said first group of masking thresholds, encoding said first long block of audio data;
in response to identifying a low frequency transient signal in a second window of digital audio samples, computing a second group of masking thresholds for a second long block that corresponds to the second window of audio samples;
computing a third group of masking thresholds for short blocks corresponding to the second window of audio samples;
selecting a final masking threshold that is between (a) one or more particular masking thresholds from the third group of masking thresholds and (b) one or more particular masking thresholds from the second group of masking thresholds; and
based on said final masking threshold, encoding by a coder the second long block that corresponds to the window of audio samples;
wherein the computer-implemented method is performed by one or more computing devices.
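Claims 21 and 22 recite a final masking threshold that lies between a short-block threshold and a long-block threshold, without fixing how the intermediate value is chosen. One possible reading is a per-band linear blend, sketched below. The function name, the `weight` parameter, and the linear rule are hypothetical and are not stated in the claims.

```python
from typing import List, Sequence


def blend_thresholds(
    long_thresholds: Sequence[float],
    short_derived_thresholds: Sequence[float],
    weight: float = 0.5,
) -> List[float]:
    """Return per-band thresholds between the long and short-derived values.

    Assumption (not from the source): a linear blend, where weight=0.0 keeps
    the long-block threshold and weight=1.0 uses the short-derived threshold.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    if len(long_thresholds) != len(short_derived_thresholds):
        raise ValueError("both inputs must cover the same bands")
    return [
        (1.0 - weight) * long_value + weight * short_value
        for long_value, short_value in zip(long_thresholds, short_derived_thresholds)
    ]


# Hypothetical values for two low-frequency bands.
print(blend_thresholds([10.0, 8.0], [2.0, 4.0], weight=0.25))
```

Any weight in the closed interval keeps each result between the two inputs for that band, which is the only property the claim language requires of the final threshold.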
Specification