Adapting masking thresholds for encoding a low frequency transient signal in audio data

US 7,899,677 B2
Filed: 11/24/2009
Issued: 03/01/2011
Est. Priority Date: 04/19/2005
Status: Active Grant



Abstract

An improved audio coding technique encodes audio that contains a low frequency transient signal using a long block, but with a set of adapted masking thresholds. Upon identifying an audio window that contains a low frequency transient signal, masking thresholds for the long block may be calculated as usual. A set of masking thresholds is also calculated for the 8 short blocks corresponding to the long block. The masking thresholds for low frequency critical bands are adapted based on the thresholds calculated for the short blocks, and the resulting adapted masking thresholds are used to encode the long block of audio data. The result is encoded audio with rich harmonic content and negligible coder noise resulting from the low frequency transient signal.
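
The adaptation step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `adapt_long_block_thresholds`, the `band_map` table, the `n_low_bands` cutoff, and the choice of taking the minimum over the short blocks are all assumptions made for the example. The sketch also ignores any energy normalization between short and long transforms.

```python
def adapt_long_block_thresholds(long_thr, short_thr, band_map, n_low_bands):
    """Return long-block masking thresholds with the low bands adapted.

    long_thr    -- one threshold per long-block critical band
    short_thr   -- list of short blocks, each a list of per-band thresholds
    band_map    -- band_map[b] lists the short-block bands that overlap
                   long-block band b (hypothetical mapping table)
    n_low_bands -- how many low-frequency long-block bands to adapt
                   (hypothetical cutoff)
    """
    adapted = list(long_thr)  # high bands keep the usual long-block values
    for b in range(min(n_low_bands, len(long_thr))):
        # Collect every short-block threshold that maps onto long band b.
        candidates = [block[s] for block in short_thr for s in band_map[b]]
        # Assumption: use the strictest (smallest) short-block threshold.
        adapted[b] = min(candidates)
    return adapted


# Toy data: 4 long-block bands, 8 short blocks with 2 bands each.
long_thr = [10.0, 12.0, 9.0, 20.0]
short_thr = [[5.0, 7.0] for _ in range(8)]
short_thr[3] = [1.0, 2.0]  # the short block that contains the transient onset
band_map = [[0], [0], [1], [1]]

print(adapt_long_block_thresholds(long_thr, short_thr, band_map, n_low_bands=2))
# [1.0, 1.0, 9.0, 20.0]
```

Only the two lowest bands pick up the tighter short-block values; the remaining bands keep the thresholds computed for the long block.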


11 Claims

1. A method performed by a decoder comprising:
- receiving and decoding an audio bit stream;
  
  wherein said audio bit stream was produced by an encoder;
  
  wherein said encoder produced said audio bit stream by performing:
  
  in response to determining that a first window of audio data does not contain a low frequency transient signal, computing a first group of masking thresholds for a first long block that corresponds to the first window of audio data; and
  
  based on said first group of masking thresholds, encoding said first long block of audio data; and
  
  in response to identifying a low frequency transient signal in a second window of audio data, computing a second group of masking thresholds for short blocks corresponding to the second window of audio data;
  
  selecting one or more particular masking thresholds, from the second group of masking thresholds, for use in encoding a second long block of audio data that corresponds to the second window of audio data; and
  
  encoding, based on the one or more particular masking thresholds, the second long block of audio data.
- - 2. The method of claim 1, wherein said encoder produced said audio bit stream by further performing:
    - computing a third group of masking thresholds for the second long block that corresponds to the second window of audio data; and
      
      encoding the second long block of audio data using a quantization step that is based on a masking threshold between the one or more particular masking thresholds and a masking threshold from the third group of masking thresholds.
  - 3. The method of claim 1, wherein the one or more particular masking thresholds correspond to one or more low frequency critical bands of the second long block of audio data.
  - 4. The method of claim 1, wherein the one or more particular masking thresholds correspond to a particular short block of the short blocks;
    - wherein each critical band associated with the particular short block corresponds to a particular masking threshold; and
      
      wherein said encoder produced said audio bit stream by further performing:
      
      mapping a critical band associated with the second long block to one or more particular critical bands associated with the particular short block;
      
      wherein selecting the one or more particular masking thresholds for use in encoding the second long block includes selecting one or more particular masking thresholds that correspond to the one or more particular critical bands, which map to the critical band associated with the second long block, that are associated with the particular short block; and
      
      encoding, based on the one or more particular masking thresholds that correspond to the one or more particular critical bands associated with the particular short block, the particular critical band associated with the second long block.
  - 5. The method of claim 1, wherein said encoder produced said audio bit stream by further performing:
    - wherein selecting the one or more particular masking thresholds for use in encoding the second long block includes selecting one or more minimum masking thresholds associated with the second long block, from the group of masking thresholds, for use in encoding the second long block of audio data.
  - 6. The method of claim 1, wherein said encoder produced said audio bit stream by further performing:
    - identifying the low frequency transient signal in the window of audio data.
  - 7. The method of claim 6, wherein a low frequency transient signal is a signal having a frequency that is substantially at or below a threshold frequency value, wherein the threshold frequency value is within a range from 4 kHz to 6 kHz.
  - 8. The method of claim 6, wherein said encoder produced said audio bit stream by further performing:
    - passing the audio data through a low pass filter;
      
      grouping the audio data that passes through the low pass filter into contiguous groups of samples;
      
      determining the maximum amplitude within each group of samples;
      
      comparing the maximum amplitude within a group of samples to a decayed maximum amplitude value within an adjacent previous group of samples; and
      
      if the ratio of the maximum amplitude within the group of samples and the decayed maximum amplitude value within the adjacent previous group of samples exceeds a particular threshold value, then determining that the audio data contains a low frequency transient signal.
  - 9. The method of claim 1, wherein said encoder produced said audio bit stream by further performing:
    - encoding, based on the one or more particular masking thresholds and in compliance with MPEG-4 Advanced Audio Coding standard specifications, the second long block of audio data.
  - 10. The method of claim 1, wherein the group of masking thresholds comprises respective masking thresholds for each critical band of each of the short blocks corresponding to the window of audio data.
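
The detection steps recited in claim 8 can be sketched as below. This is a rough illustration under stated assumptions, not the encoder's actual detector: the one-pole low-pass filter stands in for whatever filter is used, and the function name `has_low_freq_transient` and the parameters `group_size`, `alpha`, `decay`, `ratio_threshold`, and `floor` are hypothetical values chosen for the example. The "decayed maximum" is interpreted here as the previous group's peak multiplied by a decay factor.

```python
def has_low_freq_transient(samples, group_size=64, alpha=0.1,
                           decay=0.9, ratio_threshold=4.0, floor=1e-9):
    """Return True if a low frequency transient is detected in `samples`."""
    # Step 1: low-pass filter (assumed one-pole smoother, y += alpha * (x - y)).
    filtered = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
        filtered.append(y)

    # Steps 2-5: contiguous groups, per-group peak, ratio against decayed peak.
    prev_peak = None
    for start in range(0, len(filtered) - group_size + 1, group_size):
        group = filtered[start:start + group_size]
        peak = max(abs(v) for v in group)
        if prev_peak is not None:
            # The floor avoids division by zero after digital silence.
            decayed = max(prev_peak * decay, floor)
            if peak / decayed > ratio_threshold:
                return True
        prev_peak = peak
    return False


silence_then_burst = [0.0] * 256 + [1.0] * 256
steady_tone_level = [1.0] * 512

print(has_low_freq_transient(silence_then_burst))  # True
print(has_low_freq_transient(steady_tone_level))   # False
```

With this floor, any signal that follows exact silence is flagged, so a practical detector would likely add a minimum-energy check as well.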

11. A method performed by a decoder comprising:
- receiving and decoding an audio bit stream;
  
  wherein said audio bit stream was produced by an encoder;
  
  wherein said encoder produced said audio bit stream by performing:
  
  in response to determining that a first window of audio data does not contain a low frequency transient signal, computing a first group of masking thresholds for a first long block that corresponds to the first window of audio data; and
  
  based on said first group of masking thresholds, encoding said first long block of audio data; and
  
  in response to identifying a low frequency transient signal in a second window of digital audio samples, computing a second group of masking thresholds for a second long block that corresponds to the second window of audio samples;
  
  computing a third group of masking thresholds for short blocks corresponding to the second window of audio samples;
  
  selecting a final masking threshold that is between (a) one or more particular masking thresholds from the third group of masking thresholds and (b) one or more particular masking thresholds from the second group of masking thresholds; and
  
  based on said final masking threshold, encoding by a coder the second long block that corresponds to the window of audio samples.
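
Claim 11 requires only that the final masking threshold lie between a short-block threshold and a long-block threshold; it does not say how the value is chosen. One possible rule, shown here purely as an assumption, is interpolation in the log domain (a weighted geometric mean). The function name `blend_threshold` and the `weight` parameter are hypothetical.

```python
import math


def blend_threshold(short_thr, long_thr, weight=0.5):
    """Return a threshold between short_thr and long_thr.

    weight = 0.0 returns the short-block threshold, weight = 1.0 returns the
    long-block threshold; values in between interpolate on a log scale.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    if short_thr <= 0.0 or long_thr <= 0.0:
        raise ValueError("thresholds must be positive energies")
    log_mix = (1.0 - weight) * math.log(short_thr) + weight * math.log(long_thr)
    return math.exp(log_mix)


final = blend_threshold(short_thr=1.0, long_thr=100.0, weight=0.5)
print(round(final, 6))  # 10.0
```

Interpolating on a log scale treats the thresholds as energies, so the midpoint sits halfway between them in decibels rather than halfway in linear units.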


Current Assignee
Apple Inc.
Original Assignee
Apple Inc.
Inventors
Kuo, Shyh-Shiaw; Baumgarte, Frank
Primary Examiner(s)
Wozniak; James S
Assistant Examiner(s)
He; Jialong

Application Number

US12/624,805
Publication Number

US 20100070287A1
Time in Patent Office

462 Days
Field of Search

704/200.1, 704/219, 704/500-504, 375/240
US Class Current

704/500
CPC Class Codes

G10L 19/025 Detection of transients or ...
