FLEXIBLE FREQUENCY AND TIME PARTITIONING IN PERCEPTUAL TRANSFORM CODING OF AUDIO

US 20080312759A1
Filed: 06/15/2007
Published: 12/18/2008
Est. Priority Date: 06/15/2007
Status: Active Grant

Abstract

An audio encoder/decoder performs band partitioning for vector quantization encoding of spectral holes and missing high frequencies that result from quantization when encoding at low bit rates. The encoder/decoder determines a band structure for spectral holes based on two threshold parameters: a minimum hole size threshold and a maximum band size threshold. Spectral holes wider than the minimum hole size threshold are partitioned evenly into bands not exceeding the maximum band size threshold in size. Such hole filling bands are configured up to a preset number of hole filling bands. The bands for missing high frequencies are then configured by dividing the high frequency region into bands having binary-increasing, linearly-increasing or arbitrarily-configured band sizes up to a maximum overall number of bands.
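The hole-filling partitioning summarized in the abstract can be illustrated with a minimal Python sketch. All function and parameter names below (`find_holes`, `split_hole`, `hole_filling_bands`, `min_hole_size`, `max_band_size`, `max_hole_bands`) are hypothetical and do not come from the patent. The sketch also makes two assumptions the abstract does not settle: when a hole width is not an exact multiple of the band count, the bands are made as close to equal as possible, and band configuration simply stops once the preset number of hole-filling bands is reached.

```python
import math

def find_holes(quantized, min_hole_size):
    """Return (start, width) for each run of zeros wider than min_hole_size."""
    holes = []
    run_start = None
    # A non-zero sentinel closes a run that reaches the end of the base region.
    for i, q in enumerate(list(quantized) + [1]):
        if q == 0:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            width = i - run_start
            if width > min_hole_size:
                holes.append((run_start, width))
            run_start = None
    return holes

def split_hole(start, width, max_band_size):
    """Split one hole into the fewest near-equal bands no wider than max_band_size."""
    n_bands = math.ceil(width / max_band_size)
    base, extra = divmod(width, n_bands)
    bands = []
    pos = start
    for b in range(n_bands):
        # Assumption: spread any remainder over the first `extra` bands.
        size = base + (1 if b < extra else 0)
        bands.append((pos, size))
        pos += size
    return bands

def hole_filling_bands(quantized, min_hole_size, max_band_size, max_hole_bands):
    """Collect (start, size) bands over all detected holes, up to max_hole_bands."""
    bands = []
    for start, width in find_holes(quantized, min_hole_size):
        for band in split_hole(start, width, max_band_size):
            if len(bands) == max_hole_bands:
                return bands
            bands.append(band)
    return bands
```

For example, a base region with zero runs of width 2, 10 and 6, a minimum hole size of 4 and a maximum band size of 4 yields no bands for the first run, three bands (sizes 4, 3, 3) for the second, and two bands (sizes 3, 3) for the third.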

11 Claims

1. A method of compressively encoding audio, the method comprising:
- applying a frequency transform to blocks of input audio data to produce sets of spectral coefficients;
  
  quantizing the sets of spectral coefficients;
  
  encoding quantized spectral coefficients in a base frequency region of the sets up to an upper bound frequency position in a compressed audio bit stream;
  
  determining a band structure for partitioning spectral holes and an extension region above the upper bound frequency position into bands for vector quantization coding, where the spectral holes are runs of consecutive spectral coefficients in the base frequency region that were quantized to a zero value;
  
  wherein said determining a band structure for partitioning in the case of spectral holes comprises:
  
  detecting any spectral holes in the base frequency region having a width larger than a minimum hole size threshold; and
  
  for a detected spectral hole, determining a number of bands having a band size not exceeding a maximum band size threshold and that evenly divide the detected spectral hole; and
  
  encoding spectral coefficients at the frequency positions of the spectral holes and the extension region using vector quantization coding in the compressed audio bit stream.
  - 2. The method of claim 1 wherein said determining a band structure for partitioning in the case of spectral holes further comprises configuring bands in the band structure in which to partition spectral holes up to a predetermined maximum number of spectral hole filling bands.
  - 3. The method of claim 1 wherein said determining a band structure for partitioning in the case of the extension region comprises:
    - dividing the extension region into a desired number of bands.
  - 4. The method of claim 3 wherein said determining a band structure for partitioning in the case of the extension region further comprises:
    - dividing the extension region into bands having a binary-increasing ratio, linearly-increasing ratio, or arbitrary configuration of band sizes.
  - 5. The method of claim 1 further comprising choosing a band partitioning mode from among a hole filling mode in which the band structure partitions the spectral holes only, an extension mode in which the band structure partitions the extension region only, and a hole filling and extension mode in which the band structure partitions the spectral holes and extension region.
  - 6. The method of claim 5 wherein said choosing the band partitioning mode further comprises choosing from among modes further comprising an overlay mode in which the band structure partitions the spectral holes and extension region, and wherein said determining the band structure when the overlay mode is chosen comprises dividing the spectral holes and extension region into a desired number of bands having a binary-increasing ratio, linearly-increasing ratio, or arbitrary configuration of band sizes.
  - 7. A method of decoding the compressed audio bit stream of claim 1 comprising:
    - decoding the spectral coefficients of the base region from the compressed audio bit stream;
      
      determining the band structure of the spectral holes and extension region;
      
      decoding the spectral coefficients of the spectral holes and extension region;
      
      applying inverse quantization to the spectral coefficients of the base region and inverse vector quantization to the spectral coefficients of the spectral holes and extension region for the determined band structure;
      
      combining the spectral coefficients of the base region, spectral holes and extension region; and
      
      applying an inverse transform to the combined spectral coefficients to produce reconstructed audio.
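Claims 3 and 4 describe dividing the extension region into a desired number of bands whose sizes follow a binary-increasing ratio, a linearly-increasing ratio, or an arbitrary configuration. The following Python sketch shows one way such band sizes could be derived. The names `band_sizes`, `binary_ratios` and `linear_ratios` are hypothetical, and the reading of "binary-increasing" as 1:2:4:8 and "linearly-increasing" as 1:2:3:4, as well as the rounding of band edges to whole coefficient positions, are assumptions rather than details stated in the claims.

```python
def binary_ratios(n_bands):
    """Assumed binary-increasing ratio: 1, 2, 4, 8, ..."""
    return [2 ** k for k in range(n_bands)]

def linear_ratios(n_bands):
    """Assumed linearly-increasing ratio: 1, 2, 3, 4, ..."""
    return list(range(1, n_bands + 1))

def band_sizes(region_width, ratios):
    """Split region_width coefficients into bands proportional to ratios.

    Band edges are computed cumulatively and rounded to the nearest integer,
    so the sizes always sum to region_width. For very narrow regions a band
    may end up with size 0; a real codec would need to handle that case.
    """
    total = sum(ratios)
    sizes = []
    previous_edge = 0
    cumulative = 0
    for r in ratios:
        cumulative += r
        # Integer rounding of region_width * cumulative / total.
        edge = (region_width * cumulative + total // 2) // total
        sizes.append(edge - previous_edge)
        previous_edge = edge
    return sizes
```

With a 60-coefficient extension region and four bands, the binary ratio gives sizes 4, 8, 16, 32 and the linear ratio gives 6, 12, 18, 24. An arbitrary configuration is passed as an explicit ratio list, for example `band_sizes(10, [1, 1, 2])`.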

8. A method of compressively encoding audio, the method comprising:
- applying a frequency transform having a first window size to input audio data to produce first sets of spectral coefficients;
  
  applying a frequency transform having a second window size to the input audio data to produce second sets of spectral coefficients;
  
  quantizing at least a first spectrum region of the first sets of spectral coefficients;
  
  encoding the quantized spectral coefficients in the first spectrum region into a compressed audio bit stream; and
  
  performing vector quantization coding of the second sets of spectral coefficients in a second spectrum region into the compressed audio bit stream.
  - 9. The method of claim 8 further comprising:
    - performing vector quantization coding of the first sets of spectral coefficients in a third spectrum region into the compressed audio bit stream.
  - 10. A method of decoding the compressed audio bit stream encoded by the method of claim 8, the method comprising:
    - decoding the first sets of spectral coefficients from the compressed audio bit stream;
      
      inverse quantizing the first sets of spectral coefficients;
      
      applying an inverse frequency transform having the first window size to the first sets of spectral coefficients to form a first reconstructed audio stream;
      
      performing vector quantization decoding of the second sets of spectral coefficients;
      
      applying an inverse frequency transform having the second window size to the second sets of spectral coefficients to form a second reconstructed audio stream; and
      
      combining the first and second reconstructed audio streams.
  - 11. A method of decoding the compressed audio bit stream encoded by the method of claim 9, the method comprising:
    - decoding the first sets of spectral coefficients from the compressed audio bit stream;
      
      inverse quantizing the first sets of spectral coefficients;
      
      performing vector quantization decoding of the third sets of spectral coefficients;
      
      applying an inverse frequency transform having the first window size to the first and third sets of spectral coefficients to form a first reconstructed audio stream;
      
      performing vector quantization decoding of the second sets of spectral coefficients;
      
      applying an inverse frequency transform having the second window size to the second sets of spectral coefficients to form a second reconstructed audio stream; and
      
      combining the first and second reconstructed audio streams.
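The two-window-size arrangement of claim 8 can be sketched in Python as follows. This is a simplified illustration under several assumptions, not the codec's actual transform: it uses a naive, non-overlapping DCT-IV per block in place of a windowed lapped transform, it places the first spectrum region below a split point and the second above it, and it omits the vector quantization step, returning the second-region coefficients unquantized. The names `dct4`, `transform_blocks`, `split_regions`, `split_fraction` and `step` are hypothetical.

```python
import math

def dct4(block):
    """Naive DCT-IV of one block (O(n^2)), used here as a stand-in transform."""
    n = len(block)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * (k + 0.5))
            for i, x in enumerate(block))
        for k in range(n)
    ]

def transform_blocks(samples, window_size):
    """Transform consecutive non-overlapping blocks of window_size samples."""
    last_start = len(samples) - window_size
    return [dct4(samples[s:s + window_size])
            for s in range(0, last_start + 1, window_size)]

def split_regions(samples, long_size, short_size, split_fraction, step):
    """Return (first-region, second-region) coefficients from two window sizes.

    split_fraction is the region boundary as a fraction of the Nyquist
    frequency; it maps to a different coefficient index for each window size.
    """
    first_sets = transform_blocks(samples, long_size)
    second_sets = transform_blocks(samples, short_size)
    long_split = int(split_fraction * long_size)
    short_split = int(split_fraction * short_size)
    # First region: uniform scalar quantization of the long-window coefficients.
    low = [[round(c / step) for c in coeffs[:long_split]] for coeffs in first_sets]
    # Second region: short-window coefficients that a codec would pass to
    # vector quantization (not modeled in this sketch).
    high = [coeffs[short_split:] for coeffs in second_sets]
    return low, high
```

For 64 input samples with window sizes 32 and 8 and a split at half the Nyquist frequency, the sketch yields 2 long-window sets of 16 quantized coefficients and 8 short-window sets of 4 coefficients each. The long window gives finer frequency resolution in the first region, while the short window gives finer time resolution in the second.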

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Chen, Wei-Ge; Koishida, Kazuhito; Mehrotra, Sanjeev
Granted Patent: US 7,761,290 B2
US Class Current: 700/94
CPC Class Codes:
G10L 19/0208 Subband vocoders
G10L 19/032 Quantisation or dequantisat...
