Quality improvement techniques in an audio encoder

US 8,805,696 B2
Filed: 10/07/2013
Issued: 08/12/2014
Est. Priority Date: 12/14/2001
Status: Expired due to Fees

Abstract

An audio encoder implements multi-channel coding decision, band truncation, multi-channel rematrixing, and header reduction techniques to improve quality and coding efficiency. In the multi-channel coding decision technique, the audio encoder dynamically selects between joint and independent coding of a multi-channel audio signal via an open-loop decision based upon (a) energy separation between the coding channels, and (b) the disparity between excitation patterns of the separate input channels. In the band truncation technique, the audio encoder performs open-loop band truncation at a cut-off frequency based on a target perceptual quality measure. In the multi-channel rematrixing technique, the audio encoder suppresses certain coefficients of a difference channel by scaling according to a scale factor, which is based on current average levels of perceptual quality, current rate control buffer fullness, coding mode, and the amount of channel separation in the source. In the header reduction technique, the audio encoder selectively modifies the quantization step size of zeroed quantization bands so as to encode in fewer frame header bits.
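
The rematrixing step described in the abstract can be illustrated with a minimal Python sketch. It assumes a two-channel signal, a plain sum/difference transform, and a scale factor supplied by the caller. The abstract does not say how the factor is derived from perceptual quality, buffer fullness, coding mode, and channel separation, nor which coefficients are scaled, so the function names `rematrix` and `inverse_rematrix` and the choice to scale every difference coefficient are assumptions made for illustration only.

```python
def rematrix(left, right, scale):
    """Sum/difference transform with an attenuated difference channel.

    `scale` is assumed to lie in [0, 1]; how an encoder would choose it
    is not specified in the abstract, so it is simply a parameter here.
    Scaling every coefficient is also an assumption of this sketch.
    """
    if not 0.0 <= scale <= 1.0:
        raise ValueError("scale must be between 0 and 1")
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    # Sum (mid) channel: average of the two inputs.
    sum_ch = [(l + r) / 2 for l, r in zip(left, right)]
    # Difference (side) channel, suppressed by the scale factor.
    diff_ch = [scale * (l - r) / 2 for l, r in zip(left, right)]
    return sum_ch, diff_ch


def inverse_rematrix(sum_ch, diff_ch):
    """Rebuild left/right channels from the sum and difference channels."""
    left = [s + d for s, d in zip(sum_ch, diff_ch)]
    right = [s - d for s, d in zip(sum_ch, diff_ch)]
    return left, right
```

With `scale=1.0` the inverse transform returns the original channels. Smaller values pull the two reconstructed channels toward their average, which reduces the information carried by the difference channel at the cost of stereo separation.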

266 Citations

24 Claims

1. A computer system comprising a processing unit and memory, wherein the computer system implements an audio encoder adapted to perform a method comprising:
- receiving audio in multiple channels;
- encoding the audio to produce encoded audio information, including:
  - truncating the audio in a second set of one or more spectral bands higher in frequency than a first set of one or more spectral bands, leaving the audio in the first set of one or more spectral bands;
  - encoding the audio in the first set of one or more spectral bands as quantized spectral information, including:
    - selectively performing a multi-channel transform between the multiple channels for the audio in the first set of one or more spectral bands;
    - performing perceptual weighting for the audio in the first set of one or more spectral bands;
    - performing entropy encoding for the audio in the first set of one or more spectral bands;
  - encoding the audio in the second set of one or more spectral bands as parameters instead of quantized spectral information, wherein the parameters at least in part indicate forms of patterns to be generated during decoding to represent the audio in the second set of one or more spectral bands, the patterns that represent the audio in the second set of one or more spectral bands to be combined with results of decoding the quantized spectral information for the audio in the first set of one or more spectral bands, and wherein the encoding the audio in the second set of one or more spectral bands comprises:
    - when the multiple channels are independently coded, using a different array of noise parameters for each of the multiple independently coded channels, wherein the different array of noise parameters for each of the multiple independently coded channels includes one or more noise parameters, each of the one or more noise parameters indicating a noise parameter value for a frequency band of one or more of the spectral bands in the second set over a time window of the independently coded channel; and
    - when the multiple channels are jointly coded, using an array of noise parameters for the joint coding channel, wherein the array of noise parameters for the joint coding channel includes one or more noise parameters, each of the one or more noise parameters indicating a noise parameter value for a frequency band of one or more of the spectral bands in the second set over a time window of the joint coding channel; and
- outputting the encoded audio information in a bit stream.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
  - 2. The computer system of claim 1 wherein the truncation includes dropping spectral coefficients in the second set of one or more spectral bands after a windowed overlapped frequency transform during the encoding of the audio in the first set of one or more spectral bands.
  - 3. The computer system of claim 1 wherein the encoded audio information includes, for a frame of the audio in multiple channels:
    - information that indicates the second set of one or more spectral bands are encoded as the parameters instead of quantized spectral information.
  - 4. The computer system of claim 3 wherein the parameters and the information that indicates the second set of one or more spectral bands change on a frame-by-frame basis.
  - 5. The computer system of claim 1 wherein the second set of one or more spectral bands are high bands above a threshold and the first set of one or more spectral bands are low bands below the threshold.
  - 6. The computer system of claim 1 wherein the perceptual weighting of the audio in the first set of one or more spectral bands accounts for the truncation of the audio in the second set of one or more spectral bands.
  - 7. The computer system of claim 1 wherein the encoding the audio in the second set of one or more spectral bands further comprises:
    - mapping the second set of one or more spectral bands to positions of the frequency bands for the noise parameters, respectively.
  - 8. The computer system of claim 1 wherein the method further comprises identifying a cutoff frequency between the first set of spectral bands and the second set of spectral bands based on perceptual audio quality for the audio.
  - 9. The computer system of claim 8 wherein the perceptual audio quality is measured in terms of noise to excitation ratio or measured in terms of noise to mask ratio.
  - 10. The computer system of claim 1 wherein the truncating the audio comprises:
    - performing first band truncation on the audio at a first cut-off frequency based on a target audio quality; and
    - performing second band truncation on the audio at a second cut-off frequency based on achieved audio quality after encoding of the audio after the first band truncation.
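
The split recited in claim 1 between quantized low bands and parameterized high bands, with one noise parameter array per independently coded channel or a single array for the joint coding channel, can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented encoder: the noise parameter is taken to be the RMS level of each high band, the joint coding channel is taken to be the average of the input channels, and `band_edges`, `first_high_band`, `split_bands`, and `noise_parameter_arrays` are hypothetical names. The claims do not specify any of these details.

```python
import math


def split_bands(coeffs, band_edges, first_high_band):
    """Truncate the high bands and describe each one by a single value.

    `band_edges` lists coefficient indices delimiting the bands (assumed
    strictly increasing); bands from `first_high_band` upward form the
    "second set". Using RMS as the noise parameter is an assumption.
    """
    cutoff = band_edges[first_high_band]
    low = coeffs[:cutoff]  # left for ordinary quantization and entropy coding
    params = []
    for b in range(first_high_band, len(band_edges) - 1):
        band = coeffs[band_edges[b]:band_edges[b + 1]]
        params.append(math.sqrt(sum(c * c for c in band) / len(band)))
    return low, params


def noise_parameter_arrays(channels, band_edges, first_high_band, joint):
    """Return one parameter array per coded channel for one time window."""
    if joint:
        # Assumption: the joint coding channel is the per-coefficient average.
        n = len(channels)
        sources = [[sum(vals) / n for vals in zip(*channels)]]
    else:
        # Independent coding: a separate array for each channel.
        sources = channels
    return [split_bands(c, band_edges, first_high_band)[1] for c in sources]
```

For a stereo window, the independent case yields two arrays and the joint case yields one, each holding one value per high band. A decoder could then generate a noise-like pattern at that level in each band and combine it with the decoded low bands.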

11. One or more computer-readable media storing instructions for causing a processing unit programmed thereby to perform a method of audio decoding, the one or more computer-readable media being selected from a group consisting of volatile memory, non-volatile memory, magnetic storage media and optical storage media, the method comprising:
- receiving audio in multiple channels;
- encoding the audio to produce encoded audio information, including:
  - truncating the audio in a second set of one or more spectral bands higher in frequency than a first set of one or more spectral bands, leaving the audio in the first set of one or more spectral bands;
  - encoding the audio in the first set of one or more spectral bands as quantized spectral information, including:
    - selectively performing a multi-channel transform between the multiple channels for the audio in the first set of one or more spectral bands;
    - performing perceptual weighting for the audio in the first set of one or more spectral bands;
    - performing entropy encoding for the audio in the first set of one or more spectral bands;
  - encoding the audio in the second set of one or more spectral bands as parameters instead of quantized spectral information, wherein the parameters at least in part indicate forms of patterns to be generated during decoding to represent the audio in the second set of one or more spectral bands, the patterns that represent the audio in the second set of one or more spectral bands to be combined with results of decoding the quantized spectral information for the audio in the first set of one or more spectral bands, and wherein the encoding the audio in the second set of one or more spectral bands comprises:
    - when the multiple channels are independently coded, using a different array of noise parameters for each of the multiple independently coded channels, wherein the different array of noise parameters for each of the multiple independently coded channels includes one or more noise parameters, each of the one or more noise parameters indicating a noise parameter value for a frequency band of one or more of the spectral bands in the second set over a time window of the independently coded channel; and
    - when the multiple channels are jointly coded, using an array of noise parameters for the joint coding channel, wherein the array of noise parameters for the joint coding channel includes one or more noise parameters, each of the one or more noise parameters indicating a noise parameter value for a frequency band of one or more of the spectral bands in the second set over a time window of the joint coding channel; and
- outputting the encoded audio information in a bit stream.
- Dependent claims (12, 13, 14, 15, 16, 17, 18, 19)
  - 12. The one or more computer-readable media of claim 11 wherein the truncation includes dropping spectral coefficients in the second set of one or more spectral bands after a windowed overlapped frequency transform during the encoding of the audio in the first set of one or more spectral bands.
  - 13. The one or more computer-readable media of claim 11 wherein the encoded audio information includes, for a frame of the audio in multiple channels:
    - information that indicates the second set of one or more spectral bands are encoded as the parameters instead of quantized spectral information.
  - 14. The one or more computer-readable media of claim 13 wherein the parameters and the information that indicates the second set of one or more spectral bands change on a frame-by-frame basis.
  - 15. The one or more computer-readable media of claim 11 wherein the second set of one or more spectral bands are high bands above a threshold and the first set of one or more spectral bands are low bands below the threshold.
  - 16. The one or more computer-readable media of claim 11 wherein the perceptual weighting of the audio in the first set of one or more spectral bands accounts for the truncation of the audio in the second set of one or more spectral bands.
  - 17. The one or more computer-readable media of claim 11 wherein the encoding the audio in the second set of one or more spectral bands further comprises:
    - mapping the second set of one or more spectral bands to positions of the frequency bands for the noise parameters, respectively.
  - 18. The one or more computer-readable media of claim 11 wherein the method further comprises identifying a cutoff frequency between the first set of spectral bands and the second set of spectral bands based on perceptual audio quality for the audio.
  - 19. The one or more computer-readable media of claim 11 wherein the truncating the audio comprises:
    - performing first band truncation on the audio at a first cut-off frequency based on a target audio quality; and
    - performing second band truncation on the audio at a second cut-off frequency based on achieved audio quality after encoding of the audio after the first band truncation.
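
The two-pass truncation recited in claims 10 and 19 (a first cut-off chosen from the target quality, then a second cut-off chosen from the quality actually achieved) can be sketched as below. Everything specific here is an assumption for illustration: quality is expressed as a noise to excitation ratio where lower is better, the lookup table values are hypothetical, and the functions `open_loop_cutoff` and `closed_loop_cutoff`, the 1000 Hz step, and the 4000 Hz floor are not taken from the patent.

```python
# Hypothetical table: (maximum target NER, cut-off frequency in Hz),
# sorted from the most demanding quality target to the least.
CUTOFF_TABLE = [(0.05, 20000), (0.10, 16000), (0.20, 12000)]


def open_loop_cutoff(target_ner, table=CUTOFF_TABLE):
    """First pass: choose a cut-off from the target quality alone."""
    for max_ner, cutoff_hz in table:
        if target_ner <= max_ner:
            return cutoff_hz
    # Targets looser than every table entry get the lowest listed cut-off.
    return table[-1][1]


def closed_loop_cutoff(first_cutoff_hz, target_ner, achieved_ner,
                       step_hz=1000, floor_hz=4000):
    """Second pass: adjust the cut-off using the quality achieved after
    encoding with the first cut-off.

    If the achieved NER is worse (higher) than the target, the bandwidth
    is narrowed by one step so that more bits go to the remaining bands.
    """
    if achieved_ner > target_ner:
        return max(floor_hz, first_cutoff_hz - step_hz)
    return first_cutoff_hz
```

For example, a target NER of 0.08 selects 16000 Hz in the first pass with this table. If encoding then achieves an NER of 0.12, the second pass lowers the cut-off to 15000 Hz; if it achieves 0.05, the cut-off is left unchanged.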

20. A computer system comprising a processing unit and memory, wherein the computer system implements an audio encoder adapted to perform a method comprising:
- receiving audio in multiple channels;
- encoding the audio to produce encoded audio information, including:
  - identifying a cutoff frequency between a first set of spectral bands and a second set of spectral bands higher in frequency than the first set of one or more spectral bands;
  - truncating the audio in the second set of one or more spectral bands, leaving the audio in the first set of one or more spectral bands;
  - encoding the audio in the first set of one or more spectral bands as quantized spectral information, including:
    - selectively performing a multi-channel transform between the multiple channels for the audio in the first set of one or more spectral bands;
    - performing perceptual weighting for the audio in the first set of one or more spectral bands;
    - performing entropy encoding for the audio in the first set of one or more spectral bands;
  - encoding the audio in the second set of one or more spectral bands as parameters instead of quantized spectral information, wherein the parameters at least in part indicate forms of patterns to be generated during decoding to represent the audio in the second set of one or more spectral bands, the patterns that represent the audio in the second set of one or more spectral bands to be combined with results of decoding the quantized spectral information for the audio in the first set of one or more spectral bands, and wherein the encoding the audio in the second set of one or more spectral bands comprises:
    - when the multiple channels are independently coded, using a different array of noise parameters for each of the multiple independently coded channels, wherein the different array of noise parameters for each of the multiple independently coded channels includes one or more noise parameters, each of the one or more noise parameters indicating a noise parameter value for a frequency band of one or more of the spectral bands in the second set over a time window of the independently coded channel; and
    - when the multiple channels are jointly coded, using an array of noise parameters for the joint coding channel, wherein the array of noise parameters for the joint coding channel includes one or more noise parameters, each of the one or more noise parameters indicating a noise parameter value for a frequency band of one or more of the spectral bands in the second set over a time window of the joint coding channel; and
- outputting the encoded audio information in a bit stream.
- Dependent claims (21, 22, 23, 24)
  - 21. The computer system of claim 20 wherein the truncation includes dropping spectral coefficients in the second set of one or more spectral bands after a windowed overlapped frequency transform during the encoding of the audio in the first set of one or more spectral bands.
  - 22. The computer system of claim 20 wherein the encoded audio information includes, for a frame of the audio in multiple channels:
    - information that indicates the second set of one or more spectral bands are encoded as the parameters instead of quantized spectral information.
  - 23. The computer system of claim 22 wherein the parameters and the information that indicates the second set of one or more spectral bands change on a frame-by-frame basis.
  - 24. The computer system of claim 20 wherein the encoding the audio in the second set of one or more spectral bands further comprises:
    - mapping the second set of one or more spectral bands to positions of the frequency bands for the noise parameters, respectively.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Chen, Wei-Ge; Thumpudi, Naveen; Lee, Ming-Chieh
Primary Examiner(s): Desir, Pierre-Louis
Assistant Examiner(s): Serrou, Abdelali
Application Number: US14/047,957
Publication Number: US 20140039884A1
Time in Patent Office: 309 Days
Field of Search: 704/500-504, 704/201, 704/229, 704/230, 704/E19.005
US Class Current: 704/500
CPC Class Codes:
G10L 19/002   Dynamic bit allocation for ...
G10L 19/008   Multichannel audio signal c...
G10L 19/02   using spectral analysis, e....
