Audio encoder for encoding a multichannel signal and audio decoder for decoding an encoded audio signal

US 10,395,661 B2
Filed: 09/05/2017
Issued: 08/27/2019
Est. Priority Date: 03/09/2015
Status: Active Grant

Abstract

A schematic block diagram of an audio encoder for encoding a multichannel audio signal is shown. The audio encoder includes a linear prediction domain encoder, a frequency domain encoder, and a controller for switching between the linear prediction domain encoder and the frequency domain encoder. The controller is configured such that a portion of the multichannel signal is represented either by an encoded frame of the linear prediction domain encoder or by an encoded frame of the frequency domain encoder. The linear prediction domain encoder includes a downmixer for downmixing the multichannel signal to obtain a downmixed signal. The linear prediction domain encoder further includes a linear prediction domain core encoder for encoding the downmix signal and furthermore, the linear prediction domain encoder includes a first joint multichannel encoder for generating first multichannel information from the multichannel signal.

24 Claims

1. Audio encoder for encoding a multichannel signal, comprising:
- a linear prediction domain encoder;
  
  a frequency domain encoder; and
  
  a controller for switching between the linear prediction domain encoder and the frequency domain encoder,
  
  wherein the linear prediction domain encoder comprises a downmixer for downmixing the multichannel signal to acquire a downmix signal, a linear prediction domain core encoder for encoding the downmix signal and a first joint multichannel encoder for generating first multichannel information from the multichannel signal,
  
  wherein the frequency domain encoder comprises a second joint multichannel encoder for encoding second multichannel information from the multichannel signal, wherein the second joint multichannel encoder is different from the first joint multichannel encoder, and
  
  wherein the controller is configured such that a portion of the multichannel signal is represented either by an encoded frame of the linear prediction domain encoder or by an encoded frame of the frequency domain encoder,
  
  wherein the linear prediction domain encoder comprises an ACELP processor and a TCX processor, wherein the ACELP processor is configured to operate on a downsampled downmix signal and wherein a time domain bandwidth extension processor is configured to parametrically encode a band of a portion of the downmix signal removed from the ACELP input signal by a third downsampling, and wherein the TCX processor is configured to operate on the downmix signal not downsampled or downsampled by a degree smaller than the downsampling for the ACELP processor, the TCX processor comprising a first time-frequency converter, a first parameter generator for generating a parametric representation of a first set of bands and a first quantizer encoder for generating a set of quantized encoder spectral lines for a second set of bands, or
  
  wherein the controller is configured to switch within a current frame of the multichannel signal from using the frequency domain encoder for encoding a previous frame to the linear prediction domain encoder for decoding an upcoming frame, wherein the first joint multichannel encoder is configured to calculate synthetic multichannel parameters from the multichannel signal for the current frame, and wherein the second joint multichannel encoder is configured to weight the multichannel signal using a stop window.
  - 2. Audio encoder of claim 1, wherein the first joint multichannel encoder comprises a first time-frequency converter, wherein the second joint multichannel encoder comprises a second time-frequency converter, and wherein the first and the second time-frequency converters are different from each other.
  - 3. Audio encoder of claim 1, wherein the first joint multichannel encoder is a parametric joint multichannel encoder;
    - or wherein the second joint multichannel encoder is a waveform-preserving joint multichannel encoder.
  - 4. Audio encoder according to claim 3, wherein the parametric joint multichannel encoder comprises a stereo prediction coder, a parametric stereo encoder or a rotation-based parametric stereo encoder, or wherein the waveform-preserving joint multichannel encoder comprises a band-selective switch mid/side or left/right stereo coder.
  - 5. Audio encoder of claim 1, wherein the frequency domain encoder comprises a second time-frequency converter for converting a first channel of the multichannel signal and a second channel of the multichannel signal into a spectral representation, a second parameter generator for generating a parametric representation of a second set of bands and a second quantizer encoder for generating a quantized and encoded representation of a first set of bands.
  - 6. Audio encoder of claim 1, wherein the linear prediction domain encoder comprises an ACELP processor with a time-domain bandwidth extension and a TCX processor with an MDCT operation and an intelligent gap filling functionality, or wherein the frequency domain encoder comprises an MDCT operation for the first channel and the second channel and an AAC operation and an intelligent gap filling functionality, or wherein the first joint multichannel encoder is configured to operate in such a way that multichannel information for a full bandwidth of the multichannel signal is derived.
  - 7. Audio encoder of claim 1, further comprising:
    - a linear prediction domain decoder for decoding the downmix signal to acquire an encoded and decoded downmix signal; and
      
      a multichannel residual coder for calculating and encoding a multichannel residual signal using the encoded and decoded downmix signal representing an error between a decoded multichannel representation using the first multichannel information and the multichannel signal before downmixing.
  - 8. Audio encoder of claim 7, wherein the downmix signal has a low band and a high band, wherein the linear prediction domain encoder is configured to apply a bandwidth extension processing for parametrically encoding the high band, wherein the linear prediction domain decoder is configured to acquire, as the encoded and decoded downmix signal, only a low band signal representing the low band of the downmix signal, and wherein the encoded multichannel residual signal only has frequency within the low band of the multichannel signal before downmixing.
  - 9. Audio encoder of claim 7, wherein the multichannel residual coder comprises:
    - a joint multichannel decoder for generating a decoded multichannel signal using the first multichannel information and the encoded and decoded downmixed signal; and
      
      a difference processor for forming a difference between the decoded multichannel signal and the multichannel signal before downmixing to acquire the multichannel residual signal.
  - 10. Audio encoder of claim 1, wherein the downmixer is configured to convert the multichannel signal into a spectral representation and where the downmixing is performed using the spectral representation or using a time domain representation, and wherein the first joint multichannel encoder is configured to use the spectral representation to generate separate first multichannel information for individual bands of the spectral representation.
  - 11. Audio encoder of claim 1, wherein multichannel means two or more channels.
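
The downmix, parametric upmix and residual steps recited in claims 1, 7 and 9 can be illustrated with a minimal two-channel sketch. Everything below is an assumption made for illustration and is not taken from the patent: the averaging downmix, the single least-squares gain per channel standing in for the "first multichannel information", the function names, and the use of the unquantized downmix in place of an encoded and decoded one.

```python
def downmix(left, right):
    """Mid downmix as the channel average (assumed; the claims fix no rule)."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def estimate_gains(left, right, mid):
    """Hypothetical parametric side info: one least-squares gain per channel."""
    energy = sum(m * m for m in mid)
    if energy == 0.0:
        return 0.0, 0.0
    gain_left = sum(l * m for l, m in zip(left, mid)) / energy
    gain_right = sum(r * m for r, m in zip(right, mid)) / energy
    return gain_left, gain_right


def upmix(mid, gains):
    """Rebuild both channels from the downmix and the two gains."""
    gain_left, gain_right = gains
    return [gain_left * m for m in mid], [gain_right * m for m in mid]


def residual(original, decoded):
    """Difference between the signal before downmixing and its parametric
    reconstruction (the 'difference processor' role in claim 9)."""
    return [o - d for o, d in zip(original, decoded)]


left = [1.0, 2.0, 3.0]
right = [1.0, 0.0, -1.0]
mid = downmix(left, right)
gains = estimate_gains(left, right, mid)
decoded_left, decoded_right = upmix(mid, gains)
res_left = residual(left, decoded_left)
res_right = residual(right, decoded_right)
```

With the residual defined as original minus reconstruction, adding it back to the parametric upmix restores the input channels, which matches the decoder-side addition described in claim 12.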

12. Audio decoder for decoding an encoded audio signal, comprising:
- a linear prediction domain decoder;
  
  a frequency domain decoder;
  
  a first joint multichannel decoder for generating a first multichannel representation using an output of the linear prediction domain decoder and using a first multichannel information;
  
  a second joint multichannel decoder for generating a second multichannel representation using an output of the frequency domain decoder and a second multichannel information; and
  
  a first combiner for combining the first multichannel representation and the second multichannel representation to acquire a decoded audio signal,
  
  wherein the second joint multichannel decoder is different from the first joint multichannel decoder,
  
  wherein the encoded audio signal comprises a multichannel residual signal for the output of the linear prediction domain decoder, wherein the first joint multichannel decoder is configured to use the multichannel residual signal for generating the first multichannel representation, wherein the multichannel residual signal has a lower bandwidth than the first multichannel representation, and wherein the first joint multichannel decoder is configured to reconstruct an intermediate first multichannel representation using the first multichannel information and to add the multichannel residual signal to the intermediate first multichannel representation, or
  
  wherein the audio decoder is configured to switch within a current frame of the encoded audio signal from using the frequency domain decoder for decoding a previous frame to the linear prediction domain decoder for decoding an upcoming frame, wherein the first combiner is configured to calculate a synthetic mid-signal from the second multichannel representation of the current frame, wherein the first joint multichannel decoder is configured to generate the first multichannel representation using the synthetic mid-signal and the first multichannel information, and wherein the first combiner is configured to combine the first multichannel representation and the second multichannel representation to acquire a decoded current frame of the decoded audio signal, or
  
  wherein the audio decoder is configured to switch within a current frame of the encoded audio signal from using the linear prediction domain decoder for decoding a previous frame to the frequency domain decoder for decoding an upcoming frame, wherein the first joint multichannel decoder is configured to calculate a synthetic multichannel audio signal from a decoded mono signal of the linear prediction domain decoder for the current frame using multichannel information of the previous frame, wherein the second joint multichannel decoder is configured to calculate the second multichannel representation for the current frame and to weight the second multichannel representation using a start window, and wherein the first combiner is configured to combine the synthetic multichannel audio signal and the weighted second multichannel representation to acquire a decoded current frame of the decoded audio signal.
  - 13. Audio decoder of claim 12, wherein the first joint multichannel decoder is a parametric joint multichannel decoder and wherein the second joint multichannel decoder is a waveform-preserving joint multichannel decoder, wherein the first joint multichannel decoder is configured to operate based on a complex prediction, a parametric stereo operation, or a rotation operation, and wherein the second joint multichannel decoder is configured to apply a band-selective switch to mid/side or left/right stereo decoding algorithm.
  - 14. Audio decoder of claim 12, wherein the linear prediction domain decoder comprises:
    - an ACELP decoder, a low band synthesizer, an upsampler, a time domain bandwidth extension processor or a second combiner for combining an upsampled signal and a bandwidth-extended signal;
      
      a TCX decoder and an intelligent gap filling (IGF) processor;
      
      a full band synthesis processor for combining an output of the second combiner and a TCX decoder and the IGF processor;
      
      or wherein a cross-path is provided for initializing the low band synthesizer using information derived by a low band spectrum-time conversion from the TCX decoder and the IGF processor.
  - 15. Audio decoder of claim 12, wherein the first joint multichannel decoder comprises a time-frequency converter for converting the output of the linear prediction domain decoder into a spectral representation;
    - an upmixer controlled by the first multichannel information operating on the spectral representation; and
      
      a frequency-time converter for converting an upmix result into a time representation period.
  - 16. Audio decoder of claim 12, wherein the second joint multichannel decoder is configured to use, as an input, a spectral representation acquired by the frequency domain decoder, the spectral representation comprising, at least for a plurality of bands, a first channel signal and a second channel signal, and to apply a joint multichannel operation to the plurality of bands of the first channel signal and the second channel signal and to convert a result of the joint multichannel operation into a time representation to acquire the second multichannel representation.
  - 17. Audio decoder of claim 16, wherein the second multichannel information is a mask indicating, for individual bands, a left/right or mid/side joint multichannel coding, and wherein the joint multichannel operation is a mid/side to left/right converting operation for converting bands indicated by the mask from the mid/side representation to a left/right representation.
  - 18. Audio decoder of claim 12, wherein the encoded audio signal comprises a multichannel residual signal for the output of the linear prediction domain decoder, and wherein the first joint multichannel decoder is configured to use the multichannel residual signal for generating the first multichannel representation.
  - 19. Audio decoder of claim 15, wherein the time-frequency converter comprises a complex operation or an oversampled operation, and wherein the frequency domain decoder comprises an IMDCT operation or a critically-sampled operation.
  - 20. Audio decoder of claim 12, wherein multichannel means two or more channels.
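
The band-selective mid/side to left/right conversion of claim 17 can be sketched as follows. The band layout, the function name `ms_to_lr_bands` and the convention M = (L + R) / 2, S = (L - R) / 2 (so that L = M + S and R = M - S) are assumptions for illustration; the claim only states that a mask marks, per band, which representation was coded.

```python
def ms_to_lr_bands(ch0, ch1, band_edges, ms_mask):
    """Convert only the bands flagged as mid/side back to left/right.

    ch0, ch1   : spectral lines of the first and second channel signal
    band_edges : list of (start, stop) line indices per band (assumed layout)
    ms_mask    : one boolean per band, True if the band holds mid/side data
    """
    out0, out1 = list(ch0), list(ch1)
    for (start, stop), is_ms in zip(band_edges, ms_mask):
        if not is_ms:
            continue  # band was coded as left/right, pass it through
        for k in range(start, stop):
            mid_line, side_line = ch0[k], ch1[k]
            out0[k] = mid_line + side_line  # left  = M + S
            out1[k] = mid_line - side_line  # right = M - S
    return out0, out1


bands = [(0, 2), (2, 4)]
mask = [True, False]
left_out, right_out = ms_to_lr_bands(
    [1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 1.0, 1.0], bands, mask
)
```

Only the first band is converted in this usage example; the second band passes through unchanged because its mask entry marks it as already being in left/right form.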

21. Method of encoding a multichannel signal comprising:
- performing a linear prediction domain encoding;
  
  performing a frequency domain encoding; and
  
  switching between the linear prediction domain encoding and the frequency domain encoding,
  
  wherein the linear prediction domain encoding comprises downmixing the multichannel signal to acquire a downmix signal, a linear prediction domain core encoding the downmix signal and a first joint multichannel encoding generating first multichannel information from the multichannel signal,
  
  wherein the frequency domain encoding comprises a second joint multichannel encoding generating second multichannel information from the multichannel signal, wherein the second joint multichannel encoding is different from the first joint multichannel encoding, and
  
  wherein the switching is performed such that a portion of the multichannel signal is represented either by an encoded frame of the linear prediction domain encoding or by an encoded frame of the frequency domain encoding,
  
  wherein the performing a linear prediction domain encoding comprises ACELP processing and TCX processing, wherein the ACELP processing comprises operating on a downsampled downmix signal and wherein a time domain bandwidth extension processing comprises parametrically encoding a band of a portion of the downmix signal removed from the ACELP input signal by a third downsampling, and wherein the TCX processing comprises operating on the downmix signal not downsampled or downsampled by a degree smaller than the downsampling for the ACELP processing, the TCX processing comprising first time-frequency converting, generating a parametric representation of a first set of bands and generating a set of quantized encoder spectral lines for a second set of bands, or
  
  wherein the switching comprises switching within a current frame of the multichannel signal from using the frequency domain encoding for encoding a previous frame to the linear prediction domain encoding for decoding an upcoming frame, wherein the first joint multichannel encoding comprises calculating synthetic multichannel parameters from the multichannel signal for the current frame, and wherein the second joint multichannel encoding comprises weighting the multichannel signal using a stop window.
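
The frame-wise switching in claim 21 means that each portion of the signal is carried by exactly one of the two coding modes. A minimal sketch of that bookkeeping is given below. The per-frame boolean decision is a hypothetical input (the claim does not say how the switching decision is made), and the names `choose_modes` and `find_transitions` are illustrative only.

```python
def choose_modes(use_lpd_flags):
    """Map a hypothetical per-frame decision to exactly one mode per frame."""
    return ["LPD" if flag else "FD" for flag in use_lpd_flags]


def find_transitions(modes):
    """List (frame index, previous mode, new mode) wherever the mode changes.

    These are the frames where transition handling would be needed, for
    example synthetic multichannel parameters or a stop window when going
    from FD to LPD.
    """
    return [
        (i, modes[i - 1], modes[i])
        for i in range(1, len(modes))
        if modes[i] != modes[i - 1]
    ]


modes = choose_modes([False, False, True, True, False])
switch_points = find_transitions(modes)
```

In this usage example frame 2 is an FD to LPD transition and frame 4 is an LPD to FD transition.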

22. Method of decoding an encoded audio signal, comprising:
- linear prediction domain decoding;
  
  frequency domain decoding;
  
  first joint multichannel decoding generating a first multichannel representation using an output of the linear prediction domain decoding and using a first multichannel information;
  
  second joint multichannel decoding generating a second multichannel representation using an output of the frequency domain decoding and a second multichannel information; and
  
  combining the first multichannel representation and the second multichannel representation to acquire a decoded audio signal,
  
  wherein the second joint multichannel decoding is different from the first joint multichannel decoding,
  
  wherein the encoded audio signal comprises a multichannel residual signal for the output of the linear prediction domain decoding, wherein the first joint multichannel decoding comprises using the multichannel residual signal for generating the first multichannel representation, wherein the multichannel residual signal has a lower bandwidth than the first multichannel representation, and wherein the first joint multichannel decoding comprises reconstructing an intermediate first multichannel representation using the first multichannel information and adding the multichannel residual signal to the intermediate first multichannel representation, or
  
  wherein the decoding the encoded audio signal comprises switching within a current frame of the encoded audio signal from using the frequency domain decoding for decoding a previous frame to the linear prediction domain decoding for decoding an upcoming frame, wherein the combining comprises calculating a synthetic mid-signal from the second multichannel representation of the current frame, wherein the first joint multichannel decoding comprises generating the first multichannel representation using the synthetic mid-signal and the first multichannel information, and wherein the combining comprises combining the first multichannel representation and the second multichannel representation to acquire a decoded current frame of the decoded audio signal, or
  
  wherein the decoding the encoded audio signal comprises switching within a current frame of the encoded audio signal from using the linear prediction domain decoding for decoding a previous frame to the frequency domain decoding for decoding an upcoming frame, wherein the first joint multichannel decoding comprises calculating a synthetic multichannel audio signal from a decoded mono signal of the linear prediction domain decoding for the current frame using multichannel information of the previous frame, wherein the second joint multichannel decoding is configured to calculate the second multichannel representation for the current frame and to weight the second multichannel representation using a start window, and wherein the combining comprises combining the synthetic multichannel audio signal and the weighted second multichannel representation to acquire a decoded current frame of the decoded audio signal.
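
The last alternative of claim 22 (switching from linear prediction domain decoding to frequency domain decoding) combines a synthetic multichannel signal with a start-windowed frequency domain output. A minimal per-channel sketch follows. The linear ramp and the complementary fade-out applied to the synthetic signal are assumptions for illustration; the claim itself only requires that the second multichannel representation is weighted by a start window before combining.

```python
def start_window(length):
    """Linear fade-in from 0 to 1 (assumed shape; the claim names no shape)."""
    if length == 1:
        return [1.0]
    return [i / (length - 1) for i in range(length)]


def combine_on_switch(synthetic, fd_channel):
    """Cross-fade one channel of the transition frame.

    synthetic  : channel of the synthetic multichannel signal derived from
                 the decoded mono signal and the previous frame's parameters
    fd_channel : same channel of the frequency domain decoder output
    The complementary (1 - w) weight on the synthetic part is an assumption.
    """
    window = start_window(len(fd_channel))
    return [
        (1.0 - w) * s + w * f
        for w, s, f in zip(window, synthetic, fd_channel)
    ]


blended = combine_on_switch([1.0] * 5, [3.0] * 5)
```

The blended frame starts at the synthetic signal's value and ends at the frequency domain decoder's value, so the hand-over between the two decoders happens without a step at the frame boundary.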

23. A non-transitory digital storage medium having a computer program stored thereon to perform, when said computer program is run by a computer, the method of encoding a multichannel signal, the method comprising:
- performing a linear prediction domain encoding;
  
  performing a frequency domain encoding; and
  
  switching between the linear prediction domain encoding and the frequency domain encoding,
  
  wherein the linear prediction domain encoding comprises downmixing the multichannel signal to acquire a downmix signal, a linear prediction domain core encoding the downmix signal and a first joint multichannel encoding generating first multichannel information from the multichannel signal,
  
  wherein the frequency domain encoding comprises a second joint multichannel encoding generating second multichannel information from the multichannel signal, wherein the second joint multichannel encoding is different from the first joint multichannel encoding, and
  
  wherein the switching is performed such that a portion of the multichannel signal is represented either by an encoded frame of the linear prediction domain encoding or by an encoded frame of the frequency domain encoding,
  
  wherein the performing a linear prediction domain encoding comprises ACELP processing and TCX processing, wherein the ACELP processing comprises operating on a downsampled downmix signal and wherein a time domain bandwidth extension processing comprises parametrically encoding a band of a portion of the downmix signal removed from the ACELP input signal by a third downsampling, and wherein the TCX processing comprises operating on the downmix signal not downsampled or downsampled by a degree smaller than the downsampling for the ACELP processing, the TCX processing comprising first time-frequency converting, generating a parametric representation of a first set of bands and generating a set of quantized encoder spectral lines for a second set of bands, or
  
  wherein the switching comprises switching within a current frame of the multichannel signal from using the frequency domain encoding for encoding a previous frame to the linear prediction domain encoding for decoding an upcoming frame, wherein the first joint multichannel encoding comprises calculating synthetic multichannel parameters from the multichannel signal for the current frame, and wherein the second joint multichannel encoding comprises weighting the multichannel signal using a stop window.

24. A non-transitory digital storage medium having a computer program stored thereon to perform, when said computer program is run by a computer, the method of decoding an encoded audio signal, the method comprising:
- linear prediction domain decoding;
  
  frequency domain decoding;
  
  first joint multichannel decoding generating a first multichannel representation using an output of the linear prediction domain decoding and using a first multichannel information;
  
  second joint multichannel decoding generating a second multichannel representation using an output of the frequency domain decoding and a second multichannel information; and
  
  combining the first multichannel representation and the second multichannel representation to acquire a decoded audio signal,
  
  wherein the second joint multichannel decoding is different from the first joint multichannel decoding,
  
  wherein the encoded audio signal comprises a multichannel residual signal for the output of the linear prediction domain decoding, wherein the first joint multichannel decoding comprises using the multichannel residual signal for generating the first multichannel representation, wherein the multichannel residual signal has a lower bandwidth than the first multichannel representation, and wherein the first joint multichannel decoding comprises reconstructing an intermediate first multichannel representation using the first multichannel information and adding the multichannel residual signal to the intermediate first multichannel representation, or
  
  wherein the decoding the encoded audio signal comprises switching within a current frame of the encoded audio signal from using the frequency domain decoding for decoding a previous frame to the linear prediction domain decoding for decoding an upcoming frame, wherein the combining comprises calculating a synthetic mid-signal from the second multichannel representation of the current frame, wherein the first joint multichannel decoding comprises generating the first multichannel representation using the synthetic mid-signal and the first multichannel information, and wherein the combining comprises combining the first multichannel representation and the second multichannel representation to acquire a decoded current frame of the decoded audio signal, or
  
  wherein the decoding the encoded audio signal comprises switching within a current frame of the encoded audio signal from using the linear prediction domain decoding for decoding a previous frame to the frequency domain decoding for decoding an upcoming frame, wherein the first joint multichannel decoding comprises calculating a synthetic multichannel audio signal from a decoded mono signal of the linear prediction domain decoding for the current frame using multichannel information of the previous frame, wherein the second joint multichannel decoding is configured to calculate the second multichannel representation for the current frame and to weight the second multichannel representation using a start window, and wherein the combining comprises combining the synthetic multichannel audio signal and the weighted second multichannel representation to acquire a decoded current frame of the decoded audio signal.

Current Assignee
Fraunhofer Gesellschaft Zur Foerderung Der Angewandten Forschung E.V.
Original Assignee
Fraunhofer Gesellschaft Zur Foerderung Der Angewandten Forschung E.V.
Inventors
Disch, Sascha, Fuchs, Guillaume, Ravelli, Emmanuel, Neukam, Christian, Schmidt, Konstantin, Benndorf, Conrad, Niedermeier, Andreas, Schubert, Benjamin, Geiger, Ralf
Primary Examiner(s)
Matar, Ahmad F.
Assistant Examiner(s)
Diaz, Sabrina

Application Number

US15/695,424
Publication Number

US 20170365263A1
Time in Patent Office

721 Days
Field of Search

381 80
US Class Current
CPC Class Codes

G10L 19/008   Multichannel audio signal c...

G10L 19/02   using spectral analysis, e....

G10L 19/032   Quantisation or dequantisat...

G10L 19/04   using predictive techniques

G10L 19/08   Determination or coding of ...

G10L 19/13   Residual excited linear pre...

G10L 19/18   Vocoders using multiple modes

G10L 21/038   using band spreading techni...
