Layered approach to spatial audio coding

US 9,460,729 B2
Filed: 09/10/2013
Issued: 10/04/2016
Est. Priority Date: 09/21/2012
Status: Active Grant

Abstract

The invention provides a layered audio coding format with a monophonic layer and at least one sound field layer. A plurality of audio signals is decomposed, in accordance with decomposition parameters controlling the quantitative properties of an orthogonal energy-compacting transform, into rotated audio signals. Further, a time-variable gain profile specifying constructively how the rotated audio signals may be processed to attenuate undesired audio content is derived. The monophonic layer may comprise one of the rotated signals and the gain profile. The sound field layer may comprise the rotated signals and the decomposition parameters. In one embodiment, the gain profile comprises a cleaning gain profile with the main purpose of eliminating non-speech components and/or noise. The gain profile may also comprise mutually independent broadband gains. Because signals in the audio coding format can be mixed with a limited computational effort, the invention may advantageously be applied in a tele-conferencing application.
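The decomposition into rotated signals can be illustrated for the simplest case of two channels. What follows is a minimal sketch under stated assumptions, not the patented implementation: it assumes that a single rotation angle is the only decomposition parameter, that the angle is estimated from block covariances, and that the function names (`covariance_2ch`, `rotation_angle`, `rotate`, `derotate`) are hypothetical.

```python
import math


def covariance_2ch(a, b):
    """Return (c_aa, c_bb, c_ab), the second moments of two equal-length blocks."""
    n = len(a)
    c_aa = sum(x * x for x in a) / n
    c_bb = sum(y * y for y in b) / n
    c_ab = sum(x * y for x, y in zip(a, b)) / n
    return c_aa, c_bb, c_ab


def rotation_angle(c_aa, c_bb, c_ab):
    """Angle of the principal axis of the 2x2 covariance matrix.

    Rotating by this angle diagonalizes the covariance and places the
    larger eigenvalue (most of the energy) in the first output signal.
    """
    return 0.5 * math.atan2(2.0 * c_ab, c_aa - c_bb)


def rotate(a, b, theta):
    """Orthogonal (energy-preserving) rotation of two signals."""
    c, s = math.cos(theta), math.sin(theta)
    e1 = [c * x + s * y for x, y in zip(a, b)]
    e2 = [-s * x + c * y for x, y in zip(a, b)]
    return e1, e2


def derotate(e1, e2, theta):
    """Inverse rotation: the transpose of the matrix used in rotate()."""
    c, s = math.cos(theta), math.sin(theta)
    a = [c * u - s * v for u, v in zip(e1, e2)]
    b = [s * u + c * v for u, v in zip(e1, e2)]
    return a, b
```

Because the transform is orthogonal, the total energy of the rotated pair equals that of the input pair, and `derotate` recovers the input from the rotated signals and the angle alone. For two fully correlated channels the second rotated signal carries essentially no energy, which is the sense in which the transform is energy-compacting.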

18 Claims

1. An audio encoding system, comprising:
- a spatial analyzer configured to receive a plurality of audio signals, and to output, based thereon, decomposition parameters;
- an adaptive rotation stage configured to receive said plurality of audio signals and to output a plurality of rotated audio signals obtained by an adaptive energy-compacting orthogonal transformation, wherein quantitative properties of the transformation are determined by the decomposition parameters, and wherein the plurality of rotated audio signals and the decomposition parameters are discretely decodable into a first sound field representation; and
- an analysis stage configured to output, based on said plurality of audio signals, a time-variable gain profile comprising at least one frequency-variable component for attenuating non-voice content when applied to at least one of the plurality of rotated audio signals, at least one of de-rotated versions of the plurality of rotated audio signal, or another sound field representation of at least one of the plurality of rotated audio signals, wherein the analysis stage is further adapted to output, based on said plurality of audio signals, spatial parameters adapted for use in spatial synthesis of a first rotated audio signal, wherein the audio encoding system is operable to suspend output of a set of signals selected from the group comprising:
  - said decomposition parameters and all of said plurality of rotated audio signals, except said first rotated audio signal; and
  - said spatial parameters, said decomposition parameters and all of said plurality of rotated audio signals, except said first rotated audio signal.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
  - 2. The audio encoding system of claim 1, wherein the adaptive energy-compacting orthogonal transformation is constrained, so as to limit variations with respect to time.
  - 3. The audio encoding system of claim 1, wherein a first rotated audio signal of said plurality of rotated audio signals is a dominant signal of said plurality of rotated audio signals.
  - 4. The audio encoding system of claim 1, wherein the analysis stage is further adapted to output, based on said plurality of audio signals, the spatial parameters adapted for use in spatial synthesis of a first rotated audio signal, or a modified version of the first rotated audio signal, into a second sound field representation different from the first sound field representation, wherein said first rotated audio signal is comprised in said plurality of rotated audio signals.
  - 5. The audio encoding system of claim 4, wherein a time resolution of said spatial parameters is relatively lower than a time resolution of said decomposition parameters.
  - 6. The audio encoding system of claim 4, wherein the analysis stage is adapted to perform an auditory scene analysis based on said plurality of audio signals, and to output said spatial parameters based on the auditory scene analysis.
  - 7. The audio encoding system of claim 1, wherein the spatial analyzer is adapted to periodically estimate covariances of said plurality of audio signals and, optionally, to perform eigen-analysis based thereon.
  - 8. The audio encoding system of claim 7, wherein the spatial analyzer is adapted to perform temporal smoothing of successive estimated covariance values.
  - 9. The audio encoding system of claim 1, further comprising:
    - a time-invariant pre-conditioning stage configured to output said plurality of audio signals based on one or more input audio signals.
  - 10. The audio encoding system of claim 9, wherein said one or more input audio signals, based on which the audio signals are provided, constitute an equal number of input audio signals obtainable by three angularly distributed directive transducers.
  - 11. The audio encoding system of claim 1, further comprising:
    - a multiplexer configured to multiplex at least the decomposition parameters, the plurality of rotated audio signals and the gain profile into a bitstream.
  - 12. The audio encoding system of claim 1, further comprising:
    - a multiplexer adapted to output a multilayered signal having:
      - a monophonic layer comprising the first rotated audio signal and the gain profile;
      - a first sound field layer comprising the first rotated audio signal, the gain profile and the spatial parameters; and
      - a second sound field layer comprising the plurality of rotated audio signals, the gain profile and the decomposition parameters.
  - 13. The audio encoding system of claim 1, wherein the spatial analyzer is further configured to quantize said decomposition parameters before supplying them to the adaptive rotation stage.
  - 14. The audio encoding system of claim 1, wherein said plurality of rotated audio signals comprises at least three signals.
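Claims 2, 8 and 13 mention limiting time variation of the transform, temporal smoothing of successive covariance estimates, and quantizing the decomposition parameters before they reach the rotation stage. The claims do not say how any of these is done. The sketch below shows one plausible reading, with assumptions labeled: first-order recursive smoothing with a hypothetical coefficient `alpha`, and uniform quantization of a single rotation angle to a hypothetical grid of `levels` steps.

```python
import math


def smooth_covariance(prev, current, alpha=0.9):
    """First-order recursive smoothing of covariance estimates.

    prev and current are equal-length tuples of covariance entries.
    An alpha close to 1 makes the estimate, and hence the transform
    derived from it, vary slowly over time (assumed smoothing rule).
    """
    return tuple(alpha * p + (1.0 - alpha) * c for p, c in zip(prev, current))


def quantize_angle(theta, levels=64):
    """Snap theta to a uniform grid over [-pi/2, pi/2).

    Returns (index, quantized_value). The index is what would be
    transmitted; the quantized value is what the rotation stage uses,
    so that encoder and decoder apply the same rotation.
    """
    step = math.pi / levels
    index = int(round(theta / step))
    # Clamp to the representable index range (assumed grid layout).
    index = max(-levels // 2, min(levels // 2 - 1, index))
    return index, index * step
```

Quantizing before rotating, rather than after, means the rotation applied by the encoder is exactly the one the decoder can reconstruct from the transmitted index.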

15. An audio encoding method comprising:
- determining decomposition parameters on the basis of a plurality of audio signals;
- rotating the plurality of audio signals into a plurality of rotated audio signals using an adaptive energy-compacting orthogonal transform, wherein quantitative properties of the orthogonal transformation are determined by the decomposition parameters;
- determining, based on said plurality of audio signals, a time-variable gain profile comprising at least one frequency-variable component for attenuating non-voice content when applied to at least one of the plurality of rotated audio signals, at least one of de-rotated versions of the plurality of rotated audio signal, or another sound field representation of at least one of the plurality of rotated audio signals;
- outputting the decomposition parameters, the plurality of rotated audio signals, the time-variable gain profile and, based on said plurality of audio signals, spatial parameters adapted for use in spatial synthesis of a first rotated audio signal; and
- suspending output of a set of signals selected from the group comprising:
  - said decomposition parameters and all of said plurality of rotated audio signals, except said first rotated audio signal; and
  - said spatial parameters, said decomposition parameters and all of said plurality of rotated audio signals, except said first rotated audio signal.
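The two suspendable sets in claims 1 and 15 correspond to what remains of the output once the higher layers are dropped. The sketch below is one way to express that selection as a data structure. Everything named here (`EncodedFrame`, `emit`, the `SUSPEND_*` constants, the dictionary keys) is hypothetical, and the claims do not prescribe any particular container or bitstream layout.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# Hypothetical labels for the output modes.
SUSPEND_NONE = "none"
# Suspend decomposition parameters and all rotated signals except the first.
SUSPEND_SOUND_FIELD = "sound_field"
# Additionally suspend the spatial parameters.
SUSPEND_SOUND_FIELD_AND_SPATIAL = "sound_field_and_spatial"


@dataclass
class EncodedFrame:
    rotated: List[List[float]]            # rotated[0] is the first rotated signal
    gain_profile: List[float]
    spatial_params: Dict[str, float]
    decomposition_params: Dict[str, float]


def emit(frame: EncodedFrame, suspend: str = SUSPEND_NONE) -> Dict[str, Any]:
    """Return the fields that are output for one frame under a suspend mode."""
    modes = (SUSPEND_NONE, SUSPEND_SOUND_FIELD, SUSPEND_SOUND_FIELD_AND_SPATIAL)
    if suspend not in modes:
        raise ValueError(f"unknown suspend mode: {suspend!r}")

    # The first rotated signal and the gain profile are always output.
    out: Dict[str, Any] = {
        "rotated": [frame.rotated[0]],
        "gain_profile": frame.gain_profile,
    }
    if suspend == SUSPEND_SOUND_FIELD_AND_SPATIAL:
        return out

    out["spatial_params"] = frame.spatial_params
    if suspend == SUSPEND_SOUND_FIELD:
        return out

    # Nothing suspended: output all rotated signals and the parameters.
    out["rotated"] = frame.rotated
    out["decomposition_params"] = frame.decomposition_params
    return out
```

With `SUSPEND_SOUND_FIELD_AND_SPATIAL` the output holds only the first rotated signal and the gain profile, which matches the contents of the monophonic layer in claim 12. With `SUSPEND_SOUND_FIELD` it matches the first sound field layer.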

16. A sound field audio decoding system for providing a sound field representation of a plurality of audio signals based on a plurality of rotated audio signals, the sound field representation obtainable from said plurality of audio signals using an adaptive energy-compacting orthogonal transformation, a time-variable gain profile comprising at least one frequency-variable component attenuating non-voice content when applied to at least one of the plurality of rotated audio signals, at least one of de-rotated versions of the plurality of rotated audio signal, or another sound field representation of at least one of the plurality of rotated audio signals, and decomposition parameters, the sound field audio decoding system comprising:
- a cleaning stage adapted to receive the time-variable gain profile and the plurality of rotated audio signals and to obtain and output a plurality of modified rotated audio signals by applying the time-variable gain profile to the plurality of rotated audio signals; and
- an adaptive rotation inversion stage configured to discretely decode said plurality of modified rotated audio signals into said sound field representation based on said decomposition parameters.
- Dependent Claims (17, 18)
  - 17. The sound field audio decoding system of claim 16, further comprising a demultiplexer configured to obtain the plurality of rotated audio signals, the time-variable gain profile and the decomposition parameters from one bitstream comprising the decomposition parameters, the time-variable gain profile, spatial parameters adapted for use in spatial synthesis of a first rotated audio signal of the plurality of audio signals, and the plurality of rotated audio signals.
  - 18. The sound field audio decoding system of claim 16, further comprising a downscaling section operable to receive the time-variable gain profile and to supply the cleaning stage with a downscaled gain profile, causing the cleaning stage to output a plurality of modified rotated audio signals being more similar to the plurality of rotated audio signals than is a plurality of modified rotated audio signals obtainable by applying a non-downscaled version of the gain profile to the plurality of rotated audio signals.
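The decoder chain of claims 16 and 18 (downscaling section, cleaning stage, rotation inversion) can be sketched for two rotated signals in one time frame. This is an illustration under several assumptions that the claims do not state: gains are expressed in dB per frequency band, the same gains apply to every rotated signal, downscaling multiplies the dB values by a factor between 0 and 1, and the only decomposition parameter is a rotation angle `theta`. All function names are hypothetical.

```python
import math


def downscale_gains_db(gains_db, factor):
    """Scale dB gains toward 0 dB; factor=1 keeps them, factor=0 disables cleaning."""
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must lie in [0, 1]")
    return [g * factor for g in gains_db]


def clean(rotated, gains_db):
    """Cleaning stage: apply one gain per band to each rotated signal.

    rotated is a list of signals, each a list of per-band values for
    the current frame (assumed representation).
    """
    linear = [10.0 ** (g / 20.0) for g in gains_db]
    return [[x * k for x, k in zip(signal, linear)] for signal in rotated]


def derotate_2ch(e1, e2, theta):
    """Rotation inversion for two signals (transpose of the encoder rotation)."""
    c, s = math.cos(theta), math.sin(theta)
    a = [c * u - s * v for u, v in zip(e1, e2)]
    b = [s * u + c * v for u, v in zip(e1, e2)]
    return a, b


def decode_frame(rotated, gains_db, theta, downscale=1.0):
    """Downscale the gain profile, clean the rotated signals, then de-rotate."""
    modified = clean(rotated, downscale_gains_db(gains_db, downscale))
    return derotate_2ch(modified[0], modified[1], theta)
```

A `downscale` value below 1 leaves the modified rotated signals closer to the received ones than the full gain profile would, which is the behavior claim 18 describes. A receiver that wants less aggressive cleaning can therefore lower the factor without any change on the encoder side.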

Current Assignee
Dolby International AB (Dolby Laboratories Incorporated), Dolby Laboratories Licensing Corporation (Dolby Laboratories Incorporated)
Original Assignee
Dolby International AB (Dolby Laboratories Incorporated), Dolby Laboratories Licensing Corporation (Dolby Laboratories Incorporated)
Inventors
Dickins, Glenn; Purnhagen, Heiko; Samuelsson, Leif Jonas
Primary Examiner(s)
AZAD, ABUL K

Application Number

US14/427,589
Publication Number

US 20150248889A1
Time in Patent Office

1,120 Days
Field of Search

704/200.1, 704/500-504
US Class Current

1/1
CPC Class Codes

G10L 19/008   Multichannel audio signal c...

G10L 19/012   Comfort noise or silence co...

G10L 19/02   using spectral analysis, e....

G10L 19/0208   Subband vocoders

G10L 19/032   Quantisation or dequantisat...

G10L 19/22   Mode decision, i.e. based o...

G10L 19/24   Variable rate codecs, e.g. ...

G10L 21/02   Speech enhancement, e.g. no...

G10L 21/0208   Noise filtering

G10L 21/0216   characterised by the method...

H04M 3/56   Arrangements for connecting...
