Layered approach to spatial audio coding
First Claim
1. An audio encoding system, comprising:
a spatial analyzer configured to receive a plurality of audio signals, and to output, based thereon, decomposition parameters;
an adaptive rotation stage configured to receive said plurality of audio signals and to output a plurality of rotated audio signals obtained by an adaptive energy-compacting orthogonal transformation, wherein quantitative properties of the transformation are determined by the decomposition parameters, and wherein the plurality of rotated audio signals and the decomposition parameters are discretely decodable into a first sound field representation; and
an analysis stage configured to output, based on said plurality of audio signals, a time-variable gain profile comprising at least one frequency-variable component for attenuating non-voice content when applied to at least one of the plurality of rotated audio signals, at least one of de-rotated versions of the plurality of rotated audio signal, or another sound field representation of at least one of the plurality of rotated audio signals, wherein the analysis stage is further adapted to output, based on said plurality of audio signals, spatial parameters adapted for use in spatial synthesis of a first rotated audio signal,
wherein the audio encoding system is operable to suspend output of a set of signals selected from the group comprising;
said decomposition parameters and all of said plurality of rotated audio signals, except said first rotated audio signal; and
said spatial parameters, said decomposition parameters and all of said plurality of rotated audio signals, except said first rotated audio signal.
Abstract
The invention provides a layered audio coding format with a monophonic layer and at least one sound field layer. A plurality of audio signals is decomposed, in accordance with decomposition parameters controlling the quantitative properties of an orthogonal energy-compacting transform, into rotated audio signals. Further, a time-variable gain profile specifying constructively how the rotated audio signals may be processed to attenuate undesired audio content is derived. The monophonic layer may comprise one of the rotated signals and the gain profile. The sound field layer may comprise the rotated signals and the decomposition parameters. In one embodiment, the gain profile comprises a cleaning gain profile with the main purpose of eliminating non-speech components and/or noise. The gain profile may also comprise mutually independent broadband gains. Because signals in the audio coding format can be mixed with a limited computational effort, the invention may advantageously be applied in a tele-conferencing application.
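The energy-compacting rotation described in the abstract can be illustrated with a minimal two-channel sketch. It assumes the transform behaves like a 2x2 Karhunen-Loève rotation, so that a single angle stands in for the decomposition parameters; the source does not say how the parameters are computed, and the function names `rotation_angle`, `rotate`, and `derotate` are hypothetical.

```python
import math

def rotation_angle(left, right):
    """Angle of the 2x2 rotation that decorrelates two channels.

    Assumption: a closed-form two-channel KLT; the angle plays the
    role of the 'decomposition parameters' in this sketch.
    """
    c11 = sum(a * a for a in left)                  # energy of channel 1
    c22 = sum(b * b for b in right)                 # energy of channel 2
    c12 = sum(a * b for a, b in zip(left, right))   # cross-correlation
    return 0.5 * math.atan2(2.0 * c12, c11 - c22)

def rotate(left, right, theta):
    """Apply the orthogonal rotation; e1 collects most of the energy."""
    c, s = math.cos(theta), math.sin(theta)
    e1 = [c * a + s * b for a, b in zip(left, right)]
    e2 = [-s * a + c * b for a, b in zip(left, right)]
    return e1, e2

def derotate(e1, e2, theta):
    """Invert the rotation (the transpose of an orthogonal matrix)."""
    c, s = math.cos(theta), math.sin(theta)
    left = [c * a - s * b for a, b in zip(e1, e2)]
    right = [s * a + c * b for a, b in zip(e1, e2)]
    return left, right

def energy(x):
    return sum(v * v for v in x)

# Two strongly correlated channels, as from closely spaced microphones.
left = [1.0, 2.0, -1.0, 0.5]
right = [0.9, 2.1, -1.1, 0.4]
theta = rotation_angle(left, right)
e1, e2 = rotate(left, right, theta)
```

Because the rotation is orthogonal, total energy is preserved and `derotate` recovers the input exactly up to rounding. For correlated inputs almost all of the energy ends up in `e1`, which is why a single rotated signal can serve as the monophonic layer.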
18 Claims
1. An audio encoding system, comprising:
a spatial analyzer configured to receive a plurality of audio signals, and to output, based thereon, decomposition parameters;
an adaptive rotation stage configured to receive said plurality of audio signals and to output a plurality of rotated audio signals obtained by an adaptive energy-compacting orthogonal transformation, wherein quantitative properties of the transformation are determined by the decomposition parameters, and wherein the plurality of rotated audio signals and the decomposition parameters are discretely decodable into a first sound field representation; and
an analysis stage configured to output, based on said plurality of audio signals, a time-variable gain profile comprising at least one frequency-variable component for attenuating non-voice content when applied to at least one of the plurality of rotated audio signals, at least one of de-rotated versions of the plurality of rotated audio signal, or another sound field representation of at least one of the plurality of rotated audio signals, wherein the analysis stage is further adapted to output, based on said plurality of audio signals, spatial parameters adapted for use in spatial synthesis of a first rotated audio signal,
wherein the audio encoding system is operable to suspend output of a set of signals selected from the group comprising;
said decomposition parameters and all of said plurality of rotated audio signals, except said first rotated audio signal; and
said spatial parameters, said decomposition parameters and all of said plurality of rotated audio signals, except said first rotated audio signal.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
15. An audio encoding method comprising:
determining decomposition parameters on the basis of a plurality of audio signals;
rotating the plurality of audio signals into a plurality of rotated audio signals using an adaptive energy-compacting orthogonal transform, wherein quantitative properties of the orthogonal transformation are determined by the decomposition parameters;
determining, based on said plurality of audio signals, a time-variable gain profile comprising at least one frequency-variable component for attenuating non-voice content when applied to at least one of the plurality of rotated audio signals, at least one of de-rotated versions of the plurality of rotated audio signal, or another sound field representation of at least one of the plurality of rotated audio signals;
outputting the decomposition parameters, the plurality of rotated audio signals, the time-variable gain profile and based on said plurality of audio signals, spatial parameters adapted for use in spatial synthesis of a first rotated audio signal; and
suspending output of a set of signals selected from the group comprising;
said decomposition parameters and all of said plurality of rotated audio signals, except said first rotated audio signal; and
said spatial parameters, said decomposition parameters and all of said plurality of rotated audio signals, except said first rotated audio signal.
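The two suspension alternatives recited in claims 1 and 15 amount to choosing which layers of the format are emitted. The following is a rough sketch of that selection, not the patented implementation: the `Frame` container, the mode names, and the dictionary keys are all hypothetical, and the gain profile is assumed to stay in every mode because the abstract places it in the monophonic layer.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Frame:
    """Hypothetical container for one frame of encoder output."""
    rotated: List[List[float]]    # rotated signals; rotated[0] is the first one
    decomposition: List[float]    # decomposition parameters
    gain_profile: List[float]     # time-variable gain profile for this frame
    spatial: List[float]          # spatial parameters for synthesis from rotated[0]

def select_output(frame: Frame, mode: str) -> Dict[str, object]:
    """Return the signals emitted in a given (hypothetical) output mode.

    'sound_field'  : nothing suspended
    'mono_spatial' : decomposition parameters and all rotated signals
                     except the first are suspended
    'mono'         : spatial parameters are suspended as well
    """
    out: Dict[str, object] = {
        "first_rotated": frame.rotated[0],
        "gain_profile": frame.gain_profile,
    }
    if mode == "sound_field":
        out["other_rotated"] = frame.rotated[1:]
        out["decomposition"] = frame.decomposition
        out["spatial"] = frame.spatial
    elif mode == "mono_spatial":
        out["spatial"] = frame.spatial
    elif mode != "mono":
        raise ValueError(f"unknown mode: {mode}")
    return out

frame = Frame(
    rotated=[[0.9, 1.1], [0.1, -0.1], [0.0, 0.05]],
    decomposition=[0.78, 0.12],
    gain_profile=[1.0, 0.4],
    spatial=[0.3],
)
mono = select_output(frame, "mono")
mono_spatial = select_output(frame, "mono_spatial")
full = select_output(frame, "sound_field")
```

Every mode carries the first rotated signal, so a receiver that only understands the monophonic layer can still decode whatever is sent.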
16. A sound field audio decoding system for providing a sound field representation of a plurality of audio signals based on a plurality of rotated audio signals, the sound field representation obtainable from said plurality of audio signals using an adaptive energy-compacting orthogonal transformation, a time-variable gain profile comprising at least one frequency-variable component attenuating non-voice content when applied to at least one of the plurality of rotated audio signals, at least one of de-rotated versions of the plurality of rotated audio signal, or another sound field representation of at least one of the plurality of rotated audio signals, and decomposition parameters,
the sound field audio decoding system comprising:
a cleaning stage adapted to receive the time-variable gain profile and the plurality of rotated audio signals and to obtain and output a plurality of modified rotated audio signals by applying the time-variable gain profile to the plurality of rotated audio signals; and
an adaptive rotation inversion stage configured to discretely decode said plurality of modified rotated audio signals into said sound field representation based on said decomposition parameters.
(Dependent claims: 17, 18)
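The decoder of claim 16 chains two stages: cleaning, then rotation inversion. A minimal two-channel sketch of that order of operations follows. It assumes that signals are already split into frames and frequency bands, that the gain profile holds one gain per frame and band, and that one rotation angle per frame stands in for the decomposition parameters; these layouts and the function names are illustrative assumptions, not taken from the specification.

```python
import math

def clean(rotated, gains):
    """Cleaning stage: apply the gain profile to every rotated signal.

    rotated[ch][t][band] holds band samples; gains[t][band] holds the
    time- and frequency-variable gain (assumed layout).
    """
    return [
        [[g * x for g, x in zip(gains[t], frame)] for t, frame in enumerate(channel)]
        for channel in rotated
    ]

def derotate(modified, thetas):
    """Rotation inversion stage for two channels, one angle per frame."""
    e1, e2 = modified
    left, right = [], []
    for t, theta in enumerate(thetas):
        c, s = math.cos(theta), math.sin(theta)
        left.append([c * a - s * b for a, b in zip(e1[t], e2[t])])
        right.append([s * a + c * b for a, b in zip(e1[t], e2[t])])
    return left, right

def decode(rotated, gains, thetas):
    """Clean first, then invert the rotation, as in the claimed decoder."""
    return derotate(clean(rotated, gains), thetas)

# One frame, two bands; the second band is attenuated by the gain profile.
rotated = [[[1.0, 2.0]], [[0.0, 0.0]]]
gains = [[1.0, 0.5]]
thetas = [math.pi / 4]
left, right = decode(rotated, gains, thetas)
```

In this sketch the same gains are applied to every rotated channel and the rotation is linear within each band, so cleaning before or after de-rotation gives the same result.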
Specification