System and method for adaptive audio signal generation, coding and rendering
Abstract
Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated with it metadata that specifies whether the stream is a channel-based or object-based stream. Channel-based streams have rendering information encoded by means of channel name; and the object-based streams have location information encoded through location expressions encoded in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows for the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) to correspond to the mixer's intent. The object position metadata contains the appropriate allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content.
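The data model outlined in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen only for illustration: the classes `MetadataSet` and `MonoStream`, the condition labels "cinema" and "home", and the normalized room coordinates. The inverse-distance weighting used for object-based streams is a stand-in panning rule, not the rendering method of the described embodiments.

```python
from dataclasses import dataclass, field
from math import dist
from typing import Dict, List, Optional, Tuple

# Assumed convention: allocentric room coordinates, each axis normalized to 0..1.
Position = Tuple[float, float, float]


@dataclass
class MetadataSet:
    """Playback location of one stream under one playback-environment condition."""
    speaker: Optional[str] = None        # set for channel-based audio, e.g. "L"
    position: Optional[Position] = None  # set for object-based audio


@dataclass
class MonoStream:
    name: str
    samples: List[float]
    # Hypothetical condition labels mapped to the metadata set used for them.
    metadata: Dict[str, MetadataSet] = field(default_factory=dict)


def speaker_gains(stream: MonoStream, condition: str,
                  speakers: Dict[str, Position]) -> Dict[str, float]:
    """Return one gain per speaker for a stream under the given condition."""
    meta = stream.metadata[condition]
    if meta.speaker is not None:
        # Channel-based: route only to the designated speaker.
        return {name: 1.0 if name == meta.speaker else 0.0 for name in speakers}
    # Object-based: toy inverse-distance weights, normalized to sum to 1.
    weights = {name: 1.0 / (dist(meta.position, pos) + 1e-6)
               for name, pos in speakers.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def render(streams: List[MonoStream], condition: str,
           speakers: Dict[str, Position]) -> Dict[str, List[float]]:
    """Mix every stream into one feed per speaker of the playback room."""
    length = max(len(s.samples) for s in streams)
    feeds = {name: [0.0] * length for name in speakers}
    for s in streams:
        for name, gain in speaker_gains(s, condition, speakers).items():
            for i, x in enumerate(s.samples):
                feeds[name][i] += gain * x
    return feeds


# Two speakers on the front wall of a hypothetical room.
speakers = {"L": (0.0, 1.0, 0.0), "R": (1.0, 1.0, 0.0)}
dialog = MonoStream("dialog", [1.0, 0.5], {
    "cinema": MetadataSet(speaker="L"),
    "home": MetadataSet(speaker="L"),
})
fly = MonoStream("fly", [1.0, 1.0], {
    "cinema": MetadataSet(position=(0.5, 1.0, 0.0)),  # centered between L and R
    "home": MetadataSet(position=(0.0, 1.0, 0.0)),    # placed at the L speaker
})
cinema_feeds = render([dialog, fly], "cinema", speakers)
home_feeds = render([dialog, fly], "home", speakers)
```

In this sketch the same two streams produce different speaker feeds depending on which metadata set is selected for the playback condition, which is the behavior the abstract attributes to the position metadata. Packaging the streams and metadata into a serial bitstream is left out.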
17 Claims
1. A system for processing audio signals, comprising an authoring component configured to:
receive a plurality of audio signals;
generate an adaptive audio mix comprising a plurality of monophonic audio streams and one or more metadata sets associated with each of the plurality of monophonic audio streams and specifying a playback location of a respective monophonic audio stream, wherein at least some of the plurality of monophonic audio streams are identified as channel-based audio and wherein the others of the plurality of monophonic audio streams are identified as object-based audio, and wherein the playback location of the channel-based audio comprises speaker designations of speakers in a speaker array, and the playback location of the object-based audio comprises a location in three-dimensional space relative to a playback environment containing the speaker array; and
further wherein a first metadata set is applied to one or more of the plurality of monophonic audio streams for a first condition of the playback environment, and a second metadata set is applied to the one or more of the plurality of monophonic audio streams for a second condition of the playback environment; and
encapsulate the plurality of monophonic audio streams and the at least two metadata sets in a bitstream for transmission to a rendering system configured to render the plurality of monophonic audio streams to a plurality of speaker feeds corresponding to speakers in the playback environment in accordance with the at least two metadata sets based on a condition of the playback environment.
Dependent claims: 2, 3
4. A system for processing audio signals, comprising a rendering system configured to:
receive a bitstream encapsulating a plurality of monophonic audio streams and at least two metadata sets in a bitstream from an authoring component configured to receive a plurality of audio signals, and generate a plurality of monophonic audio streams and one or more metadata sets associated with each of the plurality of monophonic audio streams and specifying a playback location of a respective monophonic audio stream, wherein at least some of the plurality of monophonic audio streams are identified as channel-based audio and wherein the others of the plurality of monophonic audio streams are identified as object-based audio, and wherein the playback location of the channel-based audio comprises speaker designations of speakers in a speaker array, and the playback location of the object-based audio comprises a location in three-dimensional space relative to a playback environment containing the speaker array; and
further wherein a first metadata set is applied to one or more of the plurality of monophonic audio streams for a first condition of the playback environment, and a second metadata set is applied to the one or more plurality of monophonic audio streams for a second condition of the playback environment; and
render the plurality of monophonic audio streams to a plurality of speaker feeds corresponding to speakers in the playback environment in accordance with the at least two metadata sets based on a condition of the playback environment.
Dependent claims: 5, 6, 7, 8, 9, 10, 11, 12
13. A method of authoring audio signals for rendering, comprising:
receiving a plurality of audio signals;
generating an adaptive audio mix comprising a plurality of monophonic audio streams and one or more metadata sets associated with each of the plurality of monophonic audio streams and specifying a playback location of a respective monophonic audio stream, wherein at least some of the plurality of monophonic audio streams are identified as channel-based audio and wherein the others of the plurality of monophonic audio streams are identified as object-based audio, and wherein the playback location of the channel-based audio comprises speaker designations of speakers in a speaker array, and the playback location of the object-based audio comprises a location in three-dimensional space relative to a playback environment containing the speaker array; and
further wherein a first metadata set is applied to one or more of the plurality of monophonic audio streams for a first condition of the playback environment, and a second metadata set is applied to the one or more of the plurality of monophonic audio streams for a second condition of the playback environment; and
encapsulating the plurality of monophonic audio streams and the one or more metadata sets in a bitstream for transmission to a rendering system configured to render the plurality of monophonic audio streams to a plurality of speaker feeds corresponding to speakers in the playback environment in accordance with the at least two metadata sets based on a condition of the playback environment.
Dependent claims: 14
15. A method of rendering audio signals, comprising:
receiving a bitstream encapsulating a plurality of monophonic audio streams and at least two metadata sets in a bitstream from an authoring component configured to receive a plurality of audio signals, and generate a plurality of monophonic audio streams and one or more metadata sets associated with each of the plurality of monophonic audio streams and specifying a playback location of a respective monophonic audio stream, wherein at least some of the plurality of monophonic audio streams are identified as channel-based audio and wherein the others of the plurality of monophonic audio streams are identified as object-based audio, and wherein the playback location of the channel-based audio comprises speaker designations of speakers in a speaker array, and the playback location of the object-based audio comprises a location in three-dimensional space relative to a playback environment containing the speaker array; and
further wherein a first metadata set is applied to one or more of the plurality of monophonic audio streams for a first condition of the playback environment, and a second metadata set is applied to the one or more plurality of monophonic audio streams for a second condition of the playback environment; and
rendering the plurality of monophonic audio streams to a plurality of speaker feeds corresponding to speakers in the playback environment in accordance with the at least two metadata sets based on a condition of the playback environment.
Dependent claims: 16, 17
Specification