Method, system and article of manufacture for processing spatial audio
First Claim
1. A method of processing audio, comprising:
receiving, at a device, audio data corresponding to a scene;
receiving a selection distinguishing one or more enabled regions and one or more disabled regions in the scene;
determining, based on the audio data, spatial information indicative of one or more directions of one or more sound sources in the scene; and
modifying the audio data based on the selection, based on the spatial information, and based on input data identifying one or more spatial characteristics of a playback environment, wherein the modifying includes applying one or more gains based on a masking window function.
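The gain step recited in the claim can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the claims: regions are modeled as azimuth sectors in degrees, the masking window is a raised-cosine taper, and the names `masking_gain`, `transition_deg` and `floor_gain` are hypothetical. The claim itself does not specify a window shape.

```python
import math

def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two azimuths, in degrees (0..180)."""
    d = (a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)

def distance_to_sector(azimuth_deg, start_deg, end_deg):
    """Angular distance from an azimuth to a sector that runs from start to end
    in the direction of increasing azimuth; 0 if the azimuth lies inside it.
    Assumes the sector is narrower than 360 degrees."""
    width = (end_deg - start_deg) % 360.0
    offset = (azimuth_deg - start_deg) % 360.0
    if offset <= width:
        return 0.0
    return min(angular_distance(azimuth_deg, start_deg),
               angular_distance(azimuth_deg, end_deg))

def masking_gain(azimuth_deg, enabled_sectors, transition_deg=20.0, floor_gain=0.0):
    """Hypothetical masking window: unity gain inside any enabled sector,
    a raised-cosine roll-off across transition_deg, and floor_gain beyond that
    (that is, in the disabled regions)."""
    if not enabled_sectors:
        return floor_gain
    dist = min(distance_to_sector(azimuth_deg, s, e) for s, e in enabled_sectors)
    if dist >= transition_deg:
        return floor_gain
    taper = 0.5 * (1.0 + math.cos(math.pi * dist / transition_deg))
    return floor_gain + (1.0 - floor_gain) * taper

def apply_mask(samples, azimuths_deg, enabled_sectors):
    """Scale each sample by the gain for its estimated direction of arrival."""
    return [s * masking_gain(az, enabled_sectors)
            for s, az in zip(samples, azimuths_deg)]
```

With a single enabled sector from -30 to 30 degrees, a source at 0 degrees passes unchanged, a source at 180 degrees is reduced to the floor gain, and a source at 40 degrees falls halfway down the taper. The taper is there only to avoid an abrupt gain step at a region boundary; a different window could be substituted without changing the structure.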
Abstract
Techniques for processing directionally-encoded audio to account for spatial characteristics of a listener playback environment are disclosed. The directionally-encoded audio data includes spatial information indicative of one or more directions of sound sources in an audio scene. The audio data is modified based on input data identifying the spatial characteristics of the playback environment. The spatial characteristics may correspond to actual loudspeaker locations in the playback environment. The directionally-encoded audio may also be processed to permit focusing/defocusing on sound sources or particular directions in an audio scene. The disclosed techniques may allow a recorded audio scene to be more accurately reproduced at playback time, regardless of the output loudspeaker setup. Another advantage is that a user may dynamically configure audio data so that it better conforms to the user's particular loudspeaker layouts and/or the user's desired focus on particular subjects or areas in an audio scene.
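As a rough illustration of adapting a source direction to actual loudspeaker locations, the sketch below distributes a source between the two loudspeakers that bracket its azimuth. The technique shown, pairwise constant-power panning, is a common general-purpose method substituted here for illustration; the abstract does not say which rendering method is used, and the function name `pan_to_speakers` is hypothetical.

```python
import math

def pan_to_speakers(source_az_deg, speaker_az_deg):
    """Return one gain per loudspeaker (in input order) for a source at
    source_az_deg, using constant-power panning between the two loudspeakers
    adjacent to the source. Assumes distinct loudspeaker azimuths."""
    n = len(speaker_az_deg)
    if n == 0:
        raise ValueError("at least one loudspeaker is required")
    if n == 1:
        return [1.0]
    # Visit loudspeakers in order of increasing azimuth around the circle.
    order = sorted(range(n), key=lambda i: speaker_az_deg[i] % 360.0)
    src = source_az_deg % 360.0
    for k in range(n):
        lo = order[k]
        hi = order[(k + 1) % n]
        span = (speaker_az_deg[hi] - speaker_az_deg[lo]) % 360.0
        offset = (src - speaker_az_deg[lo] % 360.0) % 360.0
        if offset <= span:
            frac = offset / span if span > 0.0 else 0.0
            gains = [0.0] * n
            # Sine/cosine law keeps the sum of squared gains equal to 1.
            gains[lo] = math.cos(frac * math.pi / 2.0)
            gains[hi] = math.sin(frac * math.pi / 2.0)
            return gains
    raise ValueError("loudspeaker azimuths must be distinct")
```

For example, `pan_to_speakers(0.0, [30.0, -30.0])` gives equal gains to both loudspeakers, while a source at exactly 30 degrees in a four-loudspeaker layout is routed entirely to the loudspeaker at 30 degrees. Because the gains depend only on the supplied azimuths, the same source direction can be re-rendered for a different layout by passing a different list.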
30 Claims
1. A method of processing audio, comprising:
receiving, at a device, audio data corresponding to a scene;
receiving a selection distinguishing one or more enabled regions and one or more disabled regions in the scene;
determining, based on the audio data, spatial information indicative of one or more directions of one or more sound sources in the scene; and
modifying the audio data based on the selection, based on the spatial information, and based on input data identifying one or more spatial characteristics of a playback environment, wherein the modifying includes applying one or more gains based on a masking window function.
Dependent claims: 2-11
12. An apparatus, comprising:
an interface configured to receive audio data corresponding to a scene; and
a processor configured to:
determine, based on the audio data, spatial information indicative of one or more directions of one or more sound sources in the scene; and
modify the audio data based on a selection distinguishing one or more enabled regions and one or more disabled regions in the scene, based on the spatial information, and based on input data identifying one or more spatial characteristics of a playback environment, wherein the processor is configured to modify the audio data at least in part by applying one or more gains based on a masking window function.
Dependent claims: 13-20
21. An apparatus, comprising:
means for receiving audio data corresponding to a scene;
means for receiving a selection distinguishing one or more enabled regions and one or more disabled regions in the scene;
means for determining, based on the audio data, spatial information indicative of one or more directions of one or more sound sources in the scene; and
means for modifying the audio data based on the selection, based on the spatial information, and based on input data identifying one or more spatial characteristics of a playback environment, wherein the means for modifying is configured to modify the audio data at least in part by applying one or more gains based on a masking window function.
Dependent claims: 22-29
30. A non-transient computer-readable medium embodying a set of instructions executable by one or more processors, comprising:
code for receiving audio data corresponding to a scene;
code for receiving a selection distinguishing one or more enabled regions and one or more disabled regions in the scene;
code for determining, based on the audio data, spatial information indicative of one or more directions of one or more sound sources in the scene; and
code for modifying the audio data based on the selection, based on the spatial information, and based on input data identifying one or more spatial characteristics of a playback environment, wherein the modifying includes applying one or more gains based on a masking window function.
Specification