SCALABLE DOWNMIX DESIGN WITH FEEDBACK FOR OBJECT-BASED SURROUND CODEC
First Claim
1. A method of audio signal processing, the method comprising:
based on spatial information for each of N audio objects, grouping a plurality of audio objects that includes the N audio objects into L clusters, where L is less than N;
mixing the plurality of audio objects into L audio streams; and
based on the spatial information and the grouping, producing metadata that indicates spatial information for each of the L audio streams,
wherein a maximum value for L is based on information received from at least one of a transmission channel, a decoder, and a renderer.
Abstract
In general, techniques are described for grouping audio objects into clusters. In some examples, a device for audio signal processing comprises a cluster analysis module configured to group, based on spatial information for each of N audio objects, a plurality of audio objects that includes the N audio objects into L clusters, where L is less than N, wherein the cluster analysis module is configured to receive information from at least one of a transmission channel, a decoder, and a renderer, and wherein a maximum value for L is based on the information received. The device also comprises a downmix module configured to mix the plurality of audio objects into L audio streams, and a metadata downmix module configured to produce, based on the spatial information and the grouping, metadata that indicates spatial information for each of the L audio streams.
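The three stages named in the abstract (cluster analysis, downmix, metadata downmix) can be illustrated with a minimal sketch. It is not the patented implementation: the source does not name a clustering algorithm, so plain k-means over Cartesian positions is swapped in here as an assumption. The names `AudioObject`, `max_clusters_from_feedback`, and `cluster_and_downmix`, and the rule of 64 kbps per stream, are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Position = Tuple[float, float, float]


@dataclass
class AudioObject:
    position: Position    # assumed Cartesian spatial information
    samples: List[float]  # mono PCM samples, equal length across objects


def max_clusters_from_feedback(bitrate_kbps: float,
                               kbps_per_stream: float = 64.0,
                               decoder_limit: Optional[int] = None) -> int:
    """Hypothetical feedback rule: cap L by channel bitrate and decoder limit."""
    cap = max(1, int(bitrate_kbps // kbps_per_stream))
    if decoder_limit is not None:
        cap = min(cap, decoder_limit)
    return cap


def _nearest(position, centroids):
    # Index of the centroid closest to the given position.
    return min(range(len(centroids)),
               key=lambda k: math.dist(position, centroids[k]))


def _mean(positions):
    # Component-wise average of a non-empty list of positions.
    return tuple(sum(axis) / len(positions) for axis in zip(*positions))


def cluster_and_downmix(objects: List[AudioObject], max_l: int,
                        iterations: int = 10):
    """Group N objects into L < N clusters, mix each cluster, emit metadata."""
    n = len(objects)
    l = min(max_l, n - 1)  # the claims require L to be less than N
    if l < 1:
        raise ValueError("need at least two objects and max_l >= 1")

    # Assumed clustering method: k-means seeded with the first L positions.
    centroids = [obj.position for obj in objects[:l]]
    for _ in range(iterations):
        assignment = [_nearest(obj.position, centroids) for obj in objects]
        for k in range(l):
            members = [o.position for o, a in zip(objects, assignment) if a == k]
            if members:
                centroids[k] = _mean(members)
    assignment = [_nearest(obj.position, centroids) for obj in objects]

    length = len(objects[0].samples)
    streams, metadata = [], []
    for k in range(l):
        members = [o for o, a in zip(objects, assignment) if a == k]
        if members:
            # Downmix: sample-wise sum of the cluster's member objects.
            streams.append([sum(frame)
                            for frame in zip(*(m.samples for m in members))])
            # Metadata downmix: one position per stream (mean of members).
            metadata.append(_mean([m.position for m in members]))
        else:
            streams.append([0.0] * length)  # empty cluster yields silence
            metadata.append(centroids[k])
    return streams, metadata, assignment
```

In this sketch a caller would first derive the cap from feedback, for example `max_clusters_from_feedback(128)`, and pass it as `max_l`. A real encoder would likely weight the downmix and use perceptual rather than purely geometric distances; those choices are outside what the abstract states.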
43 Claims
1. A method of audio signal processing, the method comprising:
based on spatial information for each of N audio objects, grouping a plurality of audio objects that includes the N audio objects into L clusters, where L is less than N;
mixing the plurality of audio objects into L audio streams; and
based on the spatial information and the grouping, producing metadata that indicates spatial information for each of the L audio streams,
wherein a maximum value for L is based on information received from at least one of a transmission channel, a decoder, and a renderer.
15. An apparatus for audio signal processing, the apparatus comprising:
means for receiving information from at least one of a transmission channel, a decoder, and a renderer;
means for grouping, based on spatial information for each of N audio objects, a plurality of audio objects that includes the N audio objects into L clusters, where L is less than N and wherein a maximum value for L is based on the information received;
means for mixing the plurality of audio objects into L audio streams; and
means for producing, based on the spatial information and the grouping, metadata that indicates spatial information for each of the L audio streams.
29. A device for audio signal processing, the device comprising:
a cluster analysis module configured to group, based on spatial information for each of N audio objects, a plurality of audio objects that includes the N audio objects into L clusters, where L is less than N, wherein the cluster analysis module is configured to receive information from at least one of a transmission channel, a decoder, and a renderer, and wherein a maximum value for L is based on the information received;
a downmix module configured to mix the plurality of audio objects into L audio streams; and
a metadata downmix module configured to produce, based on the spatial information and the grouping, metadata that indicates spatial information for each of the L audio streams.
43. A non-transitory computer-readable storage medium having stored thereon instructions that, when executed, cause one or more processors to:
based on spatial information for each of N audio objects, group a plurality of audio objects that includes the N audio objects into L clusters, where L is less than N;
mix the plurality of audio objects into L audio streams; and
based on the spatial information and the grouping, produce metadata that indicates spatial information for each of the L audio streams,
wherein a maximum value for L is based on information received from at least one of a transmission channel, a decoder, and a renderer.
Specification