Optimized virtual scene layout for spatial meeting playback
First Claim
1. A method for processing audio data, the method comprising:
receiving audio data corresponding to a recording of a conference involving a plurality of conference participants, the audio data including at least one of:
(a) audio data from multiple endpoints, the audio data for each of the multiple endpoints having been recorded separately or
(b) audio data from a single endpoint corresponding to multiple conference participants and including spatial information for each conference participant of the multiple conference participants;
analyzing the audio data to determine conversational dynamics data that includes at least one data type selected from a list of data types consisting of:
data indicating the frequency and duration of conference participant speech;
data indicating instances of conference participant doubletalk during which at least two conference participants are speaking simultaneously; and
data indicating instances of conference participant conversations;
applying the conversational dynamics data as one or more variables of a spatial optimization cost function of a vector describing a virtual conference participant position for each of the conference participants in a virtual acoustic space;
applying an optimization technique to the spatial optimization cost function to determine a locally optimal solution; and
assigning the virtual conference participant positions in the virtual acoustic space based, at least in part, on the locally optimal solution.
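As an illustration of the analyzing step, the sketch below derives two of the listed data types (frequency and duration of speech, and doubletalk) from per-participant talk intervals. The input format, the function name `conversational_dynamics`, and the idea that voice-activity segments are already available are all assumptions made for this example; the claim does not specify how the data is computed.

```python
from itertools import combinations


def conversational_dynamics(segments):
    """Summarize talk activity from per-participant speech intervals.

    `segments` is a hypothetical input: a dict mapping each participant
    name to a list of (start, end) times in seconds, assumed to come
    from some earlier voice-activity detection step.
    """
    # Frequency and duration of speech for each participant.
    stats = {}
    for name, spans in segments.items():
        stats[name] = {
            "talk_count": len(spans),
            "talk_time": sum(end - start for start, end in spans),
        }

    # Doubletalk: total time during which two participants overlap.
    doubletalk = {}
    for a, b in combinations(sorted(segments), 2):
        overlap = 0.0
        for s1, e1 in segments[a]:
            for s2, e2 in segments[b]:
                overlap += max(0.0, min(e1, e2) - max(s1, s2))
        doubletalk[(a, b)] = overlap
    return stats, doubletalk


example = {
    "alice": [(0.0, 10.0), (20.0, 30.0)],
    "bob": [(8.0, 12.0)],
    "carol": [(25.0, 40.0)],
}
stats, doubletalk = conversational_dynamics(example)
```

The pairwise doubletalk totals produced this way are one plausible input to a spatial cost function, since pairs that often speak over each other are the ones a listener would most benefit from hearing at distinct positions.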
Abstract
Various disclosed implementations involve processing and/or playback of a recording of a conference involving a plurality of conference participants. Some implementations involve receiving or determining conversational dynamics data. One or more variables of a cost function may be based, at least in part, on the conversational dynamics data. The cost function may be a spatial optimization cost function of a vector describing a virtual conference participant position for each of the conference participants in a virtual acoustic space. The virtual acoustic space may be determined relative to a listener's head. The virtual conference participant positions may be assigned according to a solution of the cost function.
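To make the idea of a spatial optimization cost function concrete, here is a minimal sketch under stated assumptions. Each participant's position is reduced to a single azimuth angle relative to the listener's head, the cost penalizes small angular separation between pairs in proportion to a pairwise weight (for example, doubletalk time), and a simple coordinate-wise hill climb finds a locally optimal layout. The cost form, the function names `layout_cost` and `local_search`, and the choice of optimizer are illustrative only and are not taken from the patent.

```python
import math


def layout_cost(angles, weights, eps=1e-3):
    """Hypothetical cost: sum over pairs of weight / (separation^2 + eps).

    `angles` is the position vector (one azimuth in radians per
    participant); `weights[i][j]` is a non-negative pairwise weight.
    """
    cost = 0.0
    n = len(angles)
    for i in range(n):
        for j in range(i + 1, n):
            sep = angles[i] - angles[j]
            cost += weights[i][j] / (sep * sep + eps)
    return cost


def local_search(angles, weights, lo=-math.pi / 2, hi=math.pi / 2,
                 step=0.2, min_step=1e-4, max_rounds=10_000):
    """Coordinate-wise hill climbing to a locally optimal layout.

    Each round tries moving every angle by +/- step, clamped to the
    frontal arc [lo, hi], and keeps any move that lowers the cost.
    When no move helps, the step is halved until it drops below min_step.
    """
    angles = list(angles)
    best = layout_cost(angles, weights)
    for _ in range(max_rounds):
        if step < min_step:
            break
        improved = False
        for i in range(len(angles)):
            for delta in (step, -step):
                trial = list(angles)
                trial[i] = min(hi, max(lo, angles[i] + delta))
                cost = layout_cost(trial, weights)
                if cost < best:
                    angles, best, improved = trial, cost, True
        if not improved:
            step /= 2
    return angles, best


# Three participants; the first two have the heaviest pairwise weight.
weights = [[0, 10, 1], [10, 0, 1], [1, 1, 0]]
positions, final_cost = local_search([-0.1, 0.0, 0.1], weights)
```

The result depends on the starting vector, which is why it is described as a local rather than global optimum; a gradient-based or other optimizer could be substituted without changing the overall structure.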
44 Citations
20 Claims
1. A method for processing audio data, the method comprising:
receiving audio data corresponding to a recording of a conference involving a plurality of conference participants, the audio data including at least one of:
(a) audio data from multiple endpoints, the audio data for each of the multiple endpoints having been recorded separately or
(b) audio data from a single endpoint corresponding to multiple conference participants and including spatial information for each conference participant of the multiple conference participants;
analyzing the audio data to determine conversational dynamics data that includes at least one data type selected from a list of data types consisting of:
data indicating the frequency and duration of conference participant speech;
data indicating instances of conference participant doubletalk during which at least two conference participants are speaking simultaneously; and
data indicating instances of conference participant conversations;
applying the conversational dynamics data as one or more variables of a spatial optimization cost function of a vector describing a virtual conference participant position for each of the conference participants in a virtual acoustic space;
applying an optimization technique to the spatial optimization cost function to determine a locally optimal solution; and
assigning the virtual conference participant positions in the virtual acoustic space based, at least in part, on the locally optimal solution.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A non-transitory medium having software stored thereon, the software including instructions for processing audio data by controlling at least one device for:
receiving audio data corresponding to a recording of a conference involving a plurality of conference participants, the audio data including at least one of:
(a) audio data from multiple endpoints, the audio data for each of the multiple endpoints having been recorded separately or
(b) audio data from a single endpoint corresponding to multiple conference participants and including spatial information for each conference participant of the multiple conference participants;
analyzing the audio data to determine conversational dynamics data that includes at least one data type selected from a list of data types consisting of:
data indicating the frequency and duration of conference participant speech;
data indicating instances of conference participant doubletalk during which at least two conference participants are speaking simultaneously; and
data indicating instances of conference participant conversations;
applying the conversational dynamics data as one or more variables of a spatial optimization cost function of a vector describing a virtual conference participant position for each of the conference participants in a virtual acoustic space;
applying an optimization technique to the spatial optimization cost function to determine a locally optimal solution; and
assigning the virtual conference participant positions in the virtual acoustic space based, at least in part, on the locally optimal solution.
16. An apparatus, comprising:
an interface system; and
a control system capable of:
receiving, via the interface system, audio data corresponding to a recording of a conference involving a plurality of conference participants, the audio data including at least one of:
(a) audio data from multiple endpoints, the audio data for each of the multiple endpoints having been recorded separately or
(b) audio data from a single endpoint corresponding to multiple conference participants and including spatial information for each conference participant of the multiple conference participants;
analyzing the audio data to determine conversational dynamics data that includes at least one data type selected from a list of data types consisting of:
data indicating the frequency and duration of conference participant speech;
data indicating instances of conference participant doubletalk during which at least two conference participants are speaking simultaneously; and
data indicating instances of conference participant conversations;
applying the conversational dynamics data as one or more variables of a spatial optimization cost function of a vector describing a virtual conference participant position for each of the conference participants in a virtual acoustic space;
applying an optimization technique to the spatial optimization cost function to determine a locally optimal solution; and
assigning the virtual conference participant positions in the virtual acoustic space based, at least in part, on the locally optimal solution.
Dependent claims: 17, 18, 19, 20
Specification