Schemes for emphasizing talkers in a 2D or 3D conference scene
First Claim
1. A conference controller configured to place a plurality of upstream audio signals associated with a plurality of conference participants within a 2D or 3D conference scene to be rendered to a listener, wherein the conference controller is configured to
set up an X-point conference scene with X different spatial talker locations within the conference scene, X being an integer greater than one;
assign each audio signal of the plurality of upstream audio signals to a different one of the X different spatial talker locations;
provide downstream audio signals and metadata to terminals corresponding to the conference participants, the metadata indicating where a terminal is to render audio signals for each of the X different spatial talker locations;
determine a degree of activity of the plurality of upstream audio signals at a time instant;
determine a dominant one of the plurality of upstream audio signals at the time instant based on the degrees of activity of the plurality of upstream audio signals at the time instant;
assign the dominant upstream audio signal to a first of the X talker locations; and
emphasize the dominant upstream audio signal at the time instant by changing the metadata indicating where a terminal is to render audio signals for a relative position of the talker location for the dominant upstream audio signal relative to other talker locations such that the updated talker location at which the dominant upstream audio signal will be rendered is the updated talker location closest to a midline in front of a head of the listener, wherein the conference controller is implemented via at least one of firmware or hardware.
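The dominance and emphasis steps of this claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a 2D scene in which the metadata is simply a mapping from a signal id to an azimuth in degrees (0 being the midline in front of the listener's head), that the degree of activity is a single number per signal, and that emphasis is done by swapping locations. The function names and the sample values are hypothetical.

```python
def dominant_signal(activity: dict[str, float]) -> str:
    """Return the id of the upstream signal with the highest degree of activity."""
    return max(activity, key=activity.get)


def emphasize(metadata: dict[str, float], dominant: str) -> dict[str, float]:
    """Return updated metadata in which the dominant signal takes the
    talker location closest to the midline (azimuth 0 degrees)."""
    updated = dict(metadata)
    # Signal currently rendered closest to the midline in front of the listener.
    closest = min(updated, key=lambda sig: abs(updated[sig]))
    # Assumption: a plain swap; the claim only fixes where the dominant signal ends up.
    updated[dominant], updated[closest] = updated[closest], updated[dominant]
    return updated


# Hypothetical scene: signal id -> azimuth in degrees.
metadata = {"alice": -20.0, "bob": 5.0, "carol": 30.0}
# Hypothetical degrees of activity at one time instant.
activity = {"alice": 0.1, "bob": 0.2, "carol": 0.9}

dominant = dominant_signal(activity)
updated = emphasize(metadata, dominant)
```

In this sketch only the metadata changes; the audio signals themselves are untouched, which matches the claim's wording that emphasis is achieved by changing where a terminal is to render each signal.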
Abstract
The present document relates to methods and systems for setting up and managing two-dimensional or three-dimensional scenes for audio conferences. A conference controller (111, 175) configured to place a plurality of upstream audio signals (123, 173) associated with a plurality of conference participants within a 2D or 3D conference scene to be rendered to a listener (211) is described. The conference controller (111, 175) is configured to set up an X-point conference scene with X different spatial talker locations (212) within the conference scene; assign the plurality of upstream audio signals (123, 173) to respective ones of the talker locations (212); determine a degree of activity of the plurality of upstream audio signals (123, 173); determine a dominant one of the plurality of upstream audio signals (123, 173); and emphasize the dominant upstream audio signal (123, 173).
41 Citations
21 Claims
1. A conference controller configured to place a plurality of upstream audio signals associated with a plurality of conference participants within a 2D or 3D conference scene to be rendered to a listener, wherein the conference controller is configured to
set up an X-point conference scene with X different spatial talker locations within the conference scene, X being an integer greater than one;
assign each audio signal of the plurality of upstream audio signals to a different one of the X different spatial talker locations;
provide downstream audio signals and metadata to terminals corresponding to the conference participants, the metadata indicating where a terminal is to render audio signals for each of the X different spatial talker locations;
determine a degree of activity of the plurality of upstream audio signals at a time instant;
determine a dominant one of the plurality of upstream audio signals at the time instant based on the degrees of activity of the plurality of upstream audio signals at the time instant;
assign the dominant upstream audio signal to a first of the X talker locations; and
emphasize the dominant upstream audio signal at the time instant by changing the metadata indicating where a terminal is to render audio signals for a relative position of the talker location for the dominant upstream audio signal relative to other talker locations such that the updated talker location at which the dominant upstream audio signal will be rendered is the updated talker location closest to a midline in front of a head of the listener, wherein the conference controller is implemented via at least one of firmware or hardware.
a generatrix of the cone and the midline form an angle which is smaller than or equal to a pre-determined maximum cone angle; and the conference controller is configured to rotate the conference scene such that all updated talker locations are positioned within the cone around the midline.
9. The conference controller of claim 1, wherein the conference controller is configured to reduce an angular distance between adjacent talker locations, in order to determine the updated talker locations.
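One way to picture the reduction of angular distance in claim 9, together with the cone constraint mentioned in the preceding claim fragment, is a uniform scaling of all azimuths. The sketch below is only an illustration under stated assumptions: talker locations are modelled as azimuths in degrees relative to the midline, the scaling is uniform (the claims do not prescribe how the distances are reduced), and the function name `compress_scene` is hypothetical.

```python
def compress_scene(azimuths_deg: list[float], max_cone_angle_deg: float) -> list[float]:
    """Scale talker azimuths so that every location lies within the cone
    of half-angle max_cone_angle_deg around the midline (0 degrees)."""
    if not azimuths_deg:
        return []
    widest = max(abs(a) for a in azimuths_deg)
    if widest <= max_cone_angle_deg:
        # Scene already fits inside the cone; leave it unchanged.
        return list(azimuths_deg)
    # Assumption: uniform scaling, which shrinks the angular distance
    # between every pair of adjacent locations by the same factor.
    scale = max_cone_angle_deg / widest
    return [a * scale for a in azimuths_deg]


# Hypothetical scene that is too wide for a 20 degree cone.
compressed = compress_scene([-40.0, 0.0, 20.0], 20.0)
```

Uniform scaling keeps the left-to-right order of the talkers, so a listener's sense of who sits where is preserved while the scene narrows.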
10. The conference controller of claim 1, wherein the conference controller is configured to
determine a different new dominant one of the plurality of upstream audio signals at a second time instant after the time instant;
de-emphasize the former dominant upstream audio signal at the second time instant; and
emphasize the new dominant upstream audio signal at the second time instant.
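The hand-over of emphasis in claim 10 could be sketched as a small stateful tracker. This is a hedged illustration, not the claimed controller: it assumes azimuth-only metadata (signal id to degrees, 0 being the midline), models de-emphasis as restoring the base layout, and models emphasis as a swap with the signal at the most central location. The class name `EmphasisTracker` and its methods are hypothetical.

```python
class EmphasisTracker:
    """Tracks which upstream signal is currently emphasized."""

    def __init__(self, base_metadata: dict[str, float]):
        self.base = dict(base_metadata)      # layout with nobody emphasized
        self.current = dict(base_metadata)   # layout sent to the terminals
        self.dominant = None                 # id of the emphasized signal

    def update(self, activity: dict[str, float]) -> dict[str, float]:
        """Process one time instant and return the metadata to send."""
        new_dominant = max(activity, key=activity.get)
        if new_dominant != self.dominant:
            # De-emphasize the former dominant signal (assumption: restore base layout).
            self.current = dict(self.base)
            # Emphasize the new dominant signal by swapping it into the
            # location closest to the midline.
            center = min(self.base, key=lambda sig: abs(self.base[sig]))
            self.current[new_dominant], self.current[center] = (
                self.base[center],
                self.base[new_dominant],
            )
            self.dominant = new_dominant
        return dict(self.current)


tracker = EmphasisTracker({"alice": -20.0, "bob": 0.0, "carol": 20.0})
first = tracker.update({"alice": 0.9, "bob": 0.1, "carol": 0.2})
second = tracker.update({"alice": 0.1, "bob": 0.1, "carol": 0.8})
```

A practical controller would likely smooth the activity values over time before switching, to avoid the scene flipping on every short interjection; that refinement is outside what this sketch shows.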
11. The conference controller of claim 1, wherein the conference controller is configured to classify the X spatial talker locations into a plurality of clusters;
wherein a first of the plurality of clusters comprises at least two spatial talker locations;
wherein the spatial talker locations comprised within the first cluster are directly adjacent.
12. The conference controller of claim 11, wherein the conference controller is configured to classify the X spatial talker locations into the plurality of clusters dependent upon classification metadata.
13. The conference controller of claim 12, wherein the classification metadata comprises at least one of:
an identifier associated with an electronic means of communication of a conference participant; and
an identifier associated with a physical location of a conference participant.
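The clustering described in claims 11 to 13 can be sketched as grouping participants by a classification key and handing each group a run of neighbouring talker locations. The sketch assumes that the classification metadata is a single string per participant (for example a site name standing in for a physical-location identifier) and that locations are azimuths in degrees; the function name `assign_clustered` and the sample data are hypothetical.

```python
def assign_clustered(participants: dict[str, str], locations_deg: list[float]) -> dict[str, float]:
    """Assign talker locations so that participants sharing a classification
    key occupy directly adjacent locations.

    participants maps participant id -> classification key.
    """
    if len(participants) > len(locations_deg):
        raise ValueError("need at least one talker location per participant")
    # Group participant ids by classification key, keeping first-seen order.
    clusters: dict[str, list[str]] = {}
    for pid, key in participants.items():
        clusters.setdefault(key, []).append(pid)
    # Flatten cluster by cluster, so members of one cluster are consecutive.
    ordered = [pid for members in clusters.values() for pid in members]
    # Consecutive participants receive neighbouring locations, left to right.
    return dict(zip(ordered, sorted(locations_deg)))


# Hypothetical classification metadata: participant id -> site identifier.
placement = assign_clustered(
    {"alice": "london", "bob": "sydney", "carol": "london"},
    [20.0, -20.0, 0.0],
)
```

Here "alice" and "carol" share a key and therefore end up at neighbouring locations, while "bob" is placed after them.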
14. The conference controller of claim 12, wherein the conference controller is configured to extract the classification metadata from one or more of the plurality of upstream audio signals.
15. The conference controller of claim 12, wherein the conference controller is configured to facilitate input of the classification metadata by a conference participant.
16. The conference controller of claim 8, wherein the conference controller is configured to calculate the X-point conference scene with X different spatial talker locations such that the X talker locations are positioned within the cone around the midline in front of the head of the listener.
17. The conference controller of claim 1, wherein the conference controller is configured to select the X-point conference scene with X different spatial talker locations from a set of pre-determined X-point conference scenes with X different pre-determined spatial talker locations.
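Selecting from a set of pre-determined scenes, as in claim 17, amounts to a table lookup keyed by X. The sketch below is illustrative only: the table name `PREDETERMINED_SCENES`, the function `select_scene`, and every angle in the table are assumptions chosen for the example, not values taken from the patent.

```python
# Hypothetical pre-determined X-point scenes: X -> talker azimuths in degrees,
# symmetric about the midline in front of the listener (0 degrees).
PREDETERMINED_SCENES: dict[int, list[float]] = {
    2: [-10.0, 10.0],
    3: [-20.0, 0.0, 20.0],
    4: [-30.0, -10.0, 10.0, 30.0],
}


def select_scene(num_talkers: int) -> list[float]:
    """Return a copy of the pre-determined scene for the given number of talkers."""
    try:
        return list(PREDETERMINED_SCENES[num_talkers])
    except KeyError:
        raise ValueError(f"no pre-determined scene for X={num_talkers}") from None


scene = select_scene(3)
```

Returning a copy keeps later per-listener adjustments (such as emphasis) from modifying the shared table.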
18. The conference controller of claim 1, wherein the conference controller is configured to emphasize the dominant upstream audio signal at the time instant by modifying a height of the first talker location relative to the others of the X spatial talker locations.
19. An audio conferencing system comprising:
a plurality of talker terminals configured to generate a plurality of upstream audio signals associated with a plurality of conference participants, respectively;
a conference controller configured to:
receive the plurality of upstream audio signals;
assign each audio signal of the plurality of upstream audio signals to a different one of X different spatial talker locations within a 2D or 3D conference scene;
provide downstream audio signals and metadata to terminals corresponding to the conference participants, the metadata indicating where a terminal is to render audio signals for each of the X different spatial talker locations;
determine a dominant one of the plurality of upstream audio signals;
assign the dominant upstream audio signal to a first of the X talker locations; and
emphasize a dominant one of the plurality of upstream audio signals by changing the metadata indicating where a terminal is to render audio signals for a relative position of the talker location for the dominant upstream audio signal relative to other talker locations by re-assigning the dominant upstream audio signal to a center location within the 2D or 3D conference scene;
wherein the center location corresponds to the talker location closest to a midline in front of a head of the listener; and
a listener terminal configured to render the dominant upstream audio signal to a listener according to the metadata.
20. A method for placing a plurality of upstream audio signals associated with a plurality of conference participants within a 2D or 3D conference scene to be rendered to a listener, wherein the method comprises:
setting up an X-point conference scene with X different spatial talker locations within the conference scene, X being an integer greater than one;
assigning each audio signal of the plurality of upstream audio signals to a different one of the X different spatial talker locations;
providing downstream audio signals and metadata to terminals corresponding to the conference participants, the metadata indicating where a terminal is to render audio signals for each of the X different spatial talker locations;
determining a degree of activity of the plurality of upstream audio signals, at a time instant;
determining a dominant one of the plurality of upstream audio signals at the time instant based on the degrees of activity of the plurality of upstream audio signals at the time instant;
assigning the dominant upstream audio signal to a first of the X talker locations; and
emphasizing the dominant upstream audio signal at the time instant by changing the metadata indicating where a terminal is to render audio signals for a relative position of the talker location for the dominant upstream audio signal relative to other talker locations such that the updated talker location at which the dominant upstream audio signal will be rendered is the updated talker location closest to a midline in front of a head of the listener.
Specification