Virtual conference room for voice conferencing
Abstract
A system and method are disclosed for packet voice conferencing. The system and method divide a conferencing presentation sound field into sectors, and allocate one or more sectors to each conferencing endpoint. At some point between capture and playout, the voice data from each endpoint is mapped into its designated sector or sectors. Thereafter, when the voice data from a plurality of participants from multiple endpoints is combined, a listener can identify a unique apparent location within the presentation sound field for each participant. The system allows a conference participant to increase their comprehension when multiple participants speak simultaneously, as well as alleviate confusion as to who is speaking at any given time.
44 Claims
1. A packet voice conferencing method comprising:
concurrently receiving a first packet voice data stream from a first conferencing endpoint and a multiple channel second packet voice data stream from a second conferencing endpoint;
mapping the voice data from the first packet voice data stream to a first set of presentation mixing channels in a manner that simulates that voice data as originating in a first sector of a presentation sound field;
mapping the voice data from the second packet voice data stream to a second set of presentation mixing channels in a manner that simulates that voice data as originating in a second sector of a presentation sound field, the second sector substantially non-overlapping the first sector; and
mixing each channel from the first set of presentation mixing channels with the corresponding channel from the second set of presentation mixing channels to form a first set of mixed channels. - View Dependent Claims (2, 3, 4, 5, 6)
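The mapping and mixing steps of claim 1 can be illustrated with a minimal sketch. It assumes a two-channel (stereo) presentation, a 180-degree sound field, and a constant-power pan law; none of these are specified by the claim, and the names `pan_gains`, `map_to_sector`, and `mix_channel_sets` are hypothetical.

```python
import math

def pan_gains(azimuth_deg, field_deg=180.0):
    """Constant-power (left, right) gains for an azimuth in [-field/2, +field/2].

    Assumption: a stereo presentation and a cosine/sine pan law.
    """
    x = (azimuth_deg + field_deg / 2.0) / field_deg  # 0 = far left, 1 = far right
    theta = x * math.pi / 2.0
    return math.cos(theta), math.sin(theta)

def map_to_sector(channels, sector, field_deg=180.0):
    """Map one endpoint's channels into a sector (lo_deg, hi_deg).

    `channels` is a list of equal-length sample lists. Each channel is placed
    at an evenly spaced azimuth inside the sector, so a mono stream lands at
    the sector centre. Returns [left, right] presentation mixing channels.
    """
    lo, hi = sector
    n = len(channels)
    length = len(channels[0])
    left = [0.0] * length
    right = [0.0] * length
    for i, samples in enumerate(channels):
        azimuth = lo + (hi - lo) * (i + 0.5) / n
        gain_l, gain_r = pan_gains(azimuth, field_deg)
        for t, s in enumerate(samples):
            left[t] += gain_l * s
            right[t] += gain_r * s
    return [left, right]

def mix_channel_sets(set_a, set_b):
    """Mix each channel of set_a with the corresponding channel of set_b."""
    return [[a + b for a, b in zip(ch_a, ch_b)] for ch_a, ch_b in zip(set_a, set_b)]

# A mono endpoint in the left half and a two-channel endpoint in the right half.
first = map_to_sector([[1.0, 0.0]], (-90.0, 0.0))
second = map_to_sector([[0.0, 1.0], [0.0, 1.0]], (0.0, 90.0))
mixed = mix_channel_sets(first, second)
```

In this sketch the two sectors do not overlap, so the first endpoint is heard mostly on the left channel and the second mostly on the right.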
7. An apparatus comprising a computer-readable medium containing computer instructions that, when executed, cause a processor or multiple communicating processors to perform a method for packet voice conferencing, the method comprising:
concurrently receiving a first packet voice data stream from a first conferencing endpoint and a multiple channel second packet voice data stream from a second conferencing endpoint;
mapping the voice data from the first packet voice data stream to a first set of presentation mixing channels in a manner that simulates that voice data as originating in a first sector of a presentation sound field;
mapping the voice data from the second packet voice data stream to a second set of presentation mixing channels in a manner that simulates that voice data as originating in a second sector of a presentation sound field, the second sector substantially non-overlapping the first sector; and
mixing each channel from the first set of presentation mixing channels with the corresponding channel from the second set of presentation mixing channels to form a first set of mixed channels. - View Dependent Claims (8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23)
24. A packet voice conferencing system comprising:
means for concurrently receiving multiple packet voice data streams where at least one of the multiple packet voice data streams comprises at least two channels;
means for manipulating the voice data in each of the packet voice data streams in a manner that simulates that voice data as originating in a specified sector of a presentation sound field, the sectors arranged in the sound field in substantially non-overlapping fashion; and
means for combining the manipulated voice data from each packet voice data stream into a set of presentation channels. - View Dependent Claims (25, 26, 27)
28. A packet voice conferencing system comprising:
first and second decoders, to respectively decode first and second packet voice data streams and produce first and second sets of one or more voice data channels from the voice data packets contained in the streams;
a packet switch to receive packet voice data streams sent to the system by first and second conferencing endpoints, at least the first conferencing endpoint comprising a multiple channel packet voice data stream, and to distribute the packet voice data stream received from the first conferencing endpoint to the first decoder and the packet voice data stream received from the second conferencing endpoint to the second decoder;
a first channel mapper to map the first set of voice data channels to a first set of presentation mixing channels in a manner that simulates the voice data as originating in a first sector of a presentation sound field;
a second channel mapper to map the second set of voice data channels to a second set of presentation mixing channels in a manner that simulates the voice data as originating in a second sector of a presentation sound field, the second sector substantially non-overlapping the first sector; and
a first set of mixers, each mixer combining one of the first set of presentation mixing channels with a corresponding one of the second set of presentation mixing channels to form a mixed channel, the set of mixers collectively forming a first set of mixed channels. - View Dependent Claims (29, 30, 31, 32, 33, 34, 35)
36. A packet voice conferencing system comprising:
a decoder, to decode a multiple channel packet voice data stream to produce a set of one or more voice data subchannels from the voice data packets contained in the stream and a voice arrival direction corresponding to the set of voice data subchannels;
a controller to select one of a plurality of presentation sound field subsectors for the voice data subchannels based on the voice arrival direction, each subsector corresponding to a range of voice arrival directions; and
a channel mapper to map the set of voice data subchannels to a set of presentation channels in a manner that simulates the voice data as originating in the selected subsector of the presentation sound field. - View Dependent Claims (37, 38)
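The controller's subsector selection in claim 36 can be sketched as a simple range lookup. The sketch assumes that arrival directions span -90 to +90 degrees at the capturing endpoint and that the subsectors are equal in width; the claim requires neither, and `select_subsector` is a hypothetical name.

```python
def select_subsector(arrival_deg, sector, n_sub, capture_range=(-90.0, 90.0)):
    """Pick the subsector whose range of arrival directions contains arrival_deg.

    `sector` is the (lo_deg, hi_deg) span allocated to the endpoint in the
    presentation sound field. Returns (index, (sub_lo_deg, sub_hi_deg)).
    Assumption: equal-width subsectors and a linear direction-to-index map.
    """
    c_lo, c_hi = capture_range
    clamped = min(max(arrival_deg, c_lo), c_hi)
    frac = (clamped - c_lo) / (c_hi - c_lo)
    index = min(int(frac * n_sub), n_sub - 1)  # keep the upper edge in the last bin
    s_lo, s_hi = sector
    width = (s_hi - s_lo) / n_sub
    return index, (s_lo + index * width, s_lo + (index + 1) * width)
```

A channel mapper would then place the decoded subchannels within the returned bounds, so that talkers on different sides of the capturing room keep distinct apparent positions inside the endpoint's sector.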
39. A packet voice conferencing system having one or more local audio capture channels, the system comprising:
a controller to negotiate with other packet voice conferencing systems connected in a common conference, wherein the results of a negotiation include a codec to be used by the system for encoding the local audio capture channels, and a presentation sound field sector allocated to the local audio capture channels;
a channel mapper to map the local audio capture channels to a set of presentation mixing channels in a manner that simulates the audio data on the capture channels as originating in the allocated presentation sound field sector; and
an encoder to encode the presentation mixing channels into a packet voice data stream. - View Dependent Claims (40)
41. An apparatus comprising a computer-readable medium containing computer instructions that, when executed, cause a processor or multiple communicating processors to perform a method for packet voice conferencing, the method comprising:
concurrently receiving a first packet voice data stream from a first conferencing endpoint and a second packet voice data stream from a second conferencing endpoint;
mapping the voice data from the first packet voice data stream to a first set of presentation mixing channels in a manner that simulates that voice data as originating in a first sector of a presentation sound field;
mapping the voice data from the second packet voice data stream to a second set of presentation mixing channels in a manner that simulates that voice data as originating in a second sector of a presentation sound field, the second sector substantially non-overlapping the first sector;
mixing each channel from the first set of presentation mixing channels with the corresponding channel from the second set of presentation mixing channels to form a first set of mixed channels;
when voice data from one of the conferencing endpoints comprises multiple voice data channels;
measuring the relative delay between at least two of the multiple channels;
estimating, from the measured relative delay, the arrival direction of a voice signal present in the voice data; and
accounting for the estimated arrival direction during mapping of the voice data into a set of presentation mixing channels.
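The delay measurement and direction estimate in claim 41 can be sketched as follows. The claim does not say how the delay is measured or converted to a direction; this sketch assumes a brute-force cross-correlation over integer sample lags and a far-field two-microphone model, sin(theta) = c * tau / d. The function names, the 343 m/s speed of sound, and the sign convention are all assumptions.

```python
import math

def relative_delay(ch_a, ch_b, max_lag):
    """Return the lag in samples (positive when ch_b trails ch_a) that
    maximises the cross-correlation of the two channels."""
    best_lag, best_score = 0, float("-inf")
    n = len(ch_a)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for t in range(n):
            u = t + lag
            if 0 <= u < n:
                score += ch_a[t] * ch_b[u]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def arrival_direction(lag, sample_rate, mic_spacing_m, speed_of_sound=343.0):
    """Estimate the arrival angle in degrees from a lag in samples.

    Assumption: far-field source and two microphones mic_spacing_m apart.
    A positive angle means the sound reached microphone A first.
    """
    tau = lag / sample_rate
    s = speed_of_sound * tau / mic_spacing_m
    s = max(-1.0, min(1.0, s))  # guard against lags beyond the physical limit
    return math.degrees(math.asin(s))
```

The estimated angle could then feed the mapping step, for example by choosing a position inside the endpoint's sector that corresponds to the talker's direction.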
42. An apparatus comprising a computer-readable medium containing computer instructions that, when executed, cause a processor or multiple communicating processors to perform a method for packet voice conferencing, the method comprising:
concurrently receiving a first packet voice data stream from a first conferencing endpoint and a second packet voice data stream from a second conferencing endpoint;
mapping the voice data from the first packet voice data stream to a first set of presentation mixing channels in a manner that simulates that voice data as originating in a first sector of a presentation sound field;
mapping the voice data from the second packet voice data stream to a second set of presentation mixing channels in a manner that simulates that voice data as originating in a second sector of a presentation sound field, the second sector substantially non-overlapping the first sector;
mixing each channel from the first set of presentation mixing channels with the corresponding channel from the second set of presentation mixing channels to form a first set of mixed channels;
pictorially displaying, on a graphical user interface, a representation of a sound field and representations of each conferencing endpoint to a listener at one conferencing endpoint, allowing that listener to manipulate the interface in order to indicate desired locations of the conferencing endpoints within the sound field, and using the listener's manipulations to set the extent of the sectors of the presentation sound field;
wherein the graphical user interface further allows the listener to specify the number and locations of presentation channel acoustical speakers relative to that listener's position in a room, the method further comprising accounting for the number and locations of presentation channel acoustical speakers in mapping voice data to presentation mixing channels.
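One way to account for the number and locations of speakers during mapping, as recited in claim 42, is pairwise amplitude panning between the two speakers that bracket the desired source position. The claim does not name a panning technique; the sketch below is an assumption, with speaker positions given as azimuths in degrees relative to the listener and `speaker_gains` a hypothetical name.

```python
import math

def speaker_gains(source_deg, speaker_degs):
    """Return one gain per speaker for a source at azimuth source_deg.

    Assumption: constant-power panning between the two adjacent speakers that
    bracket the source; sources outside the speaker span snap to the nearest
    outermost speaker.
    """
    order = sorted(range(len(speaker_degs)), key=lambda i: speaker_degs[i])
    gains = [0.0] * len(speaker_degs)
    first, last = order[0], order[-1]
    if source_deg <= speaker_degs[first]:
        gains[first] = 1.0
        return gains
    if source_deg >= speaker_degs[last]:
        gains[last] = 1.0
        return gains
    for a, b in zip(order, order[1:]):
        lo, hi = speaker_degs[a], speaker_degs[b]
        if hi > lo and lo <= source_deg <= hi:
            frac = (source_deg - lo) / (hi - lo)
            gains[a] = math.cos(frac * math.pi / 2.0)
            gains[b] = math.sin(frac * math.pi / 2.0)
            break
    return gains
```

With speakers at -30 and +30 degrees, a source at 0 degrees receives equal gains on both; adding a centre speaker at 0 degrees would instead route that source to the centre speaker alone.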
43. An apparatus comprising a computer-readable medium containing computer instructions that, when executed, cause a processor or multiple communicating processors to perform a method for packet voice conferencing, the method comprising:
concurrently receiving a first packet voice data stream from a first conferencing endpoint and a second packet voice data stream from a second conferencing endpoint;
mapping the voice data from the first packet voice data stream to a first set of presentation mixing channels in a manner that simulates that voice data as originating in a first sector of a presentation sound field;
mapping the voice data from the second packet voice data stream to a second set of presentation mixing channels in a manner that simulates that voice data as originating in a second sector of a presentation sound field, the second sector substantially non-overlapping the first sector;
mixing each channel from the first set of presentation mixing channels with the corresponding channel from the second set of presentation mixing channels to form a first set of mixed channels;
pictorially displaying, on a graphical user interface, a representation of a sound field and representations of each conferencing endpoint to a listener at one conferencing endpoint, allowing that listener to manipulate the interface in order to indicate desired locations of the conferencing endpoints within the sound field, and using the listener's manipulations to set the extent of the sectors of the presentation sound field;
automatically dividing the presentation sound field into sectors that allocate approximately equal shares of the presentation sound field to each endpoint;
tracking the number of conferencing endpoints participating in a conference, and automatically altering the allocation of the presentation sound field as endpoints are added to or leave the conference.
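The automatic division and re-allocation steps of claim 43 can be sketched as a small allocator that recomputes equal shares whenever the endpoint list changes. The class name `SectorAllocator`, the 180-degree field, and the left-to-right ordering by join time are assumptions made for illustration.

```python
class SectorAllocator:
    """Tracks conference endpoints and splits the sound field equally among them."""

    def __init__(self, field=(-90.0, 90.0)):
        self.field = field
        self.endpoints = []  # kept in join order, laid out left to right

    def add(self, endpoint_id):
        if endpoint_id not in self.endpoints:
            self.endpoints.append(endpoint_id)

    def remove(self, endpoint_id):
        if endpoint_id in self.endpoints:
            self.endpoints.remove(endpoint_id)

    def allocation(self):
        """Return {endpoint_id: (lo_deg, hi_deg)} with equal-width sectors."""
        lo, hi = self.field
        n = len(self.endpoints)
        if n == 0:
            return {}
        width = (hi - lo) / n
        return {
            ep: (lo + i * width, lo + (i + 1) * width)
            for i, ep in enumerate(self.endpoints)
        }

alloc = SectorAllocator()
alloc.add("A")
alloc.add("B")
two_way = alloc.allocation()    # each endpoint gets half of the field
alloc.add("C")
three_way = alloc.allocation()  # shares shrink to a third each
alloc.remove("B")
after_leave = alloc.allocation()
```

Because `allocation()` is recomputed from the current list, adding or removing an endpoint changes every sector's bounds; a real system might smooth such transitions, which this sketch does not attempt.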
44. An apparatus comprising a computer-readable medium containing computer instructions that, when executed, cause a processor or multiple communicating processors to perform a method for packet voice conferencing, the method comprising:
concurrently receiving a first packet voice data stream from a first conferencing endpoint and a second packet voice data stream from a second conferencing endpoint;
mapping the voice data from the first packet voice data stream to a first set of presentation mixing channels in a manner that simulates that voice data as originating in a first sector of a presentation sound field;
mapping the voice data from the second packet voice data stream to a second set of presentation mixing channels in a manner that simulates that voice data as originating in a second sector of a presentation sound field, the second sector substantially non-overlapping the first sector;
mixing each channel from the first set of presentation mixing channels with the corresponding channel from the second set of presentation mixing channels to form a first set of mixed channels;
displaying, on a graphical user interface, a representation of a sound field and representations of each conferencing endpoint to a listener at one conferencing endpoint, allowing that listener to manipulate the interface in order to indicate desired locations of the conferencing endpoints within the sound field, and using the listener's manipulations to set the extent of the sectors of the presentation sound field;
automatically dividing the presentation sound field into sectors that allocate approximately equal shares of the presentation sound field to each endpoint;
wherein a larger sector of the sound field is allocated to a conferencing endpoint that is broadcasting multiple capture channels than is allocated to a conferencing endpoint that is broadcasting monaurally.
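The unequal allocation in claim 44 can be sketched with per-endpoint weights. The claim says only that a multi-channel endpoint receives a larger sector than a monaural one; the 2:1 weighting, the 180-degree field, and the name `allocate_weighted` are assumptions.

```python
def allocate_weighted(channel_counts, field=(-90.0, 90.0), multi_weight=2.0):
    """Split the sound field so multi-channel endpoints get wider sectors.

    `channel_counts` maps endpoint id to its number of capture channels.
    Assumption: monaural endpoints weigh 1.0 and any endpoint with more than
    one channel weighs `multi_weight`. Returns {endpoint_id: (lo_deg, hi_deg)}.
    """
    lo, hi = field
    weights = {
        ep: (multi_weight if n > 1 else 1.0) for ep, n in channel_counts.items()
    }
    total = sum(weights.values())
    sectors = {}
    start = lo
    for ep, w in weights.items():
        width = (hi - lo) * w / total
        sectors[ep] = (start, start + width)
        start += width
    return sectors

sectors = allocate_weighted({"mono_site": 1, "stereo_site": 2})
```

With these assumed weights, the stereo endpoint receives two thirds of the field and the monaural endpoint one third.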
Specification