Spatial sound conference system and method
Abstract
The spatial sound conference system enables participants in a teleconference to distinguish between speakers even during periods of interruption and overtalk, identify speakers based on spatial location cues, understand low volume speech, and block out background noise using spatial sound information. Spatial sound information may be captured using microphones positioned at the ear locations of a dummy head at a conference table, or spatial sound information may be added to a participant's monaural audio signal using head-related transfer functions. Head-related transfer functions simulate the frequency response of audio signals across the head from one ear to the other ear to create a spatial location for a sound. Spatial sound is transmitted across a communication channel, such as ISDN, and reproduced using spatially disposed loudspeakers positioned at the ears of a participant. By inserting a spatial sound component in a teleconference, a speaker other than the loudest speaker may be heard during periods of interruption and overtalk. Additionally, speakers may be more readily identified when they have a spatial sound position, and the perception of background noise is reduced.
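As an illustration of how a spatial position might be added to a monaural signal, the following minimal sketch filters one mono signal with a left-ear and a right-ear impulse response (the time-domain counterpart of a head-related transfer function). The function names and the toy impulse responses are assumptions made for this example. A real system would use measured responses, whereas the toy pair here only models a delay and attenuation at the far ear.

```python
def convolve(signal, impulse_response):
    """Direct-form FIR convolution; returns len(signal) + len(ir) - 1 samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def toy_hrir_pair(delay_samples, far_ear_gain):
    """Hypothetical stand-in for a measured response pair (assumption).

    The near ear receives the signal unchanged; the far ear receives a
    delayed, attenuated copy, mimicking interaural time and level differences.
    """
    near = [1.0] + [0.0] * delay_samples
    far = [0.0] * delay_samples + [far_ear_gain]
    return near, far


def spatialize(mono, hrir_left, hrir_right):
    """Turn one mono signal into a (left, right) pair of ear signals."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Place the talker on the listener's left: the left ear is the near ear.
mono = [1.0, 0.5, -0.25]
hrir_left, hrir_right = toy_hrir_pair(delay_samples=2, far_ear_gain=0.5)
left, right = spatialize(mono, hrir_left, hrir_right)
```

In this sketch the right-ear signal is a quieter copy of the left-ear signal that arrives two samples later, which is the kind of cue a listener uses to place a sound to one side.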
47 Claims
1-27. (canceled)
28. A system comprising:
a plurality of participant stations, each of the plurality of participant stations associated with at least one conference participant and including
at least one microphone configured to transmit a participant audio signal generated based on the at least one conference participant,
at least one speaker configured to receive a composite audio signal and convert the composite audio signal to audible sound,
at least one video camera configured to transmit a participant video signal generated based on the at least one conference participant,
at least one video display configured to receive a transmitted video signal, and
a station processing system coupled to the at least one microphone, the at least one speaker, the at least one video camera and the at least one video display, the station processing system configured to
receive the participant audio signal from the at least one microphone,
receive the participant video signal from the at least one video camera,
compress the participant audio signal and the participant video signal,
transmit the compressed participant audio signal and compressed participant video signal over a network,
receive the composite audio signal and the transmitted video signal in compressed form,
decompress the composite audio signal and transmitted video signal from the compressed form,
transmit the composite audio signal to the at least one speaker,
transmit the transmitted video signal to the at least one display; and
a spatial processing system coupled to the plurality of participant stations via the network, the spatial processing system configured to
receive the participant audio signal from each participant station,
receive the participant video signal from each participant station,
decompress the participant audio signals,
apply a first head-related transfer function associated with a first participant station of the plurality of participant stations to the participant audio signal of the first participant station to generate a first spatialized audio signal,
apply a second head-related transfer function associated with a second participant station of the plurality of participant stations to the participant audio signal of the second participant station to generate a second spatialized audio signal,
combine the first spatialized audio signal and second spatialized audio signal into a third composite audio signal,
compress the third composite audio signal, and
transmit the third composite audio signal to a third participant station of the plurality of participant stations.
Dependent claims: 29, 30, 31, 32, 33, 34, 35, 36, 37, 46, 47
38. A method comprising:
receiving participant audio signals from each of a plurality of participant stations;
receiving participant video signals from each of the plurality of participant stations;
decompressing the participant audio signals;
applying a first head-related transfer function associated with a first participant station of the plurality of participant stations to the participant audio signal of the first participant station to generate a first spatialized audio signal;
applying a second head-related transfer function associated with a second participant station of the plurality of participant stations to the participant audio signal of the second participant station to generate a second spatialized audio signal;
combining the first spatialized audio signal and second spatialized audio signal into a third composite audio signal;
compressing the third composite audio signal; and
transmitting the third composite audio signal to a third participant station of the plurality of participant stations.
Dependent claims: 39, 40, 41, 42, 43, 44, 45
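The audio steps of the method recited in claim 38 can be illustrated with the following minimal sketch, which is not the patented implementation. The codec is a lossless stand-in built on `zlib` and `struct`, since the claim does not name a codec. The function names, the two-channel layout of the composite signal, and the toy impulse responses are all assumptions made for this example. Video handling and network transport are omitted.

```python
import struct
import zlib


def compress(samples):
    """Stand-in codec (assumption): pack float samples and deflate them."""
    return zlib.compress(struct.pack(f"<{len(samples)}d", *samples))


def decompress(blob):
    """Inverse of the stand-in codec above."""
    raw = zlib.decompress(blob)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


def convolve(signal, impulse_response):
    """Direct-form FIR convolution."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def spatialize(mono, hrir_pair):
    """Filter a mono signal with a (left, right) impulse response pair."""
    left_ir, right_ir = hrir_pair
    return convolve(mono, left_ir), convolve(mono, right_ir)


def mix(a, b):
    """Sum two signals sample by sample, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def build_composite(blob_1, blob_2, hrir_1, hrir_2):
    """Decompress, spatialize per station, combine, and recompress."""
    left_1, right_1 = spatialize(decompress(blob_1), hrir_1)
    left_2, right_2 = spatialize(decompress(blob_2), hrir_2)
    # The composite is kept as two ear channels (assumption).
    return compress(mix(left_1, left_2)), compress(mix(right_1, right_2))


# Toy responses (assumptions): station 1 is placed to the listener's left,
# station 2 to the right; the far ear hears a delayed, quieter copy.
hrir_station_1 = ([1.0, 0.0], [0.0, 0.5])
hrir_station_2 = ([0.0, 0.5], [1.0, 0.0])

blob_1 = compress([1.0, 0.0])  # audio as received from station 1
blob_2 = compress([0.0, 1.0])  # audio as received from station 2
left_blob, right_blob = build_composite(
    blob_1, blob_2, hrir_station_1, hrir_station_2
)
```

The two returned blobs are what would be sent to the third station, which would decompress them and play one channel at each ear.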
Specification