Method and apparatus for improved matching of auditory space to visual space in video teleconferencing applications using window-based displays
Abstract
A method and apparatus for enabling an improved experience by better matching of the auditory space to the visual space in video viewing applications such as those that may be used in video teleconferencing systems using window-based displays. In particular, in accordance with certain illustrative embodiments of the present invention, one or more desired sound source locations are determined based on a location of a window in a video teleconference display device (which may, for example, comprise the image of a teleconference participant within the given window), and a plurality of audio signals which accurately locate the sound sources at the desired sound source locations (based on the location of the given window in the display) are advantageously generated.
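The idea in the abstract, deriving a sound source location from where a window sits on the display and then generating speaker signals that place the sound there, can be illustrated with a minimal sketch. It assumes only two speakers (left and right of the screen) and uses constant-power panning, which is a common technique swapped in here for illustration and is not stated in the source. The names `Window`, `window_pan`, `constant_power_gains`, and `render_stereo` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Window:
    """Hypothetical description of a video window on the display."""
    x: int      # left edge of the window, in pixels
    width: int  # window width, in pixels


def window_pan(window: Window, screen_width: int) -> float:
    """Map the window's horizontal center to a pan value in [0, 1].

    0.0 means fully left, 1.0 means fully right.
    """
    center = window.x + window.width / 2
    return min(max(center / screen_width, 0.0), 1.0)


def constant_power_gains(pan: float) -> Tuple[float, float]:
    """Return (left_gain, right_gain) with left^2 + right^2 == 1.

    Assumption: constant-power panning stands in for whatever
    rendering method an actual system would use.
    """
    angle = pan * math.pi / 2
    return math.cos(angle), math.sin(angle)


def render_stereo(samples: List[float], window: Window,
                  screen_width: int) -> List[Tuple[float, float]]:
    """Turn mono samples into (left, right) pairs placed at the window."""
    left_gain, right_gain = constant_power_gains(
        window_pan(window, screen_width))
    return [(s * left_gain, s * right_gain) for s in samples]
```

With this sketch, a participant whose window is centered on the screen is rendered equally in both speakers, and moving the window toward one edge shifts the rendered sound toward that side.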
24 Claims
1. A method for generating a spatial rendering of a real-time audio sound from a conference participant to a remote real-time video teleconference participant in real-time using a plurality of speakers, the audio sound related to a real-time video being displayed to said remote video teleconference participant on a window-based video display screen having a given physical location of said conference participant, the method comprising:
receiving one or more real-time video input signals of said conference participant for use in displaying said real-time video to said remote video teleconference participant on said window-based video display screen, each of said received video input signals being displayed in a corresponding window on said video display screen;
receiving one or more real-time audio input signals related to said one or more video input signals, one of said audio input signals including said audio sound;
determining a desired physical location of said conference participant for spatially rendering said audio sound relative to said video display screen, the desired physical location being determined based on a position on the video display screen at which a particular one of said windows is being displayed, the particular one of said windows corresponding to the received video input signal related to the received audio input signal which includes said audio sound; and
generating a plurality of real-time audio output signals based on said determined desired physical location for spatially rendering said audio sound, said plurality of audio signals being generated such that when delivered to said remote video teleconference participant using said plurality of speakers, the remote video teleconference participant hears said audio sound as being rendered from said determined desired physical location for spatially rendering said audio sound.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16
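The "determining a desired physical location" step of claim 1 can be pictured as a conversion from a window's on-screen position to a direction relative to the viewer. The sketch below is one possible reading, not the patent's stated method: it assumes a flat screen, a viewer seated directly in front of the screen's center at a known distance, and a purely horizontal placement. The function name `window_azimuth_deg` and all of its parameters are hypothetical.

```python
import math


def window_azimuth_deg(window_center_px: float,
                       screen_width_px: float,
                       screen_width_m: float,
                       viewer_distance_m: float) -> float:
    """Estimate the horizontal angle of a window as seen by the viewer.

    Assumptions (not from the source): the viewer sits on the axis
    through the screen center, and the screen is flat.
    Returns degrees; negative is left of center, positive is right.
    """
    # Horizontal offset of the window center from the screen center, in pixels.
    offset_px = window_center_px - screen_width_px / 2
    # Convert the pixel offset into a physical offset using the screen's size.
    offset_m = offset_px * (screen_width_m / screen_width_px)
    # Angle between the viewer's straight-ahead axis and the window center.
    return math.degrees(math.atan2(offset_m, viewer_distance_m))
```

An angle computed this way could then drive whichever multi-speaker rendering method is used, so that the audio associated with a window appears to come from that window's direction.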
13. An apparatus for generating a spatial rendering of a real-time audio sound from a conference participant to a remote video teleconference participant in real-time, the apparatus comprising:
a plurality of speakers;
a real-time window-based video display screen having a given physical location of said conference participant, the window-based video display screen for displaying a real-time video to the video teleconference participant, the real-time audio sound being related to the video being displayed to said video teleconference participant;
a video input signal receiver which receives one or more real-time video input signals of said conference participant for use in displaying said video to said remote video teleconference participant on said window-based video display screen, each of said received video input signals being displayed in a corresponding window on said video display screen;
an audio input signal receiver which receives one or more real-time audio input signals related to said one or more video input signals, one of said received audio input signals including said audio sound;
a processor which determines a desired physical location of said conference participant for spatially rendering said audio sound relative to said video display screen, the desired physical location being determined based on a position on the video display screen at which a particular one of said windows is being displayed, the particular one of said windows corresponding to the received video input signal related to the received audio input signal which includes said audio sound; and
an audio output signal generator which generates a plurality of real-time audio output signals based on said determined desired physical location for spatially rendering said audio sound, said plurality of audio signals being generated such that when delivered to said remote video teleconference participant using said plurality of speakers, the remote video teleconference participant hears said audio sound as being rendered from said determined desired physical location for spatially rendering said audio sound.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24
Specification