Synchronization and mixing of audio and video streams in network based video conferencing call systems
First Claim
1. A computer-implemented method for producing a video conference display for a receiver participant, the method comprising:
- receiving over a network two or more audio streams and two or more video streams from two or more sender participants, the audio streams divided into audio chunks and the video stream(s) divided into video frames, each audio chunk including an audio time marker from its sender participant indicating a start of the audio chunk and each video frame including a video time marker from its sender participant indicating a start of the video frame, the time markers from different sender participants independent of one another;
- generating and playing a composite audio stream of the received audio streams, said generating and playing comprising repeating the steps of:
  - opening a mix;
  - adding audio chunks from the two or more sender participants to the mix, each chunk retaining an audio time marker from its sender participant;
  - closing the mix if either audio chunks from all sender participants are in the mix or if a predetermined early close condition is met; and
  - playing the combined audio chunks in the mix in an order of receipt of the audio chunks; and
- repeatedly determining, independently for each sender participant, if a current video frame of the video stream should occur during the playing of an audio chunk from the sender participant in a current mix, the determining comprising:
  - identifying the audio time marker of the audio chunk in the mix from the sender participant;
  - calculating a time tolerance for the audio time marker;
  - comparing a video time marker for the current video frame to the time tolerance of the audio time marker;
  - if the video time marker for the current video frame is within the time tolerance for the audio time marker, then displaying the current video frame and moving to a next video frame;
  - if the current video frame should occur after the time marker for the audio time marker, then waiting; and
  - if the current video frame should have occurred before the time marker for the audio time marker, then discarding the current video frame and moving to a next video frame.
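The per-sender display, wait, or discard decision recited in the final step of the claim can be illustrated with a short sketch. This is not the patented implementation. The names `VideoFrame`, `Action`, `decide`, and `sync_frames` are hypothetical, the time markers are assumed to be integers in milliseconds, and the tolerance is assumed to be a symmetric window around the audio time marker (the claim only says that a tolerance is calculated).

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DISPLAY = "display"
    WAIT = "wait"
    DISCARD = "discard"


@dataclass
class VideoFrame:
    time_marker: int  # sender-local start of the frame, in ms (assumed unit)


def decide(audio_marker: int, frame: VideoFrame, tolerance_ms: int = 40) -> Action:
    """Compare one frame with the audio chunk now playing for the same sender."""
    # Assumption: the tolerance is a symmetric window around the audio marker.
    low = audio_marker - tolerance_ms
    high = audio_marker + tolerance_ms
    if frame.time_marker > high:
        return Action.WAIT     # frame belongs to later audio, so hold it
    if frame.time_marker < low:
        return Action.DISCARD  # frame should already have been shown, so drop it
    return Action.DISPLAY      # frame falls within the tolerance


def sync_frames(audio_marker: int, queue: deque, tolerance_ms: int = 40) -> list:
    """Walk one sender's frame queue and return the frames to display now."""
    shown = []
    while queue:
        action = decide(audio_marker, queue[0], tolerance_ms)
        if action is Action.WAIT:
            break              # leave the frame queued for a later audio chunk
        frame = queue.popleft()  # displayed or discarded: move to the next frame
        if action is Action.DISPLAY:
            shown.append(frame)
    return shown
```

Because each sender's markers are independent of the others, a receiver would run a loop like `sync_frames` once per sender, always comparing that sender's video markers only with that sender's own audio marker.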
Abstract
In one aspect, audio streams are added to a mix until the mix is either complete (i.e., all audio streams have been added) or the mix is closed early (i.e., before the mix is complete). In another aspect, audio and video streams are synchronized by playing back the audio stream and then synchronizing display of the video frames to the playback of the audio stream.
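The first aspect, a mix that closes either when complete or early, can be sketched as a small data structure. This is an illustration under stated assumptions, not the method of the patent: `AudioChunk` and `Mix` are hypothetical names, the early close condition is assumed to be a simple timeout (`max_wait_s`), each sender is assumed to contribute at most one chunk per mix, and combining is assumed to be a sample-by-sample sum of equal-length chunks.

```python
from dataclasses import dataclass, field


@dataclass
class AudioChunk:
    sender: str
    time_marker: int  # sender-local marker; it stays attached to the chunk
    samples: list     # PCM samples, assumed equal length across senders


@dataclass
class Mix:
    expected: set            # senders this mix is waiting for
    opened_at: float         # time the mix was opened, in seconds
    max_wait_s: float = 0.02  # assumed early close condition: a timeout
    chunks: list = field(default_factory=list)  # kept in order of receipt

    def add(self, chunk: AudioChunk) -> bool:
        """Add a chunk unless its sender is unknown or already in the mix."""
        already_in = any(c.sender == chunk.sender for c in self.chunks)
        if chunk.sender in self.expected and not already_in:
            self.chunks.append(chunk)
            return True
        return False

    def is_complete(self) -> bool:
        return {c.sender for c in self.chunks} == self.expected

    def should_close(self, now: float) -> bool:
        """Close when every sender is present or the timeout has elapsed."""
        return self.is_complete() or (now - self.opened_at) >= self.max_wait_s

    def combined(self) -> list:
        """Sum the chunks sample by sample (assumed mixing rule)."""
        return [sum(vals) for vals in zip(*(c.samples for c in self.chunks))]
```

Keeping each chunk's `time_marker` inside the mix is what allows the second aspect to work: the video frames of a sender can later be compared with the marker of that sender's chunk in the mix currently being played.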
56 Citations
8 Claims
1. A computer-implemented method for producing a video conference display for a receiver participant, recited in full under First Claim above. Dependent claims: 2, 3, 4.
5. A computer program product stored on a non-transitory computer-readable medium that includes instructions that, when loaded into memory, cause a processor to perform a method, the method comprising:
- receiving over a network two or more audio streams and two or more video streams from two or more sender participants, the audio streams divided into audio chunks and the video stream(s) divided into video frames, each audio chunk including an audio time marker from its sender participant indicating a start of the audio chunk and each video frame including a video time marker from its sender participant indicating a start of the video frame, the time markers from different sender participants independent of one another;
- generating and playing a composite audio stream of the received audio streams, said generating and playing comprising repeating the steps of:
  - opening a mix;
  - adding audio chunks from the two or more sender participants to the mix, each chunk retaining an audio time marker from its sender participant;
  - closing the mix if either audio chunks from all sender participants are in the mix or if a predetermined early close condition is met; and
  - playing the combined audio chunks in the mix in an order of receipt of the audio chunks; and
- repeatedly determining, independently for each sender participant, if a current video frame of the video stream should occur during the playing of an audio chunk from the sender participant in a current mix, the determining comprising:
  - identifying the audio time marker of the audio chunk in the mix from the sender participant;
  - calculating a time tolerance for the audio time marker;
  - comparing a video time marker for the current video frame to the time tolerance of the audio time marker;
  - if the video time marker for the current video frame is within the time tolerance for the audio time marker, then displaying the current video frame and moving to a next video frame;
  - if the current video frame should occur after the time marker for the audio time marker, then waiting; and
  - if the current video frame should have occurred before the time marker for the audio time marker, then discarding the current video frame and moving to a next video frame.

Dependent claims: 6, 7, 8.
Specification