Synchronization and mixing of audio and video streams in network based video conferencing call systems

US 8,700,195 B2
Filed: 10/05/2012
Issued: 04/15/2014
Est. Priority Date: 09/30/2007
Status: Active Grant

Abstract

In one aspect, audio streams are added to a mix until the mix is either complete (i.e., all audio streams have been added) or the mix is closed early (i.e., before the mix is complete). In another aspect, audio and video streams are synchronized by playing back the audio stream and then synchronizing display of the video frames to the playback of the audio stream.
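
The mix lifecycle described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `AudioChunk` and `Mix` names, the millisecond units, the equal-length sample lists, and the use of a wait-time budget (`max_wait_ms`) as the early-close condition are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AudioChunk:
    sender: str        # sender participant id (hypothetical field)
    time_marker: int   # sender-local start time, assumed to be in ms
    samples: list      # PCM samples, assumed equal length across senders


@dataclass
class Mix:
    expected: set                               # senders this mix waits on
    chunks: dict = field(default_factory=dict)  # sender -> AudioChunk
    closed: bool = False

    def add(self, chunk):
        """Add at most one chunk per expected sender while the mix is open."""
        if self.closed or chunk.sender in self.chunks:
            return False
        if chunk.sender not in self.expected:
            return False
        self.chunks[chunk.sender] = chunk  # the chunk keeps its time marker
        return True

    def is_complete(self):
        return set(self.chunks) == self.expected

    def try_close(self, waited_ms, max_wait_ms=40):
        """Close when complete, or early once the assumed wait budget is spent."""
        if self.is_complete() or waited_ms >= max_wait_ms:
            self.closed = True
        return self.closed

    def combined(self):
        """Sum samples position by position across the chunks in the mix."""
        if not self.chunks:
            return []
        columns = zip(*(c.samples for c in self.chunks.values()))
        return [sum(col) for col in columns]
```

A mix that is closed early simply combines whichever chunks arrived in time; a late chunk would have to be handled by a later mix or dropped, which this sketch leaves out.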

56 Citations

8 Claims

1. A computer-implemented method for producing a video conference display for a receiver participant, the method comprising:
- receiving over a network two or more audio streams and two or more video streams from two or more sender participants, the audio streams divided into audio chunks and the video stream(s) divided into video frames, each audio chunk including an audio time marker from its sender participant indicating a start of the audio chunk and each video frame including a video time marker from its sender participant indicating a start of the video frame, the time markers from different sender participants independent of one another;
- generating and playing a composite audio stream of the received audio streams, said generating and playing comprising repeating the steps of:
  - opening a mix;
  - adding audio chunks from the two or more sender participants to the mix, each chunk retaining an audio time marker from its sender participant;
  - closing the mix if either audio chunks from all sender participants are in the mix or if a predetermined early close condition is met; and
  - playing the combined audio chunks in the mix in an order of receipt of the audio chunks; and
- repeatedly determining, independently for each sender participant, if a current video frame of the video stream should occur during the playing of an audio chunk from the sender participant in a current mix, the determining comprising:
  - identifying the audio time marker of the audio chunk in the mix from the sender participant;
  - calculating a time tolerance for the audio time marker;
  - comparing a video time marker for the current video frame to the time tolerance of the audio time marker;
  - if the video time marker for the current video frame is within the time tolerance for the audio time marker, then displaying the current video frame and moving to a next video frame;
  - if the current video frame should occur after the time marker for the audio time marker, then waiting; and
  - if the current video frame should have occurred before the time marker for the audio time marker, then discarding the current video frame and moving to a next video frame.

2. The computer-implemented method of claim 1 wherein the step of generating and playing a composite audio stream further comprises:
- buffering the received audio chunks;
- wherein the step of adding audio chunks to the mix comprises cycling through the sender participants, and for each sender participant on each cycle, if the sender participant is not yet in the mix, adding the sender participant's audio chunk to the mix if the missing audio chunk is available from the buffer, the audio chunk retaining its audio time marker.

3. The computer-implemented method of claim 1 wherein the step of generating and playing a composite audio stream further comprises:
- as each sender participant's audio chunk is received, if the sender participant is not yet in the mix and the received audio chunk is the correct audio chunk for the mix, adding the sender participant's audio chunk to the mix, the audio chunk retaining its audio time marker; and
- otherwise, buffering the sender participant's audio chunk for a future mix.

4. The computer-implemented method of claim 1, wherein each time marker includes the time tolerance, the time tolerance comprising a nominal start time and a nominal end time for the audio chunk, the nominal start time adjusted by a first tolerance to be lower, and the nominal end time adjusted by a second tolerance to be higher.
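
The per-sender video decision in claim 1, combined with the tolerance window of claim 4, can be sketched as a three-way comparison. This is a hedged illustration only: the function names, the millisecond units, the 20 ms default tolerances, and the inclusive window bounds are assumptions, and the sketch presumes that a sender's audio and video time markers come from the same sender-local clock.

```python
from enum import Enum


class Action(Enum):
    DISPLAY = "display"
    WAIT = "wait"
    DISCARD = "discard"


def tolerance_window(audio_start, chunk_ms, first_tol_ms=20, second_tol_ms=20):
    """Return (low, high): nominal start lowered, nominal end raised."""
    return audio_start - first_tol_ms, audio_start + chunk_ms + second_tol_ms


def decide(video_marker, window):
    """Classify one video frame against the window of the playing audio chunk."""
    low, high = window
    if video_marker < low:
        return Action.DISCARD   # frame should have occurred already
    if video_marker > high:
        return Action.WAIT      # frame belongs to a later audio chunk
    return Action.DISPLAY       # frame falls within the tolerance


def sync_frames(frame_markers, audio_start, chunk_ms):
    """Walk one sender's queued frames; return (displayed, still_waiting)."""
    window = tolerance_window(audio_start, chunk_ms)
    displayed = []
    i = 0
    while i < len(frame_markers):
        action = decide(frame_markers[i], window)
        if action is Action.WAIT:
            break               # keep this frame and everything after it
        if action is Action.DISPLAY:
            displayed.append(frame_markers[i])
        i += 1                  # displayed or discarded: move to the next frame
    return displayed, frame_markers[i:]
```

Because the comparison only ever involves markers from a single sender, the independence of time markers across senders does not matter here; each sender's frames would be run through `sync_frames` separately.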

5. A computer program product stored on a non-transitory computer-readable medium that includes instructions that, when loaded into memory, cause a processor to perform a method, the method comprising:
- receiving over a network two or more audio streams and two or more video streams from two or more sender participants, the audio streams divided into audio chunks and the video stream(s) divided into video frames, each audio chunk including an audio time marker from its sender participant indicating a start of the audio chunk and each video frame including a video time marker from its sender participant indicating a start of the video frame, the time markers from different sender participants independent of one another;
- generating and playing a composite audio stream of the received audio streams, said generating and playing comprising repeating the steps of:
  - opening a mix;
  - adding audio chunks from the two or more sender participants to the mix, each chunk retaining an audio time marker from its sender participant;
  - closing the mix if either audio chunks from all sender participants are in the mix or if a predetermined early close condition is met; and
  - playing the combined audio chunks in the mix in an order of receipt of the audio chunks; and
- repeatedly determining, independently for each sender participant, if a current video frame of the video stream should occur during the playing of an audio chunk from the sender participant in a current mix, the determining comprising:
  - identifying the audio time marker of the audio chunk in the mix from the sender participant;
  - calculating a time tolerance for the audio time marker;
  - comparing a video time marker for the current video frame to the time tolerance of the audio time marker;
  - if the video time marker for the current video frame is within the time tolerance for the audio time marker, then displaying the current video frame and moving to a next video frame;
  - if the current video frame should occur after the time marker for the audio time marker, then waiting; and
  - if the current video frame should have occurred before the time marker for the audio time marker, then discarding the current video frame and moving to a next video frame.

6. The computer program product of claim 5 wherein the step of generating and playing a composite audio stream further comprises:
- buffering the received audio chunks;
- wherein the step of adding audio chunks to the mix comprises cycling through the sender participants, and for each sender participant on each cycle, if the sender participant is not yet in the mix, adding the sender participant's audio chunk to the mix if the missing audio chunk is available from the buffer, the audio chunk retaining its audio time marker.

7. The computer program product of claim 5 wherein the step of generating and playing a composite audio stream further comprises:
- as each sender participant's audio chunk is received, if the sender participant is not yet in the mix and the received audio chunk is the correct audio chunk for the mix, adding the sender participant's audio chunk to the mix, the audio chunk retaining its audio time marker; and
- otherwise, buffering the sender participant's audio chunk for a future mix.

8. The computer program product of claim 5, wherein each time marker includes the time tolerance, the time tolerance comprising a nominal start time and a nominal end time for the audio chunk, the nominal start time adjusted by a first tolerance to be lower, and the nominal end time adjusted by a second tolerance to be higher.
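
The arrival-driven variant in claims 3 and 7 (add a chunk on receipt if it belongs in the open mix, otherwise buffer it for a future mix) might look roughly like the sketch below. The claims do not say how the "correct audio chunk for the mix" is identified, so the per-sender sequence number used here is purely an assumption, as are the `ArrivalMixer` class, its method names, and the presumption that each sender's chunks arrive in order.

```python
from collections import defaultdict, deque


class ArrivalMixer:
    def __init__(self, senders):
        self.senders = list(senders)
        self.mix = {}                            # sender -> (seq, payload)
        # Hypothetical rule: the open mix wants chunk number next_seq[sender].
        self.next_seq = {s: 0 for s in self.senders}
        self.pending = defaultdict(deque)        # sender -> buffered chunks

    def on_chunk(self, sender, seq, payload):
        """Add the chunk to the open mix if it fits; otherwise buffer it."""
        if sender not in self.mix and seq == self.next_seq[sender]:
            self.mix[sender] = (seq, payload)
            return "mixed"
        self.pending[sender].append((seq, payload))
        return "buffered"

    def open_next_mix(self):
        """Hand back the current mix and seed the next one from the buffer."""
        closed = self.mix
        self.mix = {}
        for s in self.senders:
            self.next_seq[s] += 1
            queue = self.pending[s]
            # Drop buffered chunks that are now too old for any future mix.
            while queue and queue[0][0] < self.next_seq[s]:
                queue.popleft()
            # Take the buffered chunk that matches the new mix, if present.
            if queue and queue[0][0] == self.next_seq[s]:
                self.mix[s] = queue.popleft()
        return closed
```

In this sketch a sender that misses a mix (for example after an early close) has its late chunk discarded when the next mix opens; other policies are possible and the claims do not prescribe one.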

Current Assignee: Red Hat, Inc. (International Business Machines Corporation)
Original Assignee: Optical Fusion Inc.
Inventors: Thapa, Mukund N.
Primary Examiner(s): MCCORD, PAUL C
Application Number: US13/646,395
Publication Number: US 20130027507A1
Time in Patent Office: 557 Days
Field of Search: 700/94
US Class Current: 700/94
CPC Class Codes:
- H04L 65/1089: by adding media; by removin...
- H04L 65/1093: by adding participants; by ...
- H04L 65/403: Arrangements for multi-part...
- H04L 65/4038: with floor control
- H04L 65/4046: with distributed floor control
- H04L 65/764: at the destination reforma...
- H04M 3/562: where the conference facili...
- H04M 3/564: whereby the feature is a su...
- H04M 3/568: audio processing specific t...
- H04N 5/04: Synchronising for televisio...
- H04N 7/15: Conference systems
- H04N 7/155: involving storage of or acc...
