Lip synchronization for audio/video transmissions over a network
First Claim
1. A method comprising:
- receiving, by a video mixer, information from an audio mixer that includes delay values respecting input audio streams received by the audio mixer from a plurality of source endpoints, and output audio streams sent without timestamps from the audio mixer to a plurality of destination endpoints that do not extract Real-Time Transport Control Protocol (RTCP) packets, each input and output audio stream being respectively associated with a corresponding input and output video stream, the video mixer and the audio mixer being physically located at different places on a communication network;
calculating, from the information received, an output delay value for each corresponding output video stream to ensure that source endpoint-to-destination endpoint audio and video delays are substantially equal;
buffering, by the video mixer, each of the corresponding input and output video streams; and
adjusting the buffering of each of the corresponding input and output video streams so as to delay each corresponding output video stream by the output delay value.
Abstract
In one embodiment, a system includes a video mixer coupled with an audio mixer for exchange of information that includes a first set of delay values respecting input audio streams received by the audio mixer from a plurality of source endpoints, and output audio streams sent from the audio mixer to a plurality of destination endpoints. The information further includes a second set of delay values respecting the corresponding input video streams. The audio mixer calculates end-to-end video delays, and the video mixer calculates end-to-end audio delays. If the end-to-end audio delays are less than the end-to-end video delays, the audio mixer delays the output audio streams to equalize the two; if the end-to-end video delays are less than the end-to-end audio delays, the video mixer delays the output video streams instead.
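The equalization rule described in the abstract can be sketched in Python. This is an illustrative sketch only, not the patented implementation: the function name `equalize`, the use of milliseconds, and the example figures are all assumptions introduced here.

```python
from typing import Tuple


def equalize(audio_e2e_ms: float, video_e2e_ms: float) -> Tuple[float, float]:
    """Return (extra_audio_delay_ms, extra_video_delay_ms).

    Hypothetical helper: the faster path is delayed by the difference so
    that both end-to-end delays become equal. The slower path gets no
    extra delay.
    """
    # Audio is faster when its end-to-end delay is smaller: the audio
    # mixer would add this much delay to its output stream.
    extra_audio = max(video_e2e_ms - audio_e2e_ms, 0.0)
    # Video is faster when its end-to-end delay is smaller: the video
    # mixer would add this much delay to its output stream.
    extra_video = max(audio_e2e_ms - video_e2e_ms, 0.0)
    return extra_audio, extra_video


# Assumed example: audio takes 80 ms end to end, video takes 140 ms.
extra_audio, extra_video = equalize(80.0, 140.0)
print(extra_audio, extra_video)  # 60.0 0.0
```

At most one of the two returned values is non-zero, which mirrors the abstract's split of responsibility between the two mixers.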
91 Citations
9 Claims
5. A method comprising:
- receiving, by an audio mixer, information from a video mixer that includes delay values respecting input video streams received by the video mixer from a plurality of source endpoints, and output video streams sent without timestamps from the video mixer to a plurality of destination endpoints that do not extract Real-Time Transport Control Protocol (RTCP) packets, each input and output video stream being respectively associated with a corresponding input and output audio stream, the video mixer and the audio mixer being physically located at different places on a communication network, the delay values including, for each of the input video streams, a first delay from a source endpoint to a summation unit of the video mixer, and for each of the output video streams, a second delay from the summation unit to a destination endpoint;
calculating, from the information received, an output delay value for each corresponding output audio stream to ensure that source endpoint-to-destination endpoint audio and video delays are substantially equal;
buffering, by the audio mixer, each of the corresponding input and output audio streams; and
adjusting the buffering of each of the corresponding input and output audio streams so as to delay each corresponding output audio stream by the output delay value.
8. A computer readable memory encoded with a computer program product, when executed the computer program product being operable to:
- during a video conference session, receive information from an audio mixer on a communication network, the information including delay values respecting input audio streams received by the audio mixer from a plurality of source endpoints, and output audio streams sent without timestamps from the audio mixer to a plurality of destination endpoints that do not extract Real-Time Transport Control Protocol (RTCP) packets, each input and output audio stream being respectively associated with a corresponding input and output video stream, the delay values including, for each of the input audio streams, a first delay from a source endpoint to a summation unit of the audio mixer, and for each of the output audio streams, a second delay from the summation unit to a destination endpoint;
calculate, from the information received, an output delay value for each corresponding output video stream to ensure that source endpoint-to-destination endpoint audio and video delays are substantially equal;
buffer, by a video mixer, each of the corresponding input and output video streams; and
adjust the buffering so as to delay each corresponding output video stream by the output delay value.
9. A computer readable memory encoded with a computer program product, when executed the computer program product being operable to:
- during a video conference session, receive information from a video mixer physically located at a different place on a communication network, the information including delay values respecting input video streams received by the video mixer from a plurality of source endpoints, and output video streams sent without timestamps from the video mixer to a plurality of destination endpoints that do not extract Real-Time Transport Control Protocol (RTCP) packets, each input and output video stream being respectively associated with a corresponding input and output audio stream, the delay values including, for each of the input video streams, a first delay from a source endpoint to a switch of the video mixer, and for each of the output video streams, a second delay from the switch to a destination endpoint;
calculate, from the information received, an output delay value for each corresponding output audio stream to ensure that source endpoint-to-destination endpoint audio and video delays are substantially equal;
buffer, by an audio mixer, each of the corresponding input and output audio streams; and
adjust the buffering so as to delay each corresponding output audio stream by the output delay value.
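The buffering and adjusting steps recited in the claims above can be pictured with a small delay buffer. The sketch below is a hedged illustration, not the claimed implementation: the class `DelayBuffer`, its method names, the string frames, and all timing figures are hypothetical and chosen only for the example.

```python
from collections import deque
from typing import Deque, List, Tuple


class DelayBuffer:
    """Hypothetical FIFO that holds each frame for a configurable delay."""

    def __init__(self, delay_ms: float) -> None:
        self.delay_ms = delay_ms
        self._frames: Deque[Tuple[float, str]] = deque()

    def set_delay(self, delay_ms: float) -> None:
        # "Adjusting the buffering": change how long frames are held.
        self.delay_ms = delay_ms

    def push(self, arrival_ms: float, frame: str) -> None:
        # Frames are stored with their arrival time, in arrival order.
        self._frames.append((arrival_ms, frame))

    def pop_ready(self, now_ms: float) -> List[str]:
        # Release every frame that has been held for at least delay_ms.
        ready: List[str] = []
        while self._frames and now_ms - self._frames[0][0] >= self.delay_ms:
            ready.append(self._frames.popleft()[1])
        return ready


# Assumed figures: the reported audio path has a first delay of 30 ms and a
# second delay of 50 ms; the video path is assumed to take 60 ms in total.
audio_e2e_ms = 30.0 + 50.0
video_e2e_ms = 60.0
output_delay_ms = max(audio_e2e_ms - video_e2e_ms, 0.0)  # 20.0

buf = DelayBuffer(delay_ms=0.0)
buf.set_delay(output_delay_ms)
buf.push(0.0, "frame0")
buf.push(10.0, "frame1")
released_at_15 = buf.pop_ready(15.0)  # nothing held for 20 ms yet
released_at_20 = buf.pop_ready(20.0)  # frame0 has been held for 20 ms
released_at_30 = buf.pop_ready(30.0)  # frame1 has been held for 20 ms
```

The same structure would apply with the roles reversed, where an audio mixer holds audio frames based on delay values reported by a video mixer.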
Specification