Method and apparatus for inserting variable audio delay to minimize latency in video conferencing
First Claim
1. A method for adjusting video and audio synchronization in a video conference with a near end and at least one far end, the method comprising the steps of:
an interactivity detector indicating a mode of conference;
wherein an interactive mode of conference is detected when an amplitude of an audio signal from the far end exceeds a first threshold, and
wherein a non-interactive mode of conference is detected when the amplitude of the audio signal remains above the first threshold for a period of time;
replaying the audio with a minimum delay if the mode of conference is interactive;
buffering the audio in a buffer and replaying the audio with a sync delay if the mode of conference is not interactive;
measuring a background noise level at the far end; and
adjusting the first threshold based on the background noise level.
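The detection steps recited above can be illustrated with a minimal sketch. This is not the patented implementation: the class name, the starting mode, the behaviour when the amplitude is below the threshold, and the `margin` multiplier used to derive the threshold from the noise level are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

INTERACTIVE = "interactive"
NON_INTERACTIVE = "non-interactive"


@dataclass
class InteractivityDetector:
    """Single-threshold detector loosely following claim 1 (illustrative only)."""

    threshold: float                 # the "first threshold" on audio amplitude
    hold_time: float                 # the "period of time", in seconds
    mode: str = NON_INTERACTIVE      # assumed starting mode
    _above_since: Optional[float] = field(default=None, init=False)

    def update(self, amplitude: float, now: float) -> str:
        """Feed one far-end amplitude sample taken at time `now` (seconds)."""
        if amplitude > self.threshold:
            if self._above_since is None:
                # Amplitude has just exceeded the threshold: interactive mode.
                self._above_since = now
                self.mode = INTERACTIVE
            elif now - self._above_since >= self.hold_time:
                # Amplitude stayed above the threshold for the whole period.
                self.mode = NON_INTERACTIVE
        else:
            # Assumption: below the threshold the last mode is kept and the
            # timer is reset. The claim does not specify this case.
            self._above_since = None
        return self.mode

    def adjust_threshold(self, noise_level: float, margin: float = 2.0) -> None:
        """Assumed rule: place the threshold a fixed factor above the noise."""
        self.threshold = noise_level * margin


def audio_delay(mode: str, min_delay: float, sync_delay: float) -> float:
    """Minimum delay in interactive mode, sync (lip-sync) delay otherwise."""
    return min_delay if mode == INTERACTIVE else sync_delay
```

A caller would feed `update` with periodic amplitude measurements and pass the returned mode to `audio_delay` to choose how long to hold audio before replay.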
Abstract
A method and apparatus to insert variable audio delay during video conferencing to achieve conflicting goals of lip-sync and interactive conversation. The amount of audio delay is variable according to the condition of the videoconferencing: long audio delay is inserted to achieve lip-sync between audio and video playback during monologue speech, but minimum or no audio delay is inserted during interactive discussion or argument. Variable audio playback speeds may be used instead of inserting quantum delay to achieve the same result. Various methods and apparatuses to detect the non-interactive mode or interactive modes are also disclosed.
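The remark about variable playback speed can be made concrete with a little arithmetic. If audio is replayed at a speed factor `speed` (1.0 is real time), then over `T` seconds of wall-clock time the player consumes `speed * T` seconds of audio while `T` seconds arrive, so the buffered delay changes by `(1 - speed) * T`. The sketch below solves for `T`. The function name and the example speeds are hypothetical and do not come from the patent.

```python
def ramp_duration(current_delay: float, target_delay: float, speed: float) -> float:
    """Wall-clock seconds of playback at `speed` needed to move the audio
    delay from `current_delay` to `target_delay` (all values in seconds)."""
    change = target_delay - current_delay
    if change == 0:
        return 0.0
    if speed <= 0 or speed == 1.0:
        raise ValueError("speed must be positive and different from 1.0")
    duration = change / (1.0 - speed)
    if duration < 0:
        # Slower-than-real-time playback grows the delay; faster shrinks it.
        raise ValueError("speed moves the delay in the wrong direction")
    return duration
```

For instance, under these assumptions, growing the delay by 0.2 s at 90% speed takes about 2 s of playback, and shrinking it by the same amount at 110% speed takes about the same time.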
60 Claims
1. A method for adjusting video and audio synchronization in a video conference with a near end and at least one far end, the method comprising the steps of:
an interactivity detector indicating a mode of conference;
wherein an interactive mode of conference is detected when an amplitude of an audio signal from the far end exceeds a first threshold, and
wherein a non-interactive mode of conference is detected when the amplitude of the audio signal remains above the first threshold for a period of time;
replaying the audio with a minimum delay if the mode of conference is interactive;
buffering the audio in a buffer and replaying the audio with a sync delay if the mode of conference is not interactive;
measuring a background noise level at the far end; and
adjusting the first threshold based on the background noise level.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
20. A method for adjusting video and audio synchronization in a video conference with a near end and at least one far end, the method comprising the acts of:
an interactivity detector indicating a mode of conference;
wherein when an amplitude of an audio signal from the far end increases and exceeds a first threshold, the interactivity detector indicates interactive mode; and when the amplitude remains above the first threshold for a first period of time, the interactivity detector indicates non-interactive mode; and
wherein when the amplitude of the audio signal from the far end drops from above the first threshold to below a second threshold, the interactivity detector indicates interactive mode; and when the amplitude remains below the second threshold for a second period of time, the interactivity detector indicates non-interactive mode;
replaying the audio with a minimum delay if the mode of the conference is interactive;
buffering the audio in a buffer and replaying the audio with a sync delay if the mode of conference is not interactive;
measuring a background noise level at the far end; and
adjusting the first threshold and the second threshold based on the background noise level.
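The two-threshold behaviour of claim 20 can be sketched as a small state machine. This is an illustrative reading, not the patented implementation. It assumes the second threshold lies below the first, that the detector starts in non-interactive mode during silence, and that an amplitude between the two thresholds leaves the current state and its timer unchanged. The class and attribute names are hypothetical.

```python
INTERACTIVE = "interactive"
NON_INTERACTIVE = "non-interactive"


class TwoThresholdDetector:
    """Rising and falling edge detector loosely following claim 20."""

    def __init__(self, first_threshold, second_threshold,
                 first_period, second_period):
        self.first_threshold = first_threshold    # upper threshold
        self.second_threshold = second_threshold  # lower threshold (assumed)
        self.first_period = first_period          # seconds above before monologue
        self.second_period = second_period        # seconds below before monologue
        self.mode = NON_INTERACTIVE               # assumed starting mode
        self._zone = "below"                      # assumed starting zone
        self._since = None                        # time of the last zone change

    def update(self, amplitude, now):
        """Feed one far-end amplitude sample taken at time `now` (seconds)."""
        if amplitude > self.first_threshold:
            zone = "above"
        elif amplitude < self.second_threshold:
            zone = "below"
        else:
            zone = self._zone  # between thresholds: keep the previous zone

        if zone != self._zone:
            # The far end just started or just stopped talking.
            self._zone = zone
            self._since = now
            self.mode = INTERACTIVE
        elif self._since is not None:
            period = self.first_period if zone == "above" else self.second_period
            if now - self._since >= period:
                self.mode = NON_INTERACTIVE
        return self.mode
```

Keeping a gap between the two thresholds gives the detector hysteresis, so small fluctuations around a single level do not flip the mode back and forth.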
21. A device at a near end for video conferencing with a video conference device at a far end, the device at the near end comprising:
a network interface for coupling with the conference device at the far end;
a controller unit coupled to the network, the audio speaker and the video display for controlling audio and video replay through the speaker and the video display;
a buffer for storing audio, video and control data;
wherein the controller unit is operative to replay the video with a sync delay;
wherein the controller unit is operative to replay the audio with a minimum delay if the mode of the conference is interactive; and
wherein the controller unit is operative to receive the audio from far end, store the audio in the buffer and replay the audio with the sync delay if the mode of conference is non-interactive.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39
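The "store the audio in the buffer and replay the audio with the sync delay" element can be pictured as a timestamped queue. The following is a minimal sketch under that assumption. The class name `DelayBuffer` and its methods are hypothetical, and a real device would work on audio frames driven by a hardware clock.

```python
from collections import deque


class DelayBuffer:
    """FIFO of (arrival_time, frame) pairs released after a chosen delay."""

    def __init__(self):
        self._frames = deque()

    def push(self, arrival_time, frame):
        """Store a frame received from the far end at `arrival_time` seconds."""
        self._frames.append((arrival_time, frame))

    def pop_ready(self, now, delay):
        """Return, in arrival order, every frame that has waited at least
        `delay` seconds. Pass the minimum delay in interactive mode and the
        sync delay in non-interactive mode."""
        ready = []
        while self._frames and now - self._frames[0][0] >= delay:
            ready.append(self._frames.popleft()[1])
        return ready
```

Because the delay is supplied on each call, the same buffer serves both modes: switching from the sync delay to the minimum delay releases the backlog on the next call.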
40. A device at a near end for video conferencing with a video conference device at a far end, the device at the near end comprising:
a network interface for coupling with the conference device at the far end;
a controller unit coupled to the network, the audio speaker and the video display for controlling audio and video replay through the speaker and the video display;
a buffer for storing audio, video and control data;
wherein the controller unit is operative to replay the video with a sync delay;
wherein the controller unit is operative to replay the audio with a minimum delay if the mode of the conference is interactive;
wherein the controller unit is operative to receive the audio from far end, store the audio in the buffer and replay the audio with the sync delay if the mode of conference is non-interactive;
wherein the controller comprises an interactivity detector;
wherein the interactivity detector is operative to indicate interactive mode when the amplitude of the audio signal from the far end increases and exceeds a first threshold, and to indicate non-interactive mode when the amplitude remains above the first threshold for a first period of time;
wherein the interactivity detector is operative to indicate interactive mode when the amplitude of the audio signal from the far end drops from above the first threshold to below a second threshold, and to indicate non-interactive mode when the amplitude remains below the second threshold for a second period of time; and
wherein the controller is operative to derive a background noise level at the near end and transmit a control signal indicating the background noise level.
41. A video conference system having a near end and at least one far end, the system comprising:
a microphone, a video camera, a speaker and a video display at the near end;
a network interface at the near end for coupling with the conference system at the far end;
a controller unit at the near end coupled to the network, the microphone, the video camera, the audio speaker and the video display for controlling audio and video generation and replay;
a buffer at the near end for storing audio, video and control data; and
a far end system;
wherein the controller unit is operative to replay the video with a sync delay;
wherein the controller unit is operative to replay the audio with a minimum delay if the mode of the conference at near end is interactive; and
wherein the controller unit is operative to receive the audio from far end, store the audio in the buffer and replay the audio with the sync delay if the mode of conference at near end is non-interactive.
Dependent claims: 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59
60. A method for controlling synchronization of video and audio in a video conference between a near end and at least one far end, the method comprising the acts of:
detecting a conference mode, wherein an interactive mode is detected when an amplitude of an audio signal from the far end exceeds a threshold, and wherein a non-interactive mode is detected when the amplitude of the audio signal remains above the threshold for a period of time;
replaying the audio with a minimum delay if the mode is interactive;
buffering the audio in a buffer and replaying the audio with a sync delay if the mode is not interactive;
measuring a speech level at the far end; and
adjusting the threshold based on the speech level.
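Claim 60 adjusts the detection threshold from a measured speech level rather than a noise level. The claim does not say how the level is measured or how the threshold follows it. The sketch below assumes an exponential moving average of speech amplitudes and a threshold set to a fixed fraction of that average. Both function names and the `smoothing` and `fraction` parameters are hypothetical.

```python
def update_speech_level(level: float, amplitude: float,
                        smoothing: float = 0.1) -> float:
    """Exponential moving average of far-end speech amplitude (assumed
    measurement method). `smoothing` is the weight of the newest sample."""
    return (1.0 - smoothing) * level + smoothing * amplitude


def threshold_from_speech_level(level: float, fraction: float = 0.5) -> float:
    """Assumed rule: the threshold is a fixed fraction of the speech level."""
    return fraction * level
```

With this rule a quiet talker lowers the threshold and a loud talker raises it, so the same detector logic applies across far-end sites with different microphone gains.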
Specification