System and process for adding high frame-rate current speaker data to a low frame-rate video using delta frames
Abstract
A system and process are presented for highlighting the current speaker, on an ongoing basis, in each frame of a low frame-rate video of an event attended by multiple people. In general, this is accomplished by periodically identifying which attendee is currently speaking, at a rate substantially faster than the video frame rate, and updating each frame of the video to highlight that speaker. More particularly, an A/V source provides a client computing device with a video stream in which delta frames are interspersed between the frames of the low frame-rate video. The full video frames act as keyframes, and each delta frame provides the changes needed to modify the last displayed version of the last keyframe so that just the region associated with the location of the current speaker is highlighted. This allows the client device to operate as a standard A/V rendering and display unit.
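The delta-frame idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the patent's implementation: frames are small grayscale grids, highlighting means brightening a rectangular region by a hypothetical `boost`, and a delta frame is a sparse map of the pixels that must change on the currently displayed frame.

```python
from typing import Dict, List, Tuple

Frame = List[List[int]]             # grayscale pixels, 0..255 (assumed format)
Region = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive
Delta = Dict[Tuple[int, int], int]  # (row, col) -> new pixel value


def highlight(keyframe: Frame, region: Region, boost: int = 80) -> Frame:
    """Return a copy of the keyframe with the speaker's region brightened."""
    top, left, bottom, right = region
    out = [row[:] for row in keyframe]
    for r in range(top, bottom):
        for c in range(left, right):
            out[r][c] = min(255, out[r][c] + boost)
    return out


def make_delta(displayed: Frame, keyframe: Frame, region: Region) -> Delta:
    """Collect only the pixels that must change on the displayed frame."""
    target = highlight(keyframe, region)
    return {
        (r, c): target[r][c]
        for r in range(len(target))
        for c in range(len(target[r]))
        if target[r][c] != displayed[r][c]
    }


def apply_delta(displayed: Frame, delta: Delta) -> Frame:
    """Apply a delta frame to the displayed frame, as a client would."""
    out = [row[:] for row in displayed]
    for (r, c), value in delta.items():
        out[r][c] = value
    return out


# One keyframe, then two speaker changes before the next keyframe arrives.
keyframe = [[100] * 6 for _ in range(2)]
first = make_delta(keyframe, keyframe, (0, 0, 2, 2))   # speaker on the left
shown = apply_delta(keyframe, first)
second = make_delta(shown, keyframe, (0, 4, 2, 6))     # speaker on the right
shown = apply_delta(shown, second)
```

Because each delta is computed against what is currently displayed, the second delta both restores the previous speaker's region and highlights the new one, so the client only ever applies pixel changes.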
20 Claims
1. A computer-implemented process for highlighting a current speaker in each frame of a low frame-rate video at a rate significantly faster than the video frame rate, comprising using a computer to perform the following process actions:
obtaining audio and video of an event having multiple people in attendance;
tracking the movements of the attendees and recording their positions when each frame of the video is obtained and their subsequent positions until the next video frame is obtained;
periodically identifying, at a rate significantly faster than the prescribed video frame rate, which of the attendees is currently speaking;
generating a data stream of video frames from the obtained video of the event, comprising keyframes generated at a prescribed frame rate and delta frames, one or more of which are generated between the generation of each pair of consecutive keyframes, wherein each delta frame comprises just those changes needed to the last-generated keyframe, as it would appear if all previously generated delta frames, if any, applicable to that keyframe were applied thereto, which highlight a region in that keyframe associated with the location of a current speaker as depicted in the last-generated keyframe in a way that visually distinguishes that attendee from all other currently non-speaking attendees also depicted in the last-generated keyframe; and
generating an audio data stream from the obtained audio of the event.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
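The timing relationship in claim 1 (keyframes at a prescribed rate, speaker identification at a faster rate, and one or more delta frames between consecutive keyframes) might be sketched as a simple schedule. The function name `frame_schedule` and the millisecond intervals are hypothetical; the claim does not specify any particular rates.

```python
from typing import List, Tuple


def frame_schedule(duration_ms: int, keyframe_ms: int, detect_ms: int) -> List[Tuple[int, str]]:
    """List (timestamp_ms, kind) events for a stream of the given length.

    keyframe_ms is the assumed interval between keyframes and detect_ms the
    assumed interval between current-speaker checks; both are illustrative.
    """
    if detect_ms >= keyframe_ms:
        raise ValueError("speaker detection must run faster than the keyframe rate")
    key_times = set(range(0, duration_ms, keyframe_ms))
    detect_times = set(range(0, duration_ms, detect_ms))
    # A speaker check that coincides with a keyframe needs no separate delta frame.
    return [
        (t, "key" if t in key_times else "delta")
        for t in sorted(key_times | detect_times)
    ]


# Assumed example: one keyframe per second, speaker checked four times per second.
schedule = frame_schedule(duration_ms=2000, keyframe_ms=1000, detect_ms=250)
```

With these assumed rates, three delta frames fall between each pair of consecutive keyframes.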
11. A system for highlighting a current speaker in each frame of a low frame-rate video at a rate significantly faster than the video frame rate, comprising:
a general purpose computing device;
at least one video camera;
at least one microphone; and
a computer program comprising program modules executable by the computing device, comprising:
a video stream creation module which generates a stream of keyframes at a prescribed frame rate using a video signal output from each video camera;
an audio stream creation module which generates a continuous stream of audio data using an audio signal output from each microphone;
a current speaker detection module which periodically identifies the current speaker among the persons depicted in each keyframe of the video stream at a rate substantially faster than the keyframe generation rate, and tracks the movements of the persons depicted in each keyframe between the generation of said keyframes so as to equate their current location with their original location when the keyframe was generated; and
a delta frame generation module which generates one or more delta frames between the generation of each pair of consecutive keyframes, wherein each delta frame comprises just those changes needed to the last-generated keyframe as it would appear if all previously generated delta frames, if any, applicable to that keyframe were applied thereto, which highlight a region in that keyframe associated with the location of a current speaker as depicted in the last-generated keyframe in a way that visually distinguishes that attendee from all other currently non-speaking attendees also depicted in the last-generated keyframe.
Dependent claims: 12, 13, 14, 15, 16
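The tracking requirement in claim 11, equating a person's current location with their original location in the last keyframe, could be sketched as follows. The class `AttendeeTracker`, its method names, and the nearest-neighbour matching are all assumptions for illustration; the claim does not say how the speaker is matched to a tracked person.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Point = Tuple[float, float]


@dataclass
class Track:
    keyframe_pos: Point  # where the person appears in the last keyframe
    current_pos: Point   # most recent tracked position since that keyframe


class AttendeeTracker:
    """Hypothetical tracker mapping live positions back to keyframe positions."""

    def __init__(self) -> None:
        self.tracks: Dict[str, Track] = {}

    def on_keyframe(self, positions: Dict[str, Point]) -> None:
        # A new keyframe resets every person's reference position.
        self.tracks = {pid: Track(pos, pos) for pid, pos in positions.items()}

    def on_motion(self, pid: str, pos: Point) -> None:
        # Movement between keyframes updates only the current position.
        self.tracks[pid].current_pos = pos

    def keyframe_location_of(self, speaker_pos: Point) -> Tuple[str, Point]:
        """Match a detected speaker position to the nearest tracked person
        (an assumed strategy) and return where that person sits in the keyframe."""
        def squared_distance(pid: str) -> float:
            x, y = self.tracks[pid].current_pos
            return (x - speaker_pos[0]) ** 2 + (y - speaker_pos[1]) ** 2

        pid = min(self.tracks, key=squared_distance)
        return pid, self.tracks[pid].keyframe_pos


tracker = AttendeeTracker()
tracker.on_keyframe({"a": (10.0, 10.0), "b": (50.0, 10.0)})
tracker.on_motion("a", (40.0, 10.0))  # "a" walks toward "b" after the keyframe
speaker_id, highlight_at = tracker.keyframe_location_of((38.0, 10.0))
```

In this example the speaker is detected near where "b" sits in the keyframe, yet the highlight is placed at the keyframe position of "a", because the displayed image still shows "a" at that original location.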
17. A computer-implemented process for highlighting the current speaker in each frame of a low frame-rate video of an event having multiple people in attendance, comprising using a computer to perform the following process actions:
obtaining the low frame-rate video of the event, which comprises keyframes generated at a prescribed frame rate and delta frames, one or more of which are generated between the generation of each pair of consecutive keyframes, wherein each delta frame comprises just those changes needed to the last-generated keyframe, as it would appear if all previously generated delta frames, if any, applicable to that keyframe were applied thereto, which highlight a region in that keyframe associated with the location of a current speaker as depicted in the last-generated keyframe in a way that visually distinguishes that attendee from all other currently non-speaking attendees also depicted in the last-generated keyframe;
obtaining a continuous audio stream of the event;
synchronizing the audio and video streams; and
rendering and displaying the video while playing the audio.
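On the receiving side, the steps of claim 17 might look like the sketch below. It assumes, purely for illustration, that every video frame carries a millisecond timestamp, that the stream is sorted by timestamp, and that synchronization means showing whatever the stream implies at the audio playback position; `render_at` is a hypothetical name.

```python
from typing import Dict, List, Tuple

Pixels = Dict[Tuple[int, int], int]   # (row, col) -> pixel value
VideoFrame = Tuple[int, str, Pixels]  # (timestamp_ms, "key" or "delta", pixels)


def render_at(stream: List[VideoFrame], audio_ms: int) -> Pixels:
    """Return the image to display when the audio clock reads audio_ms.

    Assumes the stream is sorted by timestamp. A keyframe replaces the
    displayed image; a delta frame overwrites only the pixels it lists.
    """
    displayed: Pixels = {}
    for timestamp_ms, kind, pixels in stream:
        if timestamp_ms > audio_ms:
            break  # frames scheduled after the audio position are not shown yet
        if kind == "key":
            displayed = dict(pixels)
        else:
            displayed.update(pixels)
    return displayed


stream: List[VideoFrame] = [
    (0, "key", {(0, 0): 100, (0, 1): 100}),
    (250, "delta", {(0, 0): 180}),                 # highlight left pixel
    (500, "delta", {(0, 0): 100, (0, 1): 180}),    # move highlight to right pixel
    (1000, "key", {(0, 0): 90, (0, 1): 90}),
]
image = render_at(stream, audio_ms=600)
```

The client needs no knowledge of speakers or regions here; it only replaces or patches pixels, which is consistent with the abstract's point that it can operate as a standard A/V rendering and display unit.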
18. A system for highlighting the current speaker in each frame of a low frame-rate video stream of an event having multiple people in attendance, comprising:
a general purpose computing device;
a computer program comprising program modules executable by the computing device, comprising:
a video input module which obtains the low frame-rate video stream, said video stream comprising keyframes generated at a prescribed frame rate and delta frames, one or more of which are generated between the generation of each pair of consecutive keyframes, wherein each delta frame comprises just those changes needed to the last-generated keyframe, as it would appear if all previously generated delta frames, if any, applicable to that keyframe were applied thereto, which highlight a region in that keyframe associated with the location of a current speaker as depicted in the last-generated keyframe in a way that visually distinguishes that attendee from all other currently non-speaking attendees also depicted in the last-generated keyframe;
an audio input module which obtains a continuous audio stream of the event;
a synchronizer module which synchronizes the audio and video streams; and
a rendering and display module which renders and displays the video while playing the audio.
Dependent claims: 19, 20
Specification