Optimized video snapshot
First Claim
1. A method for presenting an aesthetic image, said method comprising:
receiving at an audio analysis tool, a set of audio and video streams for at least one of a plurality of users in a video conference, said audio and video streams being synchronized with each other;
analyzing said audio track of said one of a plurality of users in said video conference to determine when said user is an active speaker;
when said user is an active speaker, analyzing a speech signal of the audio track to identify aesthetic phonemes of said active speaker, wherein the aesthetic phonemes comprise phonemes that, when spoken by said active speaker, produce aesthetically pleasing face expressions; and
extracting an optimal image from said video stream of said active speaker corresponding to one of said aesthetic phonemes, said optimal image comprising a frame from said video stream.
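The final two steps of the claim (identifying aesthetic phonemes and extracting a matching frame) can be illustrated with a minimal sketch. It assumes a speech recognizer has already produced timed phoneme segments on the same clock as the video. The names `PhonemeSegment`, `pick_snapshot_frame` and the contents of `AESTHETIC_PHONEMES` are hypothetical placeholders; the claim does not enumerate which phonemes qualify, and choosing the midpoint of the longest matching segment is an illustrative heuristic rather than the patented selection rule.

```python
from dataclasses import dataclass

# Placeholder set (an assumption): the claim does not list the aesthetic phonemes.
AESTHETIC_PHONEMES = {"IY", "EY"}


@dataclass
class PhonemeSegment:
    label: str      # phoneme label from an assumed speech recognizer
    start_s: float  # start time in seconds on the shared audio/video clock
    end_s: float    # end time in seconds on the same clock


def pick_snapshot_frame(segments, fps, aesthetic=AESTHETIC_PHONEMES):
    """Return the index of the video frame at the midpoint of the longest
    aesthetic phoneme segment, or None if no such segment exists."""
    candidates = [s for s in segments if s.label in aesthetic]
    if not candidates:
        return None
    # Heuristic (an assumption): a longer segment gives a more stable expression.
    best = max(candidates, key=lambda s: s.end_s - s.start_s)
    midpoint = (best.start_s + best.end_s) / 2.0
    # Synchronized streams let a timestamp map directly to a frame index.
    return int(midpoint * fps)
```

Because the audio and video streams are synchronized, a timestamp taken from the audio analysis is enough to locate the frame; no separate alignment step appears in this sketch.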
Abstract
Methods, media and devices for generating an optimized image snapshot from a captured sequence of persons participating in a meeting are provided. In some embodiments, methods, media and devices are provided for utilizing a captured image as a representative image of a person: as a replacement of a video stream; as a representation of a person in offline archiving systems; or as a representation of a person in a system participant roster.
16 Claims
9. A system for presenting an aesthetic image, said system comprising:
an audio analysis tool enabled to receive a set of audio and video streams for at least one of a plurality of users in a video conference, said audio and video streams being synchronized with each other,
analyze said audio track of said one of a plurality of users in said video conference to determine when said one of a plurality of users is an active speaker,
analyze a speech signal of the audio track to identify aesthetic phonemes of said active speaker, wherein the aesthetic phonemes comprise phonemes that, when spoken by said active speaker, produce aesthetically pleasing face expressions, and
extract an optimal image from said video stream of said active speaker corresponding to one of said aesthetic phonemes, said optimal image comprising a frame from said video stream.
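As an illustration of how an audio analysis tool might determine the active speaker, the following sketch compares the short-time RMS level of each user's current audio window. This is a common energy-based approach, not one stated in the claims; the function names and the `threshold` value are assumptions chosen for the example.

```python
import math


def rms(samples):
    """Root-mean-square level of one window of PCM samples (floats in [-1, 1])."""
    if not samples:
        return 0.0
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def active_speaker(windows_by_user, threshold=0.05):
    """Return the user whose current audio window is loudest,
    or None if every user is below the (assumed) silence threshold."""
    levels = {user: rms(window) for user, window in windows_by_user.items()}
    if not levels:
        return None
    loudest = max(levels, key=levels.get)
    return loudest if levels[loudest] >= threshold else None
```

A practical tool would likely smooth this decision over several windows to avoid flicker between speakers; that refinement is omitted here for brevity.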
Specification