Method and apparatus for converting video to multiple markup-language presentations
Abstract
A system is described for converting full motion video (102) into multiple markup language (HTML, WML, BHTML) presentations (140) targeted to different devices (144-152) and users. The markup language presentations (140) consist of different combinations of important visuals, audio and transcript of the full motion video (102), depending on the audio and visual/text display capability of the devices (144-152) and the user's requirements (112, 116). The important visuals are keyframes selected (118) from the video (102) based on significance measures (114, 112) associated with different segments in the video (102). The combinations of audio, visuals and transcript can be played in a synchronous or asynchronous manner. The user can control the rate at which visuals and transcript are displayed.
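As a rough illustration of the idea in the abstract, the sketch below keeps the most significant keyframes and emits a different markup document per device class. Everything in it is hypothetical: the `Keyframe` and `Device` types, the `max_images` and `supports_audio` fields, the text-only treatment of WML, and the markup templates are assumptions made for illustration and are not taken from the patent.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Keyframe:
    image_url: str       # hypothetical location of the extracted frame
    significance: float  # assumed per-segment significance score
    transcript: str      # transcript text for the segment


@dataclass
class Device:
    name: str
    markup: str          # "html" or "wml" in this sketch
    max_images: int      # assumed display capability; 0 means text only
    supports_audio: bool


def select_keyframes(frames, limit):
    """Keep the `limit` most significant frames, restored to video order."""
    ranked = sorted(range(len(frames)),
                    key=lambda i: frames[i].significance, reverse=True)
    return [frames[i] for i in sorted(ranked[:limit])]


def build_presentation(frames, device, audio_url=None):
    """Return one markup document tailored to the given device."""
    if device.markup == "wml":
        # Simplification: treat WML devices as text only and show the transcript.
        body = "".join(f"<p>{escape(f.transcript)}</p>" for f in frames)
        return f'<wml><card id="summary">{body}</card></wml>'
    parts = []
    if device.supports_audio and audio_url:
        parts.append(f'<audio controls src="{escape(audio_url)}"></audio>')
    for f in select_keyframes(frames, device.max_images):
        parts.append(
            f'<figure><img src="{escape(f.image_url)}" alt="">'
            f"<figcaption>{escape(f.transcript)}</figcaption></figure>"
        )
    return "<html><body>" + "".join(parts) + "</body></html>"
```

Calling `build_presentation` once per target device yields the set of documents; a real system would also have to handle synchronisation of audio, visuals and transcript, which this sketch leaves out.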
21 Claims
1. A method of converting a video into multiple markup language presentations for different devices and users, said method including the steps of:
creating a video database containing shot and key frame information of said video;
generating at least one of audio, visual and textual content for presentation dependent upon display capabilities of said different devices and user specified criteria, said generating step including the sub-steps of;
if said presentation is to contain visual content, determining a heuristic measure for a desired image for display on said different devices, said heuristic measure being dependent upon either a Display Dependent Significance Measure (DDSM), or said DDSM and a user supplied significance measure;
if said presentation is to contain visual content, ranking and selecting one or more images from said video to be displayed dependent upon said heuristic measure;
if said presentation is to contain audio content, extracting an audio stream from said video;
if said presentation is to contain textual content, selecting said textual content from video annotation and/or a transcript associated with said video; and
creating multiple static and/or dynamic markup language documents dependent upon said display capabilities of said different devices and said user specified criteria for different presentations on said different devices, each document containing at least a portion of said generated audio, visual and textual content catering for a presentation on a corresponding device.
Dependent claims: 2, 3, 4, 5, 6, 7
8. An apparatus for converting a video into multiple markup language presentations for different devices and users, said apparatus including:
means for creating a video database containing shot and key frame information of said video;
means for generating at least one of audio, visual and textual content for presentation dependent upon display capabilities of said different devices and user specified criteria, said generating means including;
means for, if said presentation is to contain visual content, determining a heuristic measure for a desired image for display on said different devices, said heuristic measure being dependent upon either a Display Dependent Significance Measure (DDSM), or said DDSM and a user supplied significance measure;
means for, if said presentation is to contain visual content, ranking and selecting one or more images from said video to be displayed dependent upon said heuristic measure;
means for, if said presentation is to contain audio content, extracting an audio stream from said video;
means for, if said presentation is to contain textual content, selecting said textual content from video annotation and/or a transcript associated with said video; and
means for creating multiple static and/or dynamic markup language documents dependent upon said display capabilities of said different devices and said user specified criteria for different presentations on said different devices, each document containing at least a portion of said generated audio, visual and textual content catering for a presentation on a corresponding device.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer program product having a computer readable medium having a computer program recorded therein for converting a video into multiple markup language presentations for different devices and users, said computer program product including:
means for creating a video database containing shot and key frame information of said video;
means for generating at least one of audio, visual and textual content for presentation dependent upon display capabilities of said different devices and user specified criteria, said generating means including;
means for, if said presentation is to contain visual content, determining a heuristic measure for a desired image for display on said different devices, said heuristic measure being dependent upon either a Display Dependent Significance Measure (DDSM), or said DDSM and a user supplied significance measure;
means for, if said presentation is to contain visual content, ranking and selecting one or more images from said video to be displayed dependent upon said heuristic measure;
means for, if said presentation is to contain audio content, extracting an audio stream from said video;
means for, if said presentation is to contain textual content, selecting said textual content from video annotation and/or a transcript associated with said video; and
means for creating multiple static and/or dynamic markup language documents dependent upon said display capabilities of said different devices and said user specified criteria for different presentations on said different devices, each document containing at least a portion of said generated audio, visual and textual content catering for a presentation on a corresponding device.
Dependent claims: 16, 17, 18, 19, 20, 21
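The ranking and selecting step recited in the independent claims can be pictured with a small sketch. The claims only state that the heuristic measure depends on the DDSM alone, or on the DDSM together with a user supplied significance measure; they do not say here how the two are combined or how the DDSM is computed. The weighted sum, the `user_weight` parameter and the function names below are therefore assumptions for illustration, and the DDSM values are taken as given inputs.

```python
from typing import List, Optional, Sequence


def heuristic_measure(ddsm: float,
                      user_significance: Optional[float] = None,
                      user_weight: float = 0.5) -> float:
    """Combine a DDSM value with an optional user supplied measure.

    Assumption: a weighted sum; with no user measure the DDSM is used alone.
    """
    if user_significance is None:
        return ddsm
    return (1.0 - user_weight) * ddsm + user_weight * user_significance


def rank_and_select(ddsm_scores: Sequence[float],
                    user_scores: Optional[Sequence[Optional[float]]] = None,
                    count: int = 1) -> List[int]:
    """Return indices of the `count` highest-scoring key frames, best first."""
    n = len(ddsm_scores)
    if user_scores is None:
        user_scores = [None] * n
    scores = [heuristic_measure(d, u) for d, u in zip(ddsm_scores, user_scores)]
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return ranked[:count]
```

With DDSM values alone the frames are ranked purely by display-dependent significance; passing `user_scores` lets a user's preferences change which frames are selected.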
Specification