System and method for annotating multi-modal characteristics in multimedia documents
First Claim
1. An apparatus for managing multimedia content, said apparatus comprising:
a processor;
an arrangement for supplying multimedia content;
an input interface for permitting a selection, for observation, of at least one of the following modes associated with the multimedia content;
an audio portion that includes video; and
a video portion that includes audio; and
an arrangement for labeling observations of a selected mode;
wherein said arrangement for labeling observations of a selected mode comprises;
an arrangement for assigning semantic, multimedia content-based labels to segments of said observations of a selected mode;
wherein said arrangement for assigning semantic, multimedia content-based labels is configured to;
provide a label from a predefined set of multimedia content descriptors; and
assign a new label not present in said predefined set of multimedia content descriptors; and
an arrangement permitting a user to label audio while viewing video having;
a check box for labeling foreground sounds;
a check box for labeling background sounds; and
a keyword text box arrangement; and
an arrangement for storing said semantic, multimedia content-based labels with the multimedia content.
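The labeling arrangement recited in claim 1 can be pictured as a small data model. What follows is a minimal sketch and not the patented implementation: the names `Mode`, `SegmentLabel`, `AnnotatedContent`, and `annotate`, the contents of `PREDEFINED_LABELS`, and the use of second-based segment boundaries are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Sequence, Set


class Mode(Enum):
    # The two observation modes recited in the claim.
    AUDIO_WITH_VIDEO = "audio portion that includes video"
    VIDEO_WITH_AUDIO = "video portion that includes audio"


# Hypothetical predefined descriptor set; the claim does not list its contents.
PREDEFINED_LABELS = frozenset({"speech", "music", "outdoors"})


@dataclass
class SegmentLabel:
    start_s: float            # segment start, in seconds (assumed unit)
    end_s: float              # segment end, in seconds (assumed unit)
    mode: Mode                # observation mode selected by the user
    label: str                # semantic, content-based label
    is_new: bool              # True if the label was not in the vocabulary
    foreground: bool = False  # state of the "foreground sounds" check box
    background: bool = False  # state of the "background sounds" check box
    keywords: List[str] = field(default_factory=list)  # keyword text box


@dataclass
class AnnotatedContent:
    content_id: str
    labels: List[SegmentLabel] = field(default_factory=list)
    vocabulary: Set[str] = field(default_factory=lambda: set(PREDEFINED_LABELS))

    def annotate(self, start_s: float, end_s: float, mode: Mode, label: str,
                 foreground: bool = False, background: bool = False,
                 keywords: Sequence[str] = ()) -> SegmentLabel:
        """Label one segment, accepting either a known or a new label."""
        if end_s <= start_s:
            raise ValueError("segment end must come after segment start")
        is_new = label not in self.vocabulary
        self.vocabulary.add(label)  # a new label becomes available for reuse
        entry = SegmentLabel(start_s, end_s, mode, label, is_new,
                             foreground, background, list(keywords))
        self.labels.append(entry)   # labels are kept alongside the content
        return entry
```

In this sketch, calling `annotate` with a label already in the vocabulary corresponds to providing a label from the predefined set, while any other string is recorded as a new label and added to the vocabulary. Keeping the labels in the same object as the content identifier stands in for storing the labels with the multimedia content.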
Abstract
A system for manually annotating multi-modal characteristics in multimedia files. There is provided an arrangement for selecting an observation modality (video with audio, video without audio, audio with video, or audio without video) to be used to annotate multimedia content. While annotating video or audio features in isolation results in less confidence in the identification of features, observing both audio and video simultaneously and annotating that observation results in a higher confidence level.
10 Claims
10. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for managing multimedia content, said method comprising the steps of:
supplying multimedia content;
permitting a selection, for observation, of at least one of the following modes associated with the multimedia content;
an audio portion that includes video; and
a video portion that includes audio;
labeling observations of a selected mode; and
assigning semantic, multimedia content-based labels to segments of said observations of a selected mode;
wherein said assigning semantic, multimedia content-based labels comprises performing;
selecting a label from a predefined set of multimedia content descriptors; and
assigning a new label not present in said pre-defined set of multimedia content descriptors;
labeling audio while viewing video via;
one or more check boxes configured for labeling foreground and background sounds, and a keyword text box; and
storing said semantic, multimedia content-based labels with the multimedia content.
Specification