System and method for generating a multimedia summary of multimedia streams
Abstract
A system facilitates and enhances review of one or more multimedia input streams that include some combination of video, audio and text information by generating a multimedia summary, thereby enabling a user to better browse the multimedia input streams and/or decide whether to view them in their entirety. The multimedia summary is constructed automatically, based in part on system specifications, user specifications, and network and device constraints. In a particular application of the invention, the input multimedia streams represent news broadcasts (e.g., television news programs, video vault footage). In such an application, the invention can enable the user to automatically receive a summary of the news stream in accordance with previously provided user preferences and with prevailing network and user device constraints.
27 Claims
1. A method for summarizing at least one multimedia stream (101, 102), the method comprising:
a.) one of receiving and retrieving said at least one multimedia stream (101, 102) comprising video, audio and text information;
b.) dividing the at least one multimedia stream (101, 102) into a video sub-stream (303), an audio sub-stream (305) and a text sub-stream (307);
c.) identifying video, audio and text key elements from said video (303), audio (305) and text (307) sub-streams, respectively;
d.) computing an importance value for the identified video, audio and text key elements identified at said step (c);
e.) first filtering the identified video, audio and text key elements to exclude those key elements whose associated importance value is less than a pre-defined video, audio and text importance threshold, respectively; and
f.) second filtering the remaining key elements from said step (e) in accordance with a user profile;
g.) third filtering the remaining key elements from said step (f) in accordance with network and user device constraints; and
h.) outputting a multimedia summary (120) from the key elements remaining from said step (g).
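The three filtering steps (e) through (h) of claim 1 can be sketched as a chain of list filters. This is only an illustrative sketch, not the patented implementation: the `KeyElement` fields `label` and `size_kb`, the topic-matching rule used for the user profile, and the size-budget rule used for the network and device constraints are all assumptions made for the example, since the claim does not say how a profile or a constraint is represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyElement:
    modality: str      # "video", "audio" or "text"
    label: str         # hypothetical topic tag, assumed for this sketch
    importance: float  # importance value computed in step (d)
    size_kb: int       # hypothetical delivery cost, assumed for this sketch


def first_filter(elements, thresholds):
    # Step (e): exclude elements whose importance is less than the
    # pre-defined threshold for their own modality.
    return [e for e in elements if e.importance >= thresholds[e.modality]]


def second_filter(elements, profile_topics):
    # Step (f): keep elements matching the user profile. Modelling the
    # profile as a set of topic labels is an assumption.
    return [e for e in elements if e.label in profile_topics]


def third_filter(elements, supported_modalities, budget_kb):
    # Step (g): apply device constraints (supported modalities) and a
    # network constraint (total size budget), most important first.
    kept, used = [], 0
    for e in sorted(elements, key=lambda e: e.importance, reverse=True):
        if e.modality in supported_modalities and used + e.size_kb <= budget_kb:
            kept.append(e)
            used += e.size_kb
    return kept


def summarize(elements, thresholds, profile_topics, supported_modalities, budget_kb):
    # Step (h): the summary is whatever survives all three filters.
    remaining = first_filter(elements, thresholds)
    remaining = second_filter(remaining, profile_topics)
    return third_filter(remaining, supported_modalities, budget_kb)


elements = [
    KeyElement("video", "politics", 0.9, 400),
    KeyElement("audio", "politics", 0.4, 50),
    KeyElement("text", "sports", 0.8, 2),
    KeyElement("text", "politics", 0.7, 3),
    KeyElement("audio", "politics", 0.75, 60),
]
thresholds = {"video": 0.5, "audio": 0.6, "text": 0.5}

# An audio/text-only device on a 100 KB budget, for a user who follows politics.
summary = summarize(elements, thresholds, {"politics"}, {"audio", "text"}, 100)
```

In this toy run the low-importance audio element falls out at step (e), the sports item at step (f), and the video element at step (g) because the assumed device cannot play video, leaving one audio and one text element in the summary.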
19. A system (100) for summarizing at least one multimedia stream (101, 102), comprising:
a modality recognition and division (MRAD) module (103) comprising a story segment identifier (SSI) module (103a), an audio identifier (AI) module (103b) and a text identifier (TI) module (103c),
the MRAD module (103) communicatively coupled to a first external source (110) for receiving said at least one multimedia stream (101, 102),
the MRAD module (103) communicatively coupled to a second external source (112) for receiving said at least one multimedia stream (101, 102),
the MRAD module (103) dividing said at least one multimedia stream (101, 102) into a video (303), an audio (305) and a text (307) sub-stream and outputting said video (303), audio (305) and text (307) sub-streams to a KEI module (105),
the KEI module (105) comprising a feature extraction (FE) module (107) and an importance value (IV) module (109) for identifying key elements from within said video (303), audio (305) and text (307) sub-streams and assigning importance values thereto,
the KEI module (105) communicatively coupled to a key element filter (KEF) (111) for receiving the identified key elements and filtering said key elements that exceed a pre-determined threshold criteria,
the KEF module (111) communicatively coupled to a user profile filter (UPF) (113) for receiving filtered key elements and further filtering said filtered key elements in accordance with a user profile,
the UPF module (113) communicatively coupled to a network and device constraint (NADC) module (115),
said NADC module (115) receiving said further filtered key elements and further filtering said further filtered key elements in accordance with network and/or user device constraints,
the NADC module (115) outputting a multimedia summary (120) of said at least one multimedia stream (101, 102).
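The front half of the claimed system, where the MRAD module (103) divides the stream and the KEI module (105) identifies key elements and assigns importance values, might be sketched as follows. The claim does not specify how features are extracted or scored, so the record format `(modality, payload)` and the helpers `toy_extract` and `toy_importance` are hypothetical stand-ins for the FE (107) and IV (109) modules, chosen only to make the data flow concrete.

```python
def mrad_divide(stream):
    """Divide a mixed stream into video, audio and text sub-streams.

    The stream is assumed to be an iterable of (modality, payload) records;
    a modality other than the three below raises KeyError in this sketch.
    """
    sub_streams = {"video": [], "audio": [], "text": []}
    for modality, payload in stream:
        sub_streams[modality].append(payload)
    return sub_streams


def kei_identify(sub_streams, extract, importance):
    """Identify key elements in each sub-stream and attach an importance value.

    extract(modality, payload) stands in for the FE module and returns a
    feature, or None when the payload yields no key element.
    importance(modality, feature) stands in for the IV module.
    """
    key_elements = []
    for modality, payloads in sub_streams.items():
        for payload in payloads:
            feature = extract(modality, payload)
            if feature is not None:
                key_elements.append((modality, feature, importance(modality, feature)))
    return key_elements


def toy_extract(modality, payload):
    # Hypothetical rule: a normalized, non-empty payload is a key element.
    text = payload.strip().lower()
    return text or None


def toy_importance(modality, feature):
    # Hypothetical rule: longer descriptions score higher, capped at 1.0.
    return min(1.0, len(feature.split()) / 4)


stream = [
    ("text", "Election results announced"),
    ("audio", "  "),
    ("video", "anchor face"),
    ("text", "Weather"),
]
key_elements = kei_identify(mrad_divide(stream), toy_extract, toy_importance)
```

The resulting list of `(modality, feature, importance)` tuples is what the KEF (111), UPF (113) and NADC (115) modules would then filter in turn.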
26. An article of manufacture for summarizing at least one multimedia stream (101, 102), comprising:
a computer readable medium having computer readable code means embodied thereon, said computer readable program code means comprising:
an act of one of receiving and retrieving said at least one multimedia stream (101, 102) comprising video, audio and text information;
an act of dividing said at least one multimedia stream (101, 102) into a video sub-stream (303), an audio sub-stream (305) and a text sub-stream (307);
an act of identifying video, audio and text key elements from said video (303), audio (305) and text (307) sub-streams, respectively;
an act of computing an importance value for the identified video, audio and text key elements identified at said identification act;
an act of first filtering the identified video, audio and text key elements to exclude those key elements whose associated importance value is less than a pre-defined video, audio and text importance threshold, respectively; and
an act of second filtering the remaining key elements from said first filtering act in accordance with a user profile;
an act of third filtering the remaining key elements from said second filtering act in accordance with network and user device constraints; and
an act of outputting a multimedia summary (120) from the key elements remaining from said third filtering act.
Specification