Systems and methods for smart media content thumbnail extraction
Abstract
Systems and methods for smart media content thumbnail extraction are described. In one aspect, program metadata is generated from recorded video content. The program metadata includes one or more key-frames from one or more corresponding shots. An objectively representative key-frame is identified from among the key-frames as a function of shot duration and frequency of appearance of key-frame content across multiple shots. The objectively representative key-frame is an image frame representative of the recorded video content. A thumbnail is created from the objectively representative key-frame.
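As a rough illustration of the selection step described in the abstract, the sketch below scores each key-frame by its shot's duration and by how many shots share its content, then picks the highest score. The `Shot` type, the `cluster_id` field (a stand-in for "frequency of appearance of key-frame content across multiple shots"), and the product used to combine the two factors are assumptions made for illustration only; the text here does not say how the factors are combined.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    key_frame: str     # hypothetical identifier of the shot's key-frame image
    duration_s: float  # shot duration in seconds
    cluster_id: int    # hypothetical label grouping shots with similar content


def pick_representative(shots):
    """Return the key-frame of the shot with the highest duration x frequency score."""
    if not shots:
        raise ValueError("at least one shot is required")
    # Frequency of appearance: number of shots that share the same content label.
    frequency = Counter(s.cluster_id for s in shots)

    def score(shot):
        # Assumed combination: a simple product of the two factors.
        return shot.duration_s * frequency[shot.cluster_id]

    return max(shots, key=score).key_frame


shots = [
    Shot("kf_a.jpg", 4.0, 0),
    Shot("kf_b.jpg", 9.0, 1),
    Shot("kf_c.jpg", 6.0, 0),
    Shot("kf_d.jpg", 5.0, 0),
]
representative = pick_representative(shots)
```

In this example the longest single shot (`kf_b.jpg`) loses to `kf_c.jpg`, because the content of `kf_c.jpg` recurs across three shots.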
22 Claims
1. A computer-readable medium comprising computer-program instructions executable by a processor for:
- generating program metadata from recorded video content, the program metadata comprising one or more key-frames from one or more corresponding shots;
- identifying an objectively representative key-frame from the key-frames as a function of shot duration and frequency of appearance of key-frame content across multiple shots, the objectively representative key-frame being an image frame representative of the recorded video content;
- clustering the shots based on appearance representation to generate one or more clusters; and
- if a key-frame associated with a largest cluster of clusters includes a human face, then:
  - comparing the key-frame with other key-frames that include a human face; and
  - selecting a key-frame with a largest facial area as the objectively representative key-frame; and
- creating a thumbnail from the objectively representative key-frame;
- wherein identifying further comprises selecting the objectively representative key-frame as a further function of low motion intensity, wherein motion intensity is defined as: [equation not reproduced in the source]
Dependent claims: 2, 3, 4, 5

6. A method comprising:
- employing a processor that executes instructions retained in a computer-readable medium, the instructions when executed by the processor implement at least the following operations:
- generating program metadata from recorded video content, the program metadata comprising one or more key-frames from one or more corresponding shots;
- identifying an objectively representative key-frame from the key-frames as a function of shot duration and frequency of appearance of key-frame content across multiple shots, the objectively representative key-frame being an image frame representative of the recorded video content; and
- creating a thumbnail from the objectively representative key-frame, wherein identifying further comprises:
  - clustering the shots based on appearance representation to generate one or more clusters; and
  - if a key-frame associated with a largest cluster of clusters includes a human face, then:
    - comparing the key-frame with other key-frames with a human face; and
    - selecting a key-frame with a largest facial area as the objectively representative key-frame;
- for each shot of the shots:
  - (a) determining whether the shot is of short or long duration relative to other ones of the shots;
  - (b) evaluating whether a key-frame of the shot is of objective high image quality;
  - (c) detecting whether the shot represents commercial content;
- in view of the determining, evaluating, and detecting, removing shot(s) of short duration, objectively low image quality, or that include commercial content from the program metadata.
Dependent claims: 7, 8, 9, 10, 11, 15

12. A computing device comprising:
- a processor; and
- a memory coupled to the processor, the memory comprising computer-program instructions executable by the processor for:
  - generating program metadata from recorded video content, the program metadata comprising one or more key-frames from one or more corresponding shots;
  - identifying an objectively representative key-frame from the key-frames as a function of shot duration and frequency of appearance of key-frame content across multiple shots, the objectively representative key-frame being an image frame representative of the recorded video content; and
  - creating a thumbnail from the objectively representative key-frame, wherein identifying further comprises:
    - clustering the shots based on appearance representation to generate one or more clusters; and
    - if a key-frame associated with a largest cluster of clusters includes a human face, then:
      - comparing the key-frame with other key-frames that include a human face; and
      - selecting a key-frame with a largest facial area as the objectively representative key-frame.
Dependent claims: 13, 14, 16, 17, 18

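The cluster-then-face rule recited in the independent claims can be sketched roughly as follows. This is a minimal illustration under several assumptions that the claim language leaves open: cluster labels and face areas are taken as precomputed inputs, the "key-frame associated with a largest cluster" is assumed to be the key-frame of the longest shot in that cluster, and the comparison is assumed to run over all key-frames that contain a face. The `KeyFrame` type and all of its field names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyFrame:
    name: str
    cluster_id: int    # assumed output of an appearance-based clustering step
    duration_s: float  # duration of the shot this key-frame came from
    face_area: float   # assumed fraction of the frame covered by faces; 0.0 = no face


def select_key_frame(frames):
    """Pick a representative key-frame using the largest-cluster and face-area rule."""
    if not frames:
        raise ValueError("at least one key-frame is required")
    clusters = defaultdict(list)
    for frame in frames:
        clusters[frame.cluster_id].append(frame)
    # Largest cluster = the content that appears in the most shots.
    largest = max(clusters.values(), key=len)
    # Assumption: the cluster's candidate is the key-frame of its longest shot.
    candidate = max(largest, key=lambda f: f.duration_s)
    if candidate.face_area > 0.0:
        # Compare against every key-frame that contains a face; keep the largest face.
        with_faces = [f for f in frames if f.face_area > 0.0]
        return max(with_faces, key=lambda f: f.face_area)
    return candidate


frames = [
    KeyFrame("a", 0, 5.0, 0.10),
    KeyFrame("b", 0, 3.0, 0.0),
    KeyFrame("c", 1, 8.0, 0.25),
    KeyFrame("d", 0, 2.0, 0.05),
]
chosen = select_key_frame(frames)
```

Here cluster 0 is the largest and its candidate `a` contains a face, so the face comparison runs and `c`, which has the largest facial area, is chosen. If the candidate had no face, it would be returned unchanged.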
19. A computing device comprising:
- generating means to generate program metadata from recorded video content, the program metadata comprising one or more key-frames from one or more corresponding shots; and
- identifying means to identify an objectively representative key-frame from the key-frames as a function of shot duration and frequency of appearance of key-frame content across multiple shots, the objectively representative key-frame being an image frame representative of the recorded video content, wherein the identifying means further comprises:
  - clustering means to cluster the shots based on appearance representation to generate one or more clusters; and
  - selecting means to select a key-frame with a largest facial area as the objectively representative key-frame if a key-frame associated with a largest cluster of clusters includes a human face;
- for each shot of the shots:
  - (a) determining whether the shot is of short or long duration relative to other ones of the shots;
  - (b) evaluating whether a key-frame of the shot is of objective high image quality;
  - (c) detecting whether the shot represents commercial content;
- in view of the determining, evaluating, and detecting, removing shot(s) of short duration, objectively low image quality, or that include commercial content from the program metadata.
Dependent claims: 20, 21, 22

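The per-shot filtering in steps (a) through (c) might look roughly like the sketch below. The thresholds and the way each test is computed are assumptions for illustration: "short relative to other shots" is taken to mean shorter than a fraction of the median duration, image quality is a hypothetical precomputed score in [0, 1], and commercial detection is a hypothetical precomputed flag. None of these specifics come from the claims.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Shot:
    name: str
    duration_s: float
    quality: float       # assumed image-quality score of the key-frame, in [0, 1]
    is_commercial: bool  # assumed output of a separate commercial detector


def filter_shots(shots, short_ratio=0.5, min_quality=0.4):
    """Drop shots that are short, of low image quality, or commercial content."""
    if not shots:
        return []
    # (a) "Short" is judged relative to the other shots via the median duration.
    cutoff = short_ratio * median(s.duration_s for s in shots)
    return [
        s
        for s in shots
        if s.duration_s >= cutoff       # (a) keep shots of sufficient duration
        and s.quality >= min_quality    # (b) keep key-frames of acceptable quality
        and not s.is_commercial         # (c) drop commercial content
    ]


shots = [
    Shot("intro", 1.0, 0.9, False),
    Shot("anchor", 8.0, 0.8, False),
    Shot("ad", 6.0, 0.9, True),
    Shot("blurry", 7.0, 0.2, False),
    Shot("field", 5.0, 0.7, False),
]
kept = [s.name for s in filter_shots(shots)]
```

With a median duration of 6.0 seconds the cutoff is 3.0 seconds, so `intro` is removed as short, `ad` as commercial, and `blurry` as low quality, leaving `anchor` and `field` in the program metadata.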
Specification