Computerized Prominent Character Recognition in Videos
First Claim
1. A method comprising:
extracting, by one or more computing devices, feature points from video frames of a video file;
detecting, by at least one of the one or more computing devices, at least one face in at least a first video frame of the video frames;
inferring, by at least one of the one or more computing devices, the at least one face in a second video frame of the video frames, the inferring based at least in part on the feature points;
arranging, by at least one of the one or more computing devices, the video frames into groups; and
combining, by at least one of the one or more computing devices, two or more groups to create refined groups, the combining based at least in part on the two or more groups each including one or more video frames having at least one overlapping feature point associated with a detected face or an inferred face.
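The combining step can be illustrated with a minimal sketch. It assumes each group is represented by the set of frame indices it contains and the set of feature-point identifiers associated with a detected or inferred face in those frames. The function name `merge_groups`, the dictionary layout, and the use of a union-find structure are illustrative assumptions and are not taken from the claims.

```python
from collections import defaultdict


def merge_groups(groups):
    """Merge groups that share at least one face-associated feature point.

    groups: list of dicts with keys "frames" (set of frame indices) and
    "face_points" (set of feature-point ids). This layout is hypothetical.
    Returns the refined groups as a list of dicts with the same keys.
    """
    parent = list(range(len(groups)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller index as the root so output order is stable.
            parent[max(ra, rb)] = min(ra, rb)

    # Any two groups that contain the same feature point are joined.
    first_owner = {}
    for idx, group in enumerate(groups):
        for point_id in group["face_points"]:
            if point_id in first_owner:
                union(idx, first_owner[point_id])
            else:
                first_owner[point_id] = idx

    # Collect frames and feature points under each root.
    merged = defaultdict(lambda: {"frames": set(), "face_points": set()})
    for idx, group in enumerate(groups):
        root = find(idx)
        merged[root]["frames"] |= group["frames"]
        merged[root]["face_points"] |= group["face_points"]
    return [merged[root] for root in sorted(merged)]


example_groups = [
    {"frames": {0, 1}, "face_points": {10, 11}},
    {"frames": {2, 3}, "face_points": {11, 12}},  # shares point 11 with the first group
    {"frames": {4}, "face_points": {20}},         # no overlap, stays separate
]
refined = merge_groups(example_groups)
```

In this example the first two groups are combined because both contain feature point 11, while the third group is left unchanged.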
Abstract
Techniques for identifying prominent subjects in video content based on feature point extraction are described herein. Video files may be processed to detect faces in video frames and to extract feature points from the video frames. Some video frames may include both detected faces and extracted feature points, while other video frames may not include detected faces. Based on the extracted feature points, faces may be inferred in video frames where no face was detected. Additionally, video frames may be arranged into groups, and two or more groups may be merged. The merging may be based on some groups including video frames having overlapping feature points. Each of the resulting groups may identify a subject. A frequency representing the number of video frames in which the subject appears may be determined and used to calculate a prominence score for each of the identified subjects in the video file.
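As a rough illustration of the last step, the sketch below computes a per-subject frequency from the refined groups and turns it into a score. The abstract does not state how the prominence score is derived from the frequency, so the ratio used here (frames containing the subject divided by total frames) is an assumption, as are the function names and the dictionary layout.

```python
def subject_frequencies(refined_groups):
    """Count the frames in which each subject appears.

    refined_groups: dict mapping a subject label to the set of frame
    indices in that subject's refined group (hypothetical layout).
    """
    return {subject: len(frames) for subject, frames in refined_groups.items()}


def prominence_scores(refined_groups, total_frames):
    """Assumed scoring rule: frequency divided by total frame count."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    frequencies = subject_frequencies(refined_groups)
    return {subject: count / total_frames for subject, count in frequencies.items()}


groups_by_subject = {"subject_a": {0, 1, 2, 3}, "subject_b": {4}}
frequencies = subject_frequencies(groups_by_subject)
scores = prominence_scores(groups_by_subject, total_frames=10)
most_prominent = max(scores, key=scores.get)
```

Other factors could be folded into the score; this sketch uses frequency alone because that is the only input the abstract names.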
20 Claims
1. A method comprising:
extracting, by one or more computing devices, feature points from video frames of a video file;
detecting, by at least one of the one or more computing devices, at least one face in at least a first video frame of the video frames;
inferring, by at least one of the one or more computing devices, the at least one face in a second video frame of the video frames, the inferring based at least in part on the feature points;
arranging, by at least one of the one or more computing devices, the video frames into groups; and
combining, by at least one of the one or more computing devices, two or more groups to create refined groups, the combining based at least in part on the two or more groups each including one or more video frames having at least one overlapping feature point associated with a detected face or an inferred face.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system comprising:
memory;
one or more processors operably coupled to the memory; and
one or more modules stored in the memory and executable by the one or more processors, the one or more modules including:
a face detection module configured to detect one or more faces associated with one or more subjects in video frames in video files;
a feature detection module configured to extract feature points from the video frames and infer the one or more faces on the video frames;
a grouping module configured to arrange individual video frames into groups based at least in part on face landmarks associated with the one or more faces, wherein individual groups represent an individual subject of the one or more subjects; and
a scoring module configured to determine a prominence score associated with each individual subject.
Dependent claims: 10, 11, 12, 13, 14
15. One or more computer-readable storage media encoded with instructions that, when executed by a processor, configure a computer to perform acts comprising:
processing individual video files of a plurality of video files, the processing comprising:
detecting faces in some video frames of the individual video files; and
extracting feature points from the video frames;
inferring faces in individual video frames of the video frames, wherein no face was detected in the individual video frames, the inferring based at least in part on the feature points;
arranging the individual video frames into a plurality of groups;
combining two or more individual groups of the plurality of groups to create a set of refined groups, the combining based at least in part on the two or more individual groups including video frames having at least one overlapping feature point;
identifying subjects associated with each of the refined groups; and
determining a frequency associated with the subject, the frequency representing a number of video frames in which an individual subject of the subjects appears in a particular video file of the video files.
Dependent claims: 16, 17, 18, 19, 20
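The inferring act can be sketched under some simplifying assumptions: feature points are tracked across frames and carry stable identifiers, a detected face is given as a bounding box, and only one face is handled. A frame with no detection is treated as containing an inferred face when enough of the feature points that fell inside a detected face box are also present in that frame. The function name `infer_faces`, the `min_points` threshold, and the data layout are hypothetical and are not taken from the claims.

```python
def infer_faces(tracks, detections, min_points=3):
    """Infer face boxes in frames where no face was detected.

    tracks: {point_id: {frame_index: (x, y)}}, tracked feature points.
    detections: {frame_index: (x0, y0, x1, y1)}, detected face boxes.
    Returns {frame_index: (x0, y0, x1, y1)} for frames without a detection.
    Single-face simplification; the threshold is an assumed parameter.
    """
    # Feature points lying inside a detected face box are tied to that face.
    face_points = set()
    for frame, (x0, y0, x1, y1) in detections.items():
        for point_id, positions in tracks.items():
            if frame in positions:
                x, y = positions[frame]
                if x0 <= x <= x1 and y0 <= y <= y1:
                    face_points.add(point_id)

    # Gather where those points appear in frames that lack a detection.
    points_by_frame = {}
    for point_id in face_points:
        for frame, xy in tracks[point_id].items():
            if frame not in detections:
                points_by_frame.setdefault(frame, []).append(xy)

    # Infer a face when enough face-associated points are present.
    inferred = {}
    for frame, points in points_by_frame.items():
        if len(points) >= min_points:
            xs = [p[0] for p in points]
            ys = [p[1] for p in points]
            inferred[frame] = (min(xs), min(ys), max(xs), max(ys))
    return inferred


example_tracks = {
    1: {0: (2, 2), 1: (3, 3)},
    2: {0: (5, 5), 1: (6, 6)},
    3: {0: (8, 8), 1: (9, 9)},
    4: {0: (50, 50), 1: (51, 51)},  # background point, outside the face box
}
example_detections = {0: (0, 0, 10, 10)}  # a face was detected only in frame 0
inferred_boxes = infer_faces(example_tracks, example_detections)
```

Here frame 1 has no detected face, but three feature points that lay inside the face box in frame 0 are still present, so a box enclosing them is inferred for frame 1.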
Specification