Automatic face annotation method and system
First Claim
1. An automatic face annotation method, comprising:
dividing an input video into different sets of frames;
extracting temporal and spatial information by employing camera take and shot boundary detection algorithms on the different sets of frames of the input video;
collecting weakly labeled data by crawling weakly labeled face images from social networks;
applying face detection together with an iterative refinement clustering algorithm to remove noise of the collected weakly labeled data;
generating a labeled database containing refined labeled images as training data;
based on the refined labeled images stored in the labeled database, finding and labeling exact frames containing one or more face images in the input video matching any of the refined labeled images in the labeled database;
labeling remaining unlabeled face tracks in the input video by a semi-supervised learning algorithm to annotate the face images in the input video; and
outputting the input video containing the annotated face images.
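For illustration only, the frame-division and shot boundary steps recited above can be sketched as a histogram-difference detector. The claim does not name a specific algorithm, so the histogram comparison, the `threshold` value, and the helper names `gray_histogram` and `split_into_shots` are all assumptions made for this sketch, not the patented method.

```python
from typing import List, Sequence


def gray_histogram(frame: Sequence[int], bins: int = 8) -> List[float]:
    """Normalized histogram of 8-bit grayscale pixel values (0..255)."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame) or 1
    return [c / total for c in counts]


def split_into_shots(
    frames: Sequence[Sequence[int]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the patent
    bins: int = 8,
) -> List[List[int]]:
    """Group frame indices into shots, cutting where histograms change sharply."""
    if not frames:
        return []
    shots: List[List[int]] = [[0]]
    prev = gray_histogram(frames[0], bins)
    for i in range(1, len(frames)):
        cur = gray_histogram(frames[i], bins)
        # Half the L1 distance between normalized histograms lies in [0, 1].
        diff = 0.5 * sum(abs(a - b) for a, b in zip(prev, cur))
        if diff > threshold:
            shots.append([])  # a large change starts a new set of frames
        shots[-1].append(i)
        prev = cur
    return shots


if __name__ == "__main__":
    dark = [10] * 16     # toy "frame": 16 dark pixels
    bright = [240] * 16  # toy "frame": 16 bright pixels
    print(split_into_shots([dark, dark, bright, bright, bright]))
```

Each returned list is one set of consecutive frame indices. A practical detector would likely work on color histograms or motion features and handle gradual transitions, which this sketch ignores.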
Abstract
An automatic face annotation method is provided. The method includes dividing an input video into different sets of frames, extracting temporal and spatial information by employing camera take and shot boundary detection algorithms on the different sets of frames, and collecting weakly labeled data by crawling weakly labeled face images from social networks. The method also includes applying face detection together with an iterative refinement clustering algorithm to remove noise of the collected weakly labeled data, generating a labeled database containing refined labeled images, finding and labeling exact frames containing one or more face images in the input video matching any of the refined labeled images based on the labeled database, labeling remaining unlabeled face tracks in the input video by a semi-supervised learning algorithm to annotate the face images in the input video, and outputting the input video containing the annotated face images.
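As a rough illustration of the noise-removal step in the abstract, the sketch below repeatedly drops the face feature vectors that lie far from the centroid of the images crawled for one name. The abstract only says "iterative refinement clustering", so the centroid-distance rule, the `keep_ratio` parameter, and the function names are assumptions chosen for this example rather than the algorithm the patent specifies.

```python
import math
from typing import List, Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def refine_label(
    vectors: Sequence[Sequence[float]],
    keep_ratio: float = 1.5,  # assumed tolerance relative to the mean distance
    max_iter: int = 10,
) -> List[int]:
    """Return indices of the samples kept for a single weak label."""
    kept = list(range(len(vectors)))
    for _ in range(max_iter):
        if len(kept) < 2:
            break
        center = centroid([vectors[i] for i in kept])
        dists = {i: math.dist(vectors[i], center) for i in kept}
        mean_dist = sum(dists.values()) / len(dists)
        if mean_dist == 0:
            break  # all remaining samples coincide
        survivors = [i for i in kept if dists[i] <= keep_ratio * mean_dist]
        if len(survivors) == len(kept):
            break  # nothing was removed, so the set is stable
        kept = survivors
    return kept


if __name__ == "__main__":
    # Four similar feature vectors plus one mislabeled outlier (index 4).
    samples = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]]
    print(refine_label(samples))
```

The surviving indices would then feed the labeled database used as training data. Real face features are high-dimensional embeddings, and the two-dimensional vectors here are only toy values.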
20 Claims
1. An automatic face annotation method (independent claim; full text appears under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 18, 19, 20
11. An automatic face annotation system, comprising:
a camera take detection module configured to extract temporal and spatial information by employing camera take and shot boundary detection algorithms on different sets of frames of an input video;
a social web data analysis module configured to collect weakly labeled data by crawling weakly labeled face images from social networks, apply face detection together with an iterative refinement clustering algorithm to remove noise and generate a labeled database containing refined labeled images as training data;
a face matching module configured to, based on the refined labeled images stored in the labeled database, find and label exact frames containing one or more face images in the input video matching any of the refined labeled images in the labeled database;
an active semi-supervised learning module configured to label remaining unlabeled face tracks in the input video by a semi-supervised learning algorithm to annotate the face images in the input video; and
an output module configured to output the input video containing the annotated face images.
Dependent claims: 12, 13, 14, 15, 16, 17
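To make the role of the semi-supervised learning module more concrete, here is a minimal sketch that spreads labels from already-matched face tracks to unlabeled ones, one nearest neighbor at a time. The claim does not state which semi-supervised algorithm is used, so this greedy nearest-neighbor propagation, the `max_dist` cut-off, and the name `propagate_labels` are assumptions for illustration only.

```python
import math
from typing import List, Optional, Sequence


def propagate_labels(
    features: Sequence[Sequence[float]],
    labels: Sequence[Optional[str]],
    max_dist: float = 1.0,  # assumed similarity cut-off
) -> List[Optional[str]]:
    """Label unlabeled tracks from their closest labeled track, closest pair first."""
    result = list(labels)
    while True:
        best = None  # (distance, unlabeled index, label to copy)
        for u, label_u in enumerate(result):
            if label_u is not None:
                continue
            for v, label_v in enumerate(result):
                if label_v is None:
                    continue
                d = math.dist(features[u], features[v])
                if d <= max_dist and (best is None or d < best[0]):
                    best = (d, u, label_v)
        if best is None:
            return result  # no unlabeled track is close enough to any label
        result[best[1]] = best[2]


if __name__ == "__main__":
    # One toy feature vector per face track; None marks an unlabeled track.
    tracks = [[0.0, 0.0], [0.5, 0.0], [1.2, 0.0],
              [10.0, 10.0], [10.4, 10.0], [50.0, 50.0]]
    seeds = ["alice", None, None, "bob", None, None]
    print(propagate_labels(tracks, seeds))
```

In this toy run the third track is too far from the labeled seed to be matched directly, but it receives a label once its neighbor has been labeled. The last track stays unlabeled because nothing lies within `max_dist`, which is where an active-learning query to a user could plausibly fit.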
Specification