Cluster-based video classification
First Claim
1. A method, executed by a computer system, of training a classifier for a video category, the method comprising:
accessing a training set of items for a category, the training set comprising a first set of videos labeled with the category;
accessing a second set of unlabeled videos not labeled with the category;
forming a cluster for the category, the cluster comprising labeled videos from the training set;
generating a supplemental training set for the category, the supplemental training set comprising the first set of videos labeled with the category and a subset of the second set of unlabeled videos not labeled with the category, generating the supplemental training set comprising:
adding to the cluster unlabeled videos from the second set that have been co-watched with one or more labeled videos in the cluster by adding the unlabeled videos to nodes of a graph representing the cluster for the category, the nodes having edges connecting with nodes of videos that are co-watched with the added unlabeled videos and the edges having weights based on the co-watch relationships;
determining cluster scores for the unlabeled videos added to the cluster responsive to the weights of the edges, the cluster scores representing likelihoods that the unlabeled videos belong to the category and propagated from the labeled videos to the unlabeled videos; and
pruning by removing an unlabeled video from the cluster if the cluster score of the unlabeled video is outside a threshold;
training a classifier for the category using the supplemental training set for the category; and
storing the classifier.
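The cluster-expansion steps of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the co-watch counts, the averaging rule used to propagate scores, the iteration count, and the reading of "outside a threshold" as "below a minimum score" are all assumptions made for illustration. Training and storing the classifier are left out.

```python
from collections import defaultdict


def expand_cluster(labeled, co_watch, threshold=0.3, iterations=20):
    """Return {unlabeled_video: cluster_score} for videos kept in the cluster.

    labeled   -- set of video ids labeled with the category
    co_watch  -- dict mapping (video_a, video_b) to a co-watch count
    threshold -- hypothetical minimum score; lower-scoring videos are pruned
    """
    # Build an undirected graph whose edge weights are the co-watch counts.
    neighbors = defaultdict(dict)
    for (a, b), count in co_watch.items():
        neighbors[a][b] = count
        neighbors[b][a] = count

    # Add unlabeled videos co-watched with at least one labeled video.
    candidates = {v for u in labeled for v in neighbors[u] if v not in labeled}

    # Labeled videos keep score 1.0; added videos start at 0.0.
    scores = {v: 1.0 for v in labeled}
    scores.update({v: 0.0 for v in candidates})

    # Propagate scores along edges (assumed rule: weighted average of the
    # neighbors' scores, with videos outside the cluster counting as 0).
    for _ in range(iterations):
        updated = dict(scores)
        for v in candidates:
            total = sum(neighbors[v].values())
            inside = sum(w * scores[n] for n, w in neighbors[v].items() if n in scores)
            updated[v] = inside / total if total else 0.0
        scores = updated

    # Prune added videos whose score falls below the threshold.
    return {v: scores[v] for v in candidates if scores[v] >= threshold}


def supplemental_training_set(labeled, kept):
    """Labeled videos plus the unlabeled videos that survived pruning."""
    return sorted(set(labeled) | set(kept))


# Hypothetical data: "a" and "b" are labeled; the counts are made up.
labeled = {"a", "b"}
co_watch = {
    ("a", "c"): 8, ("b", "c"): 6, ("c", "x"): 1,
    ("a", "d"): 1, ("d", "y"): 9, ("d", "z"): 10,
    ("c", "d"): 1,
}
kept = expand_cluster(labeled, co_watch)
training_ids = supplemental_training_set(labeled, kept)
```

In this made-up data, video "c" is co-watched mostly with labeled videos and stays in the cluster, while "d" is co-watched mostly with videos outside the cluster and is pruned. Counting outside neighbors as zero is one way to keep weakly related videos from inheriting a high score; other normalizations are equally compatible with the claim language.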
Abstract
A classifier training system trains unified classifiers for categorizing videos representing different categories of a category graph. The unified classifiers unify the outputs of a number of separate initial classifiers trained from disparate subsets of a training set of media items. The training process takes into account the relationships that exist between the various categories of the category graph by relating scores associated with related categories, thus enhancing the accuracy of the unified classifiers.
34 Citations
33 Claims
1. A method, executed by a computer system, of training a classifier for a video category, the method comprising:
accessing a training set of items for a category, the training set comprising a first set of videos labeled with the category;
accessing a second set of unlabeled videos not labeled with the category;
forming a cluster for the category, the cluster comprising labeled videos from the training set;
generating a supplemental training set for the category, the supplemental training set comprising the first set of videos labeled with the category and a subset of the second set of unlabeled videos not labeled with the category, generating the supplemental training set comprising:
adding to the cluster unlabeled videos from the second set that have been co-watched with one or more labeled videos in the cluster by adding the unlabeled videos to nodes of a graph representing the cluster for the category, the nodes having edges connecting with nodes of videos that are co-watched with the added unlabeled videos and the edges having weights based on the co-watch relationships;
determining cluster scores for the unlabeled videos added to the cluster responsive to the weights of the edges, the cluster scores representing likelihoods that the unlabeled videos belong to the category and propagated from the labeled videos to the unlabeled videos; and
pruning by removing an unlabeled video from the cluster if the cluster score of the unlabeled video is outside a threshold;
training a classifier for the category using the supplemental training set for the category; and
storing the classifier.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A system for training a classifier for a video category, the system comprising:
a non-transitory computer-readable storage medium having executable computer program instructions embodied therein; and
a computer processor, the computer processor configured to execute the computer program instructions to:
access a training set of items for a category, the training set comprising a first set of videos labeled with the category;
access a second set of unlabeled videos not labeled with the category;
form a cluster for the category, the cluster comprising labeled videos from the training set;
generate a supplemental training set for the category, the supplemental training set comprising the first set of videos labeled with the category and a subset of the second set of unlabeled videos not labeled with the category, generating the supplemental training set comprising:
adding to the cluster unlabeled videos from the second set that have been co-watched with one or more labeled videos in the cluster by adding the unlabeled videos to nodes of a graph representing the cluster for the category, the nodes having edges connecting with nodes of videos that are co-watched with the added unlabeled videos and the edges having weights based on the co-watch relationships;
determining cluster scores for the unlabeled videos added to the cluster responsive to the weights of the edges, the cluster scores representing likelihoods that the unlabeled videos belong to the category and propagated from the labeled videos to the unlabeled videos; and
pruning by removing an unlabeled video from the cluster if the cluster score of the unlabeled video is outside a threshold;
train a classifier for the category using the supplemental training set for the category; and
store the classifier.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
23. A non-transitory computer-readable storage medium having executable computer program instructions embodied therein for training a classifier for a video category, actions of the computer program instructions comprising:
accessing a training set of items for a category, the training set comprising a first set of videos labeled with the category;
accessing a second set of unlabeled videos not labeled with the category;
forming a cluster for the category, the cluster comprising labeled videos from the training set;
generating a supplemental training set for the category, the supplemental training set comprising the first set of videos labeled with the category and a subset of the second set of unlabeled videos not labeled with the category, generating the supplemental training set comprising:
adding to the cluster unlabeled videos from the second set that have been co-watched with one or more labeled videos in the cluster by adding the unlabeled videos to nodes of a graph representing the cluster for the category, the nodes having edges connecting with nodes of videos that are co-watched with the added unlabeled videos and the edges having weights based on the co-watch relationships;
determining cluster scores for the unlabeled videos added to the cluster responsive to the weights of the edges, the cluster scores representing likelihoods that the unlabeled videos belong to the category and propagated from the labeled videos to the unlabeled videos; and
pruning by removing an unlabeled video from the cluster if the cluster score of the unlabeled video is outside a threshold;
training a classifier for the category using the supplemental training set for the category; and
storing the classifier.
Dependent claims: 24, 25, 26, 27, 28, 29, 30, 31, 32, 33
Specification