System and method for relevance estimation in summarization of videos of multi-step activities
Abstract
A method and system for identifying content relevance comprises acquiring video data, mapping the acquired video data to a feature space to obtain a feature representation of the video data, assigning the acquired video data to at least one action class based on the feature representation of the video data, and determining a relevance of the acquired video data.
16 Claims
1. A computer implemented method for identifying content relevance in a video stream, said method comprising:
acquiring at a computer, video data from a video camera;
mapping extracted features of said acquired video data to a feature space to obtain a feature representation of said video data;
assigning said acquired video data, with a classifier, to at least one action class based on said feature representation of said video data, said classifier comprising at least one of a support vector machine, a neural network, a decision tree, an expectation-maximization algorithm, and a k-nearest neighbor clustering algorithm; and
determining a relevance of said acquired video data based on said at least one action class assigned, wherein determining a relevance of said acquired video data based on said at least one action class assigned comprises:
assigning said acquired video data a classification confidence score; and
converting said classification confidence score to a relevance score.
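The classification and scoring steps of claim 1 can be illustrated with a minimal sketch. The claim does not say how the confidence score is computed or how it is converted to a relevance score, so everything specific here is an assumption: a k-nearest neighbor classifier is picked from the listed options, confidence is taken as the fraction of neighbors voting for the winning class, and relevance is that confidence scaled by a hypothetical per-class weight. The names `knn_classify`, `to_relevance`, and `class_weights`, and the toy feature vectors, are illustrative only.

```python
import math
from collections import Counter


def knn_classify(feature, training_set, k=3):
    """Assign a feature vector to an action class with a confidence score.

    training_set is a list of (feature_vector, action_class) pairs.
    Confidence is the share of the k nearest neighbors that voted for
    the winning class (an assumed definition, not taken from the claim).
    """
    neighbors = sorted(training_set, key=lambda item: math.dist(feature, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    action_class, count = votes.most_common(1)[0]
    return action_class, count / len(neighbors)


def to_relevance(action_class, confidence, class_weights):
    """Convert a confidence score to a relevance score.

    Assumed rule: scale confidence by a per-class weight; classes with
    no weight are treated as irrelevant (weight 0.0).
    """
    return class_weights.get(action_class, 0.0) * confidence


# Toy 2-D feature representations with hypothetical action classes.
training_set = [
    ((0.0, 0.0), "idle"),
    ((0.1, 0.2), "idle"),
    ((1.0, 1.0), "pour"),
    ((0.9, 1.1), "pour"),
    ((1.2, 0.8), "pour"),
]
class_weights = {"pour": 1.0, "idle": 0.1}

action_class, confidence = knn_classify((1.0, 0.9), training_set)
relevance = to_relevance(action_class, confidence, class_weights)
```

Any of the other listed classifiers could stand in for the k-nearest neighbor step, provided it yields a per-class score that can serve as the confidence.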
8. A system for identifying content relevance, said system comprising:
a video acquisition module comprising a video camera for acquiring video data;
a processor;
a data bus coupled to said processor; and
a computer-usable medium embodying computer program code, said computer-usable medium being coupled to said data bus, said computer program code comprising instructions executable by said processor and configured for:
mapping extracted features of said acquired video data to a feature space to obtain a feature representation of said video data;
assigning said acquired video data, via the use of a classifier, to at least one action class based on said feature representation of said video data, said classifier comprising at least one of a support vector machine, a neural network, a decision tree, an expectation-maximization algorithm, and a k-nearest neighbor clustering algorithm; and
determining a relevance of said acquired video data based on said at least one action class assigned, wherein determining a relevance of said acquired video data based on said at least one action class assigned comprises:
assigning said acquired video data a classification confidence score; and
converting said classification confidence score to a relevance score.
15. A non-transitory processor-readable medium storing computer code representing instructions to cause a process for identifying content relevance, said computer code comprising code to:
train a classifier to optimally discriminate between a plurality of different action classes according to said feature representations, said classifier comprising at least one of a support vector machine, a neural network, a decision tree, an expectation-maximization algorithm, and a k-nearest neighbor clustering algorithm; and
in an online stage:
acquire video data said video data comprising one of:
video acquired with an egocentric or wearable device;
video acquired with a vehicle-mounted device; and
surveillance or third-person view video;
segment said video data into at least one of a series of single frames and a series of groups of frames;
map extracted features of said acquired video data to a feature space to obtain a feature representation of said video data;
assign said acquired video data, via the use of a classifier, to at least one action class based on said feature representation of said video data; and
assign said acquired video data a classification confidence score and convert said classification confidence score to a relevance score to determine a relevance of said acquired video data based on the at least one action class assigned.
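The online stage of claim 15 (segment, map to features, classify, score) can be sketched as a loop over segments. This is only an illustration under stated assumptions: each "frame" is reduced to a single hypothetical motion-energy number, the feature mapping is a plain average, and an untrained threshold rule stands in for the classifier that the claim trains beforehand. All function names and the conversion rule are invented for the example.

```python
from statistics import mean


def segment_frames(frames, group_size=1):
    """Split a frame sequence into consecutive segments.

    group_size=1 gives a series of single frames; larger values give a
    series of groups of frames (the last group may be shorter).
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [frames[i:i + group_size] for i in range(0, len(frames), group_size)]


def score_segments(segments, extract_features, classify, convert):
    """Return one (action_class, relevance) pair per segment."""
    results = []
    for segment in segments:
        features = extract_features(segment)
        action_class, confidence = classify(features)
        results.append((action_class, convert(action_class, confidence)))
    return results


# Hypothetical stand-ins so the sketch runs end to end.
def toy_features(segment):
    # Assumed feature representation: mean motion energy of the segment.
    return mean(segment)


def toy_classify(motion):
    # Stand-in for a trained classifier: a fixed 0.5 threshold, with
    # confidence growing with distance from that threshold.
    confidence = min(1.0, 0.5 + abs(motion - 0.5))
    return ("active" if motion >= 0.5 else "idle"), confidence


def toy_convert(action_class, confidence):
    # Assumed conversion rule: only the "active" class is relevant.
    return confidence if action_class == "active" else 0.0


frames = [0.1, 0.2, 0.9, 0.8, 0.7]
segments = segment_frames(frames, group_size=2)
scores = score_segments(segments, toy_features, toy_classify, toy_convert)
```

Passing the feature mapping, classifier, and conversion rule as arguments keeps the segmentation loop independent of which of the listed classifier types is actually trained.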
Specification