Systems and methods for semantically classifying and normalizing shots in video
First Claim
1. A method for classifying a video file according to one or more scene classes, the video file including a plurality of frames, where each frame of the plurality of frames includes a plurality of pixels, and where each pixel of the plurality of pixels is associated with a vector of material classification scores describing material content in its respective frame, comprising:
for each frame of the plurality of frames, generating one or more scene classification scores associated with each of the one or more scene classes by:
dividing the frame into a plurality of grid cells;
retrieving the vector of material classification scores for each pixel in the frame;
for each grid cell of the plurality of grid cells, averaging the material classification scores across each pixel in the grid cell to form a material occurrence vector for the grid cell;
concatenating the material occurrence vectors for each grid cell of the plurality of grid cells to generate a material arrangement vector for the frame; and
based on the material arrangement vector generated for the frame, generating the one or more scene classification scores associated with each of the one or more scene classes using one or more scene classifiers;
based on the one or more scene classification scores generated for each frame of the plurality of frames, generating a representative scene classification score for each of the one or more scene classes; and
for each of the generated representative scene classification scores that is above a predetermined threshold value, labeling the video file according to the respective scene classes associated with the scene classification scores that are above the predetermined threshold value.
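The steps recited in claim 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the nested-list layout of the per-pixel scores, the grid size, the use of plain callables as scene classifiers, and the use of a mean as the representative score are all hypothetical choices made for the example.

```python
from statistics import fmean


def material_arrangement_vector(pixel_scores, rows, cols):
    """Build a material arrangement vector for one frame.

    pixel_scores[y][x] is the vector of material classification scores
    for the pixel at column x, row y (an assumed data layout).
    """
    height, width = len(pixel_scores), len(pixel_scores[0])
    if rows > height or cols > width:
        raise ValueError("grid is finer than the frame; some cells would be empty")
    n_materials = len(pixel_scores[0][0])
    arrangement = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            cell = [pixel_scores[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            # Material occurrence vector: per-material average over the cell.
            occurrence = [fmean(p[m] for p in cell) for m in range(n_materials)]
            # Concatenate the cell vectors in row-major order.
            arrangement.extend(occurrence)
    return arrangement


def label_video(frames, classifiers, rows, cols, threshold):
    """Return the scene classes whose representative score exceeds threshold.

    classifiers maps a scene class name to a callable that turns a material
    arrangement vector into a score (stand-ins for trained scene classifiers).
    """
    per_frame = []
    for pixel_scores in frames:
        vector = material_arrangement_vector(pixel_scores, rows, cols)
        per_frame.append({name: clf(vector) for name, clf in classifiers.items()})
    # Representative score per class: here simply the mean over all frames.
    representative = {name: fmean(f[name] for f in per_frame) for name in classifiers}
    return sorted(name for name, score in representative.items() if score > threshold)


# Toy frame: 2x2 pixels, two materials per pixel (say, "sky" and "sand").
frame = [
    [[1.0, 0.0], [0.5, 0.5]],  # top row
    [[0.0, 1.0], [0.0, 1.0]],  # bottom row
]
toy_classifiers = {
    "beach": lambda v: (v[0] + v[3]) / 2,   # sky on top, sand below
    "desert": lambda v: (v[1] + v[3]) / 2,  # sand everywhere
    "indoor": lambda v: (v[1] + v[2]) / 2,  # the inverse arrangement
}
labels = label_video([frame, frame], toy_classifiers, rows=2, cols=1, threshold=0.5)
print(labels)
```

With a 2-by-1 grid the toy frame yields one occurrence vector for the top half and one for the bottom half, so the arrangement vector has four entries. A real system would substitute trained classifiers for the lambdas and might aggregate per-frame scores in some way other than a mean.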
Abstract
The present disclosure relates to systems and methods for classifying videos based on video content. For a given video file including a plurality of frames, a subset of frames is extracted for processing. Frames that are too dark, too blurry, or otherwise poor candidates for classification are discarded from the subset. Material classification scores, which describe the types of material content likely included in each frame, are calculated for the remaining frames in the subset. The material classification scores are used to generate material arrangement vectors that represent the spatial arrangement of material content in each frame. The material arrangement vectors are subsequently classified to generate a scene classification score vector for each frame. The scene classification results are averaged (or otherwise processed) across all frames in the subset to associate the video file with one or more predefined scene categories that describe the overall scene content of the video file.
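The frame selection step described in the abstract might look like the following sketch. The abstract does not say how darkness or blur is measured, so everything specific here is an assumption: frames are taken to be 2D lists of grayscale values in the range 0 to 255, darkness is judged by mean brightness, blur is judged by the variance of a 4-neighbor Laplacian (a common sharpness heuristic, swapped in for illustration), and the sampling step and both thresholds are hypothetical values.

```python
from statistics import fmean, pvariance


def mean_brightness(gray):
    """Average pixel intensity of a grayscale frame."""
    return fmean(v for row in gray for v in row)


def laplacian_variance(gray):
    """Variance of the 4-neighbor Laplacian over interior pixels.

    Low values suggest few sharp edges, i.e. a blurry or featureless frame.
    The frame must be at least 3x3 so that interior pixels exist.
    """
    height, width = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


def select_frames(frames, step=2, min_brightness=40.0, min_sharpness=100.0):
    """Sample every step-th frame, then drop dark or blurry candidates."""
    subset = frames[::step]
    return [
        f for f in subset
        if mean_brightness(f) >= min_brightness
        and laplacian_variance(f) >= min_sharpness
    ]


# Toy 4x4 frames: one too dark, one featureless, one high-contrast checkerboard.
dark = [[5] * 4 for _ in range(4)]
flat = [[128] * 4 for _ in range(4)]
sharp = [[255 if (x + y) % 2 else 0 for x in range(4)] for y in range(4)]

kept = select_frames([dark, sharp, flat, sharp, sharp], step=1)
print(len(kept))
```

Only frames that survive this filter would go on to material classification. The thresholds would need tuning against real footage.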
25 Claims
1. Recited in full under First Claim above. Dependent claims: 2-15.
16. One or more non-transitory storage media storing instructions which, when executed by one or more computing devices, cause:
classifying a video file according to one or more scene classes, the video file including a plurality of frames, where each frame of the plurality of frames includes a plurality of pixels, and where each pixel of the plurality of pixels is associated with a vector of material classification scores describing material content in its respective frame;
for each frame of the plurality of frames, generating one or more scene classification scores associated with each of the one or more scene classes by:
dividing the frame into a plurality of grid cells;
retrieving the vector of material classification scores for each pixel in the frame;
for each grid cell of the plurality of grid cells, averaging the material classification scores across each pixel in the grid cell to form a material occurrence vector for the grid cell;
concatenating the material occurrence vectors for each grid cell of the plurality of grid cells to generate a material arrangement vector for the frame; and
based on the material arrangement vector generated for the frame, generating the one or more scene classification scores associated with each of the one or more scene classes using one or more scene classifiers;
based on the one or more scene classification scores generated for each frame of the plurality of frames, generating a representative scene classification score for each of the one or more scene classes; and
for each of the generated representative scene classification scores that is above a predetermined threshold value, labeling the video file according to the respective scene classes associated with the scene classification scores that are above the predetermined threshold value.
Dependent claims: 17-24.
25. An apparatus comprising:
a subsystem, implemented at least partially in hardware, that classifies a video file according to one or more scene classes, the video file including a plurality of frames, where each frame of the plurality of frames includes a plurality of pixels, and where each pixel of the plurality of pixels is associated with a vector of material classification scores describing material content in its respective frame;
a subsystem, implemented at least partially in hardware, that, for each frame of the plurality of frames, generates one or more scene classification scores associated with each of the one or more scene classes by:
dividing the frame into a plurality of grid cells;
retrieving the vector of material classification scores for each pixel in the frame;
for each grid cell of the plurality of grid cells, averaging the material classification scores across each pixel in the grid cell to form a material occurrence vector for the grid cell;
concatenating the material occurrence vectors for each grid cell of the plurality of grid cells to generate a material arrangement vector for the frame; and
based on the material arrangement vector generated for the frame, generating the one or more scene classification scores associated with each of the one or more scene classes using one or more scene classifiers;
a subsystem, implemented at least partially in hardware, that, based on the one or more scene classification scores generated for each frame of the plurality of frames, generates a representative scene classification score for each of the one or more scene classes; and
a subsystem, implemented at least partially in hardware, that, for each of the generated representative scene classification scores that is above a predetermined threshold value, labels the video file according to the respective scene classes associated with the scene classification scores that are above the predetermined threshold value.
Specification