DETERMINING FEATURE VECTORS FOR VIDEO VOLUMES
First Claim
1. A computer-implemented method comprising:
accessing a feature codebook comprising a set of representative feature vectors representing at least visual properties of digital videos;
identifying, in a plurality of digital videos, a plurality of candidate volumes representing spatio-temporal portions of the digital videos, wherein each of the candidate volumes corresponds to a contiguous sequence of spatial portions of the video frames having a starting time and an ending time;
associating features with each candidate volume of a plurality of the identified candidate volumes, the associating comprising:
identifying a plurality of temporal segments of the candidate volume;
for each of the identified temporal segments:
determining a feature vector from at least visual properties of the temporal segment, and
associating with the temporal segment a representative feature vector from the feature codebook that is most similar to the feature vector;
determining features for the candidate volume comprising at least one of:
temporal relationship features, and
spatial relationship features; and
storing the determined features in association with the candidate volume.
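The association step of this claim (matching each temporal segment to the most similar codebook entry) can be sketched as a nearest-neighbour lookup. The claim does not name a similarity measure, so Euclidean distance is an assumption here, as are the function names `nearest_codeword` and `quantize_segments` and the toy two-dimensional vectors.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest_codeword(vec: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to `vec`.

    Assumption: "most similar" is taken to mean smallest Euclidean distance.
    """
    return min(range(len(codebook)), key=lambda i: math.dist(vec, codebook[i]))


def quantize_segments(segment_vectors: Sequence[Vector],
                      codebook: Sequence[Vector]) -> List[int]:
    """Associate every temporal segment with one representative vector (by index)."""
    return [nearest_codeword(v, codebook) for v in segment_vectors]


# Hypothetical data: three representative vectors and four segment vectors.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
segments = [(0.1, -0.2), (0.9, 1.2), (4.0, 6.0), (0.8, 0.7)]

labels = quantize_segments(segments, codebook)
print(labels)  # [0, 1, 2, 1]
```

Storing the index of the codebook entry rather than the raw feature vector keeps the per-segment record small, which is one plausible reason to quantize against a codebook before computing volume-level features.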
Abstract
A volume identification system identifies a set of unlabeled spatio-temporal volumes within each of a set of videos, each volume representing a distinct object or action. The volume identification system further determines, for each of the videos, a set of volume-level features characterizing the volume as a whole. In one embodiment, the features are based on a codebook and describe the temporal and spatial relationships of different codebook entries of the volume. The volume identification system uses the volume-level features, in conjunction with existing labels assigned to the videos as a whole, to label with high confidence some subset of the identified volumes, e.g., by employing consistency learning or training and application of weak volume classifiers.
The labeled volumes may be used for a number of applications, such as training strong volume classifiers, improving video search (including locating individual volumes), and creating composite videos based on identified volumes.
20 Claims
8. A computer-implemented method comprising:
identifying, in a digital video, a volume representing a spatio-temporal portion of the digital video, the volume corresponding to a contiguous sequence of spatial portions of frames of the video having a starting time and an ending time;
identifying a set of temporal segments of the volume, each segment associated with a time within the video and with a spatial portion of a frame of the video that corresponds to the associated time;
accessing a feature codebook comprising a set of representative feature vectors representing at least visual properties of digital videos;
for each of a plurality of the temporal segments:
deriving a feature vector from at least visual content of the temporal segment;
associating, with the temporal segment, a feature vector from the feature codebook that is most similar to the feature vector derived from the audiovisual content;
for the volume, and for at least a first one of the representative feature vectors and a second one of the representative feature vectors, determining at least one of:
a temporal relationship between the first and second representative feature vectors, and
a spatial relationship between the first and second representative feature vectors; and
storing a degree of the temporal relationship as a feature representing the volume.
Dependent claims: 9, 10, 11, 12, 13, 14
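One way to read the "degree" of a temporal or spatial relationship between two representative feature vectors is as the fraction of segment pairs in which the relationship holds. The sketch below follows that reading. It is an assumption, not a definition taken from the claim: the `Segment` record, the "before" and "left of" relations, and the helper names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    time: float    # time of occurrence within the video, in seconds
    x: float       # horizontal centre of the segment's spatial portion
    codeword: int  # index of the associated representative feature vector


def relation_degree(segments: List[Segment], first: int, second: int,
                    holds: Callable[[Segment, Segment], bool]) -> float:
    """Fraction of (first, second) segment pairs for which `holds` is true."""
    a = [s for s in segments if s.codeword == first]
    b = [s for s in segments if s.codeword == second]
    if not a or not b:
        return 0.0  # the relationship is undefined; report zero degree
    hits = sum(1 for s in a for t in b if holds(s, t))
    return hits / (len(a) * len(b))


def before_degree(segments: List[Segment], first: int, second: int) -> float:
    """Temporal relationship: how often `first` occurs earlier than `second`."""
    return relation_degree(segments, first, second, lambda s, t: s.time < t.time)


def left_of_degree(segments: List[Segment], first: int, second: int) -> float:
    """Spatial relationship: how often `first` lies to the left of `second`."""
    return relation_degree(segments, first, second, lambda s, t: s.x < t.x)


# Hypothetical volume with four quantized temporal segments.
volume = [
    Segment(time=0.0, x=10.0, codeword=0),
    Segment(time=1.0, x=12.0, codeword=1),
    Segment(time=2.0, x=8.0, codeword=0),
    Segment(time=3.0, x=20.0, codeword=1),
]

print(before_degree(volume, 0, 1))   # 0.75
print(left_of_degree(volume, 0, 1))  # 1.0
```

Computing such a degree for every ordered pair of codebook entries would yield a fixed-length feature vector per volume, regardless of how many temporal segments the volume contains.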
15. A non-transitory computer-readable storage medium storing executable computer program instructions comprising:
instructions for identifying, in a plurality of digital videos, a plurality of candidate volumes representing spatio-temporal segments of the digital videos, wherein each of the candidate volumes corresponds to a contiguous sequence of spatial portions of the video frames having a starting time and an ending time, and potentially represents a discrete object or action within the video frames;
instructions for forming a feature codebook based on the identified plurality of candidate volumes, the forming comprising:
dividing each of a plurality of the candidate volume into temporal segments;
for each segment of the determined temporal segments, determining a segment feature vector representing at least visual properties of the segment; and
forming the feature codebook by clustering the segment feature vectors into a set of representative feature vectors;
instructions for associating features with each candidate volume of a plurality of the identified candidate volumes, the associating comprising:
dividing the candidate volume into temporal segments;
for each temporal segment:
determining a feature vector from at least visual properties of the temporal segment;
associating with the temporal segment a representative feature vector from the feature codebook that is most similar to the feature vector;
determining features for the candidate volume comprising at least one of:
temporal relationship features determined by comparing the times of occurrence within the candidate volume of the representative feature vectors associated with the temporal segments, the time of occurrence of a representative feature vector being the time of occurrence of the temporal segment with which it is associated, and
spatial relationship features determined by comparing spatial locations of occurrence, within the temporal segments of the candidate volume, of the representative feature vectors associated with the temporal segments; and
storing the determined features in association with the candidate volume.
Dependent claims: 16, 17, 18, 19, 20
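Claim 15 forms the codebook by clustering segment feature vectors but does not specify a clustering algorithm. As a minimal sketch, the example below uses plain k-means (Lloyd's algorithm) with a deliberately naive initialization. The choice of k-means, the function name `build_codebook`, the Euclidean metric, and the toy data are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def mean(vectors: Sequence[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(component) / n for component in zip(*vectors))


def build_codebook(vectors: Sequence[Vector], k: int,
                   iterations: int = 20) -> List[Vector]:
    """Cluster segment feature vectors into k representative vectors (k-means)."""
    # Naive initialization (assumption): take the first k vectors as centroids.
    centroids = list(vectors[:k])
    for _ in range(iterations):
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        for v in vectors:
            idx = min(range(k), key=lambda i: math.dist(v, centroids[i]))
            clusters[idx].append(v)
        # Keep the old centroid when a cluster ends up empty.
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments are stable, so the centroids have converged
        centroids = updated
    return centroids


# Hypothetical segment feature vectors forming two well-separated groups.
segment_vectors = [
    (0.0, 0.0), (10.0, 10.0), (0.0, 1.0),
    (10.0, 11.0), (1.0, 0.0), (11.0, 10.0),
]

codebook = build_codebook(segment_vectors, k=2)
print(codebook)
```

On this toy input the two centroids settle near (0.33, 0.33) and (10.33, 10.33). A production system would more likely use a library implementation with a better initialization strategy than this hand-rolled loop.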
Specification