Tree structured CRF with unary potential function using action unit features of other segments as context feature
First Claim
1. A method of determining a classification of a composite action including a plurality of action units in a video clip, the method comprising:
extracting a plurality of features from the video clip;
determining a corresponding feature in the plurality of features for each of temporal segments of the video clip;
determining an initial estimate of an action unit for each of the temporal segments using a potential function for each segment modeling dependency between a concatenation of features and a classification of a corresponding action unit by inputting a feature from a current temporal segment and a feature from at least one of preceding temporal segments or subsequent temporal segments as the concatenation of features;
aggregating the potential functions into a probability distribution; and
determining the classification of the composite action using the probability distribution by jointly inferring the classification of the composite action and classifications of the action units of the temporal segments based on the initial estimate of each action unit for each of the temporal segments.
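The per-segment steps of the claim (building a concatenation of features from the current and neighbouring segments, then scoring action unit labels with a potential function) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the names `context_features`, `unary_scores` and `initial_estimates`, the linear form of the potential, the window `offsets=(-1, 0, 1)` and the zero padding at clip boundaries are all hypothetical and are not taken from the patent text.

```python
def context_features(features, t, offsets=(-1, 0, 1)):
    """Concatenate, in a fixed order, the feature of segment t with the
    features of neighbouring segments.

    Assumption: neighbours that fall outside the clip are zero-padded.
    """
    dim = len(features[0])
    out = []
    for off in offsets:
        j = t + off
        if 0 <= j < len(features):
            out.extend(features[j])
        else:
            out.extend([0.0] * dim)
    return out


def unary_scores(weights, x):
    """Assumed linear unary potential: one score per action unit label.

    weights[u] is a hypothetical weight vector for label u, with the same
    length as the concatenated feature x.
    """
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def initial_estimates(features, weights, offsets=(-1, 0, 1)):
    """Return per-segment unary scores and the highest-scoring label
    for each temporal segment (the initial estimate)."""
    scores = [
        unary_scores(weights, context_features(features, t, offsets))
        for t in range(len(features))
    ]
    labels = [max(range(len(s)), key=s.__getitem__) for s in scores]
    return scores, labels


# Toy example: three segments, 2-dimensional features, two action unit labels.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
weights = [
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0],  # label 0 looks at dim 0 of the current segment
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],  # label 1 looks at dim 1 of the current segment
]
scores, labels = initial_estimates(features, weights)
```

Because the concatenation order is fixed, the same weight position always refers to the same relative segment (preceding, current, or subsequent), which is what lets the potential use other segments as context.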
Abstract
A method determines a composite action from a video clip using a conditional random field (CRF). The method includes determining a plurality of features from the video clip, each of the features having a corresponding temporal segment from the video clip. The method may continue by determining, for each of the temporal segments corresponding to one of the features, an initial estimate of an action unit label from a corresponding unary potential function, the corresponding unary potential function having as ordered input the plurality of features from a current temporal segment and at least one other of the temporal segments. The method may further include determining the composite action by jointly optimizing the initial estimate of the action unit labels.
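To make the joint optimization step concrete, the sketch below assumes the simplest tree-structured CRF consistent with the title: a star-shaped tree whose root is the composite action label and whose leaves are the action unit labels of the temporal segments. The tree shape, the pairwise compatibility table `pairwise`, and the function names `joint_inference` and `composite_posterior` are assumptions made for illustration; the abstract does not specify them. Scores are treated as unnormalised log-probabilities.

```python
import math


def joint_inference(unary, pairwise):
    """Exact MAP inference for an assumed star-shaped tree.

    unary[t][u]    : unary score of action unit label u at segment t
    pairwise[c][u] : assumed compatibility of composite label c with unit label u
    Returns (composite label, per-segment unit labels, joint score).
    """
    best = None
    for c, row in enumerate(pairwise):
        total = 0.0
        units = []
        for seg in unary:
            combined = [s + p for s, p in zip(seg, row)]
            u = max(range(len(combined)), key=combined.__getitem__)
            units.append(u)
            total += combined[u]
        if best is None or total > best[0]:
            best = (total, c, units)
    return best[1], best[2], best[0]


def composite_posterior(unary, pairwise):
    """Probability of each composite label, with the action unit labels
    summed out (log-sum-exp per segment, then normalisation)."""
    log_scores = []
    for row in pairwise:
        total = 0.0
        for seg in unary:
            terms = [s + p for s, p in zip(seg, row)]
            m = max(terms)
            total += m + math.log(sum(math.exp(v - m) for v in terms))
        log_scores.append(total)
    m = max(log_scores)
    weights = [math.exp(v - m) for v in log_scores]
    z = sum(weights)
    return [w / z for w in weights]


# Toy example: three segments, two action unit labels, two composite labels.
unary = [[2.0, 0.0], [0.0, 3.0], [2.0, 0.0]]
pairwise = [[1.0, 0.0], [0.0, 1.0]]
composite, units, score = joint_inference(unary, pairwise)
posterior = composite_posterior(unary, pairwise)
```

In a star-shaped tree the leaves are conditionally independent given the root, so fixing the composite label lets each segment pick its best action unit label separately; a deeper tree would need max-product message passing instead.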
9 Claims
1. A method of determining a classification of a composite action including a plurality of action units in a video clip, the method comprising:
extracting a plurality of features from the video clip;
determining a corresponding feature in the plurality of features for each of temporal segments of the video clip;
determining an initial estimate of an action unit for each of the temporal segments using a potential function for each segment modeling dependency between a concatenation of features and a classification of a corresponding action unit by inputting a feature from a current temporal segment and a feature from at least one of preceding temporal segments or subsequent temporal segments as the concatenation of features;
aggregating the potential functions into a probability distribution; and
determining the classification of the composite action using the probability distribution by jointly inferring the classification of the composite action and classifications of the action units of the temporal segments based on the initial estimate of each action unit for each of the temporal segments.
Dependent claims: 2, 3, 4, 5, 6, 7.

8. A non-transitory computer readable medium having a computer program recorded on the computer readable medium, the computer program being executable by a computer system to perform a method of determining a classification of a composite action including a plurality of action units in a video clip, the method comprising:
extracting a plurality of features from the video clip;
determining a corresponding feature in the plurality of features for each of temporal segments of the video clip;
determining an initial estimate of an action unit for each of the temporal segments using a potential function for each segment modeling dependency between a concatenation of features and a classification of a corresponding action unit by inputting a feature from a current temporal segment and a feature from at least one of preceding temporal segments or subsequent temporal segments as the concatenation of features;
aggregating the potential functions into a probability distribution; and
determining the classification of the composite action using the probability distribution by jointly inferring the classification of the composite action and classifications of the action units of the temporal segments based on the initial estimate of each action unit for each of the temporal segments.

9. A computer system, comprising:
a processor;
a memory having a computer program recorded thereon, the memory being in communication with the processor;
the processor executing the computer program to perform a method of determining a classification of a composite action including a plurality of action units in a video clip, the method comprising:
extracting a plurality of features from the video clip;
determining a corresponding feature in the plurality of features for each of temporal segments of the video clip;
determining an initial estimate of an action unit for each of the temporal segments using a potential function for each segment modeling dependency between a concatenation of features and a classification of a corresponding action unit by inputting a feature from a current temporal segment and a feature from at least one of preceding temporal segments or subsequent temporal segments as the concatenation of features;
aggregating the potential functions into a probability distribution; and
determining the classification of the composite action using the probability distribution by jointly inferring the classification of the composite action and classifications of the action units of the temporal segments based on the initial estimate of each action unit for each of the temporal segments.

Specification