METHODS AND APPARATUSES FOR VIDEO SEGMENTATION, CLASSIFICATION, AND RETRIEVAL USING IMAGE CLASS STATISTICAL MODELS
Abstract
Techniques for classifying video frames using statistical models of transform coefficients are disclosed. After optionally being decimated in time and space, image frames are transformed using a discrete cosine transform or Hadamard transform. The methods disclosed model image composition and operate on grayscale images. The resulting transform matrices are reduced using truncation, principal component analysis, or linear discriminant analysis to produce feature vectors. Feature vectors of training images for image classes are used to compute image class statistical models. Once image class statistical models are derived, individual frames are classified by the maximum likelihood resulting from the image class statistical models. Thus, the probabilities that a feature vector derived from a frame would be produced from each of the image class statistical models are computed. The frame is classified into the image class corresponding to the image class statistical model which produced the highest probability for the feature vector derived from the frame. Optionally, frame sequence information is taken into account by applying a hidden Markov model to represent image class transitions from the previous frame to the current frame. After computing all class probabilities for all frames in the video or sequence of frames using the image class statistical models and the image class transition probabilities, the class of the final frame is selected as the one having the maximum likelihood. The classes of previous frames are then selected in reverse order, each being the most likely predecessor of the class already determined for the frame that follows it.
42 Claims
-
1. A method for selecting a d-entry feature set for video classification from t training images, each of the t training images having v rows and h columns of sub-images, the method comprising the steps of:
-
computing t transform matrices by performing a transform on each of the t training images, each transform matrix having v rows and h columns of coefficient positions, each coefficient position having a transform coefficient associated with it; and
selecting d coefficient positions as the d-entry feature set based upon the transform coefficients in the t transform matrices.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
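For illustration only, the two steps of claim 1 can be sketched as follows. The claim says only that the d positions are selected "based upon the transform coefficients"; ranking positions by their variance across the training images is an assumption made here, as are the function names `dct_2d` and `select_feature_positions` and the choice of an orthonormal discrete cosine transform (the abstract also mentions a Hadamard transform).

```python
import math

def dct_2d(image):
    """Orthonormal 2-D DCT-II of a v x h image given as a list of rows."""
    v, h = len(image), len(image[0])

    def basis(n):
        # Rows index frequency k, columns index sample position i.
        return [[math.sqrt((1 if k == 0 else 2) / n)
                 * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                 for i in range(n)] for k in range(n)]

    cv, ch = basis(v), basis(h)
    # Transform along the row axis first, then along the column axis.
    tmp = [[sum(cv[k][i] * image[i][j] for i in range(v)) for j in range(h)]
           for k in range(v)]
    return [[sum(tmp[k][j] * ch[l][j] for j in range(h)) for l in range(h)]
            for k in range(v)]

def select_feature_positions(training_images, d):
    """Return d (row, col) coefficient positions.

    Assumed criterion (not stated in the claim): keep the positions whose
    coefficients vary most across the t transform matrices.
    """
    transforms = [dct_2d(img) for img in training_images]
    t = len(transforms)
    v, h = len(transforms[0]), len(transforms[0][0])
    scored = []
    for r in range(v):
        for c in range(h):
            values = [m[r][c] for m in transforms]
            mean = sum(values) / t
            variance = sum((x - mean) ** 2 for x in values) / t
            scored.append((variance, (r, c)))
    # Highest variance first; ties broken by position for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pos for _, pos in scored[:d]]
```

With training images that differ only in overall brightness, only the DC coefficient at position (0, 0) varies, so it is the first position selected.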
-
-
10. A method of generating a video image class statistical model, the method comprising the steps of:
-
determining a first d-entry mean vector having d mean positions, each mean position having a mean associated with it, wherein each mean position corresponds to a transform coefficient position in a transform matrix; and
determining a first d-entry variance feature vector having d variance positions, each variance position having a variance associated with it, wherein each variance position corresponds to one of the mean positions.
Dependent claims: 11, 13, 14, 15, 17, 18, 19
-
-
12. A method of generating s video image class statistical models, the method comprising the steps of:
-
for each of the s video image classes, determining a d-entry mean vector having d mean positions, each mean position having a mean associated with it, wherein each mean position corresponds to a transform coefficient position in a transform matrix; and
for each of the s video image classes, determining a d-entry variance feature vector having d variance positions, each variance position having a variance associated with it, wherein each variance position corresponds to one of the mean positions.
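As a minimal sketch of claims 10 and 12, each image class model can be held as a pair of d-entry vectors, one of means and one of variances, computed per coefficient position from that class's training feature vectors. The names `fit_class_model` and `fit_all_models`, the dictionary layout, and the use of the population variance (dividing by n rather than n - 1) are assumptions for illustration.

```python
def fit_class_model(feature_vectors):
    """Return (mean, variance) d-entry vectors for one image class.

    feature_vectors: list of d-entry lists, one per training image.
    """
    n = len(feature_vectors)
    d = len(feature_vectors[0])
    mean = [sum(f[k] for f in feature_vectors) / n for k in range(d)]
    variance = [sum((f[k] - mean[k]) ** 2 for f in feature_vectors) / n
                for k in range(d)]
    return mean, variance

def fit_all_models(features_by_class):
    """Fit one (mean, variance) model per class label."""
    return {label: fit_class_model(vectors)
            for label, vectors in features_by_class.items()}
```

Each entry k of the variance vector lines up with entry k of the mean vector, which in turn corresponds to one selected transform coefficient position.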
-
-
16. A method for classifying a video frame into one of s video image classes, the method comprising the steps of:
-
retrieving a d-entry feature vector corresponding to the video frame;
for each of the s video image classes, using the d-entry feature vector to compute an image class probability of the d-entry feature vector being produced by a corresponding one of s image class statistical models; and
classifying the video frame into the video image class corresponding to a maximum image class probability.
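The classification steps of claim 16 can be sketched as below, assuming each image class statistical model is a Gaussian with independent entries described by a mean vector and a variance vector. That Gaussian form, the variance floor, the use of log-likelihoods rather than raw probabilities, and the names `log_likelihood` and `classify` are assumptions for illustration.

```python
import math

def log_likelihood(x, mean, variance, floor=1e-6):
    """Log-probability of feature vector x under a diagonal Gaussian."""
    total = 0.0
    for xk, mk, vk in zip(x, mean, variance):
        vk = max(vk, floor)  # guard against zero variance
        total += -0.5 * (math.log(2 * math.pi * vk) + (xk - mk) ** 2 / vk)
    return total

def classify(x, models):
    """Return the label of the model giving x the highest likelihood.

    models: dict mapping label -> (mean vector, variance vector).
    """
    return max(models, key=lambda label: log_likelihood(x, *models[label]))
```

Because the logarithm is monotonic, choosing the maximum log-likelihood selects the same class as choosing the maximum probability, while avoiding underflow when d is large.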
-
-
20. A method for classifying a video frame into one of s video image classes, the method comprising the steps of:
-
retrieving a d-entry feature vector corresponding to the video frame; and
for each of the s video image classes, using the d-entry feature vector, s previous image class probabilities, a corresponding one of s class transition probability vectors, and a corresponding one of s image class statistical models to compute an image class probability for the d-entry feature vector.
Dependent claims: 21, 22, 23, 26, 27, 28
-
-
24. A method for segmenting a series of video frames into one of s video image classes, the method comprising the steps of:
-
for each of the video frames in the series, retrieving a d-entry feature vector corresponding to the video frame; and
for each of the s video image classes, using the d-entry feature vector, s previous image class probabilities, a corresponding one of s class transition probability vectors, and a corresponding one of s image class statistical models to compute an image class probability for the d-entry feature vector; and
generating a previous image class pointer corresponding to a maximum product of a previous image class probability and class transition probability;
for a last video frame in the series, classifying the video frame into the video image class corresponding to a maximum image class probability; and
for each of the video frames except the last video frame in the series, classifying a previous frame into the video image class indicated by the previous image class pointer.
-
-
25. A method of determining similarity of a video frame using an image class statistical model, comprising the steps of:
-
retrieving a feature vector corresponding to the video frame;
retrieving a mean vector of the image class statistical model; and
subtracting the mean vector from the feature vector to produce a difference vector.
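Claim 25 ends at the difference vector. The sketch below computes it and, as an assumption that goes beyond the claim, reduces it to a single similarity score by scaling each entry by the model's standard deviation; the names `difference_vector` and `normalized_distance` are hypothetical.

```python
import math

def difference_vector(feature, mean):
    """Subtract the model mean vector from the frame's feature vector."""
    return [x - m for x, m in zip(feature, mean)]

def normalized_distance(feature, mean, variance, floor=1e-6):
    """Assumed similarity measure: variance-normalized length of the
    difference vector (smaller means more similar to the class)."""
    diff = difference_vector(feature, mean)
    return math.sqrt(sum(d * d / max(v, floor)
                         for d, v in zip(diff, variance)))
```

A frame whose feature vector equals the class mean has a distance of zero under this measure.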
-
-
29. A computer readable storage medium, comprising:
-
computer readable program code embodied on said computer readable storage medium, said computer readable program code for programming a computer to perform a method for selecting a d-entry feature set for video classification from t training images, each of the t training images having v rows and h columns of sub-images, the method comprising the steps of:
computing t transform matrices by performing a transform on each of the t training images, each transform matrix having v rows and h columns of coefficient positions, each coefficient position having a transform coefficient associated with it; and
selecting d coefficient positions as the d-entry feature set based upon the transform coefficients in the t transform matrices.
-
-
30. A computer readable storage medium, comprising:
-
computer readable program code embodied on said computer readable storage medium, said computer readable program code for programming a computer to perform a method of generating a video image class statistical model, the method comprising the steps of:
determining a first d-entry mean vector having d mean positions, each mean position having a mean associated with it, wherein each mean position corresponds to a transform coefficient position in a transform matrix; and
determining a first d-entry variance feature vector having d variance positions, each variance position having a variance associated with it, wherein each variance position corresponds to one of the mean positions.
-
-
31. A computer readable storage medium, comprising:
-
computer readable program code embodied on said computer readable storage medium, said computer readable program code for programming a computer to perform a method of generating s video image class statistical models, the method comprising the steps of:
for each of the s video image classes, determining a d-entry mean vector having d mean positions, each mean position having a mean associated with it, wherein each mean position corresponds to a transform coefficient position in a transform matrix; and
for each of the s video image classes, determining a d-entry variance feature vector having d variance positions, each variance position having a variance associated with it, wherein each variance position corresponds to one of the mean positions.
-
-
32. A computer readable storage medium, comprising:
-
computer readable program code embodied on said computer readable storage medium, said computer readable program code for programming a computer to perform a method for classifying a video frame into one of s video image classes, the method comprising the steps of:
retrieving a d-entry feature vector corresponding to the video frame;
for each of the s video image classes, using the d-entry feature vector to compute an image class probability of the d-entry feature vector being produced by a corresponding one of s image class statistical models; and
classifying the video frame into the video image class corresponding to a maximum image class probability.
-
-
33. A computer readable storage medium, comprising:
-
computer readable program code embodied on said computer readable storage medium, said computer readable program code for programming a computer to perform a method for classifying a video frame into one of s video image classes, the method comprising the steps of:
retrieving a d-entry feature vector corresponding to the video frame; and
for each of the s video image classes, using the d-entry feature vector, s previous image class probabilities, a corresponding one of s class transition probability vectors, and a corresponding one of s image class statistical models to compute an image class probability for the d-entry feature vector.
-
-
34. A computer readable storage medium, comprising:
-
computer readable program code embodied on said computer readable storage medium, said computer readable program code for programming a computer to perform a method for segmenting a series of video frames into one of s video image classes, the method comprising the steps of:
for each of the video frames in the series, retrieving a d-entry feature vector corresponding to the video frame; and
for each of the s video image classes, using the d-entry feature vector, s previous image class probabilities, a corresponding one of s class transition probability vectors, and a corresponding one of s image class statistical models to compute an image class probability for the d-entry feature vector; and
generating a previous image class pointer corresponding to a maximum product of a previous image class probability and class transition probability;
for a last video frame in the series, classifying the video frame into the video image class corresponding to a maximum image class probability; and
for each of the video frames except the last video frame in the series, classifying a previous frame into the video image class indicated by the previous image class pointer.
-
-
35. A computer readable storage medium, comprising:
-
computer readable program code embodied on said computer readable storage medium, said computer readable program code for programming a computer to perform a method of determining similarity of a video frame using an image class statistical model, comprising the steps of:
retrieving a feature vector corresponding to the video frame;
retrieving a mean vector of the image class statistical model; and
subtracting the mean vector from the feature vector to produce a difference vector.
-
-
36. A computer system, comprising:
-
a processor; and
a processor readable storage medium having processor readable program code embodied on said processor readable storage medium, said processor readable program code for programming the computer system to perform a method for selecting a d-entry feature set for video classification from t training images, each of the t training images having v rows and h columns of sub-images, the method comprising the steps of:
computing t transform matrices by performing a transform on each of the t training images, each transform matrix having v rows and h columns of coefficient positions, each coefficient position having a transform coefficient associated with it; and
selecting d coefficient positions as the d-entry feature set based upon the transform coefficients in the t transform matrices.
-
-
37. A computer system, comprising:
-
a processor; and
a processor readable storage medium having processor readable program code embodied on said processor readable storage medium, said processor readable program code for programming the computer system to perform a method of generating a video image class statistical model, the method comprising the steps of:
determining a first d-entry mean vector having d mean positions, each mean position having a mean associated with it, wherein each mean position corresponds to a transform coefficient position in a transform matrix; and
determining a first d-entry variance feature vector having d variance positions, each variance position having a variance associated with it, wherein each variance position corresponds to one of the mean positions.
-
-
38. A computer system, comprising:
-
a processor; and
a processor readable storage medium having processor readable program code embodied on said processor readable storage medium, said processor readable program code for programming the computer system to perform a method of generating s video image class statistical models, the method comprising the steps of:
for each of the s video image classes, determining a d-entry mean vector having d mean positions, each mean position having a mean associated with it, wherein each mean position corresponds to a transform coefficient position in a transform matrix; and
for each of the s video image classes, determining a d-entry variance feature vector having d variance positions, each variance position having a variance associated with it, wherein each variance position corresponds to one of the mean positions.
-
-
39. A computer system, comprising:
-
a processor; and
a processor readable storage medium having processor readable program code embodied on said processor readable storage medium, said processor readable program code for programming the computer system to perform a method for classifying a video frame into one of s video image classes, the method comprising the steps of:
retrieving a d-entry feature vector corresponding to the video frame;
for each of the s video image classes, using the d-entry feature vector to compute an image class probability of the d-entry feature vector being produced by a corresponding one of s image class statistical models; and
classifying the video frame into the video image class corresponding to a maximum image class probability.
-
-
40. A computer system, comprising:
-
a processor; and
a processor readable storage medium having processor readable program code embodied on said processor readable storage medium, said processor readable program code for programming the computer system to perform a method for classifying a video frame into one of s video image classes, the method comprising the steps of:
retrieving a d-entry feature vector corresponding to the video frame; and
for each of the s video image classes, using the d-entry feature vector, s previous image class probabilities, a corresponding one of s class transition probability vectors, and a corresponding one of s image class statistical models to compute an image class probability for the d-entry feature vector.
-
-
41. A computer system, comprising:
-
a processor; and
a processor readable storage medium having processor readable program code embodied on said processor readable storage medium, said computer readable program code for programming the computer system to perform a method for segmenting a series of video frames into one of s video image classes, the method comprising the steps of:
for each of the video frames in the series, retrieving a d-entry feature vector corresponding to the video frame; and
for each of the s video image classes, using the d-entry feature vector, s previous image class probabilities, a corresponding one of s class transition probability vectors, and a corresponding one of s image class statistical models to compute an image class probability for the d-entry feature vector; and
generating a previous image class pointer corresponding to a maximum product of a previous image class probability and class transition probability;
for a last video frame in the series, classifying the video frame into the video image class corresponding to a maximum image class probability; and
for each of the video frames except the last video frame in the series, classifying a previous frame into the video image class indicated by the previous image class pointer.
-
-
42. A computer system, comprising:
-
a processor; and
a processor readable storage medium having a processor readable program code embodied on said processor readable storage medium, said processor readable program code for programming the computer system to perform a method of determining similarity of a video frame using an image class statistical model, comprising the steps of:
retrieving a feature vector corresponding to the video frame;
retrieving a mean vector of the image class statistical model; and
subtracting the mean vector from the feature vector to produce a difference vector.
-
Specification