Fast computation of kernel descriptors
Abstract
Computation of kernel descriptors is accelerated using precomputed tables. In one aspect, a fast algorithm for kernel descriptor computation takes O(1) operations per pixel in each patch, based on pre-computed kernel values. This speeds up computation of the kernel descriptor features under consideration to levels comparable with D-SIFT and color SIFT, and two orders of magnitude faster than STIP and HoG3D. In some examples, kernel descriptors are applied to extract gradient, flow and texture based features for video analysis. In tests of the approach on a large database of internet videos used in the TRECVID MED 2011 evaluations, the flow based kernel descriptors are up to two orders of magnitude faster than STIP and HoG3D, and also produce significant performance improvements. Further, using features from multiple color planes produces small but consistent gains.
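The table-based computation summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the Gaussian kernels, the 16 orientation bins, the 4x4 patch size, the basis points, and all function names are assumptions chosen for illustration. The sketch shows only the general idea that kernel values are computed once into tables, so that each pixel then costs a fixed number of lookups and multiplications.

```python
import math

N_BINS = 16   # assumed quantization of gradient orientation
PATCH = 4     # assumed patch side length in pixels


def gaussian_row(x, basis, gamma):
    """Gaussian kernel values between point x and each basis point."""
    return [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, c)))
            for c in basis]


def build_orientation_table(n_basis=4, gamma=5.0):
    """One row per orientation bin; built once, before any patch is processed."""
    basis = [(math.cos(2 * math.pi * j / n_basis),
              math.sin(2 * math.pi * j / n_basis)) for j in range(n_basis)]
    rows = []
    for k in range(N_BINS):
        theta = 2 * math.pi * k / N_BINS
        rows.append(gaussian_row((math.cos(theta), math.sin(theta)), basis, gamma))
    return rows


def build_position_table(gamma=3.0):
    """One row per location z = (row, col) inside a patch."""
    basis = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
    table = {}
    for r in range(PATCH):
        for c in range(PATCH):
            z = (r / (PATCH - 1), c / (PATCH - 1))
            table[(r, c)] = gaussian_row(z, basis, gamma)
    return table


def feature_vector(magnitude, orientation, ori_table, pos_table):
    """F(P): sum over z of magnitude(z) * (orientation row x position row)."""
    d_o, d_p = len(ori_table[0]), len(pos_table[(0, 0)])
    feat = [0.0] * (d_o * d_p)
    for (r, c), pos_row in pos_table.items():       # lookup by location z
        k = math.floor(orientation[r][c] / (2 * math.pi) * N_BINS) % N_BINS
        ori_row = ori_table[k]                      # lookup by attribute at z
        m = magnitude[r][c]
        for i in range(d_o):
            for j in range(d_p):
                feat[i * d_p + j] += m * ori_row[i] * pos_row[j]
    return feat


# Usage: a flat patch whose gradients all have unit magnitude and angle 0.
ori_table = build_orientation_table()
pos_table = build_position_table()
mag = [[1.0] * PATCH for _ in range(PATCH)]
ang = [[0.0] * PATCH for _ in range(PATCH)]
F = feature_vector(mag, ang, ori_table, pos_table)
```

The inner double loop has a fixed size set by the table widths, not by the patch size, which is what keeps the per-pixel cost constant in this sketch.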
9 Claims
1. A method for processing images in a video processing system comprising:
accepting a video signal having a series of images acquired by a camera for processing and identifying a plurality of patches within said images;
reading a plurality of stored kernel tables, each kernel table representing a mapping from a corresponding feature to a vector of values;
computing a feature vector F(P) for each patch P of the plurality of patches, including computing one or more summations over locations z in the patch P of terms, each term being a product of terms including a term obtained by a lookup in a corresponding kernel table according to the location z and/or an attribute of the patch P at the location z; and
processing the images according to the computed feature vectors for the plurality of patches to provide a video processor output.
Dependent claims: 2, 3, 4, 5
6. A method for image processing in a video processing system comprising:
accepting an input video having a series of images acquired by a camera for processing, and identifying patches within said images;
reading a plurality of stored kernel tables, each kernel table representing a mapping from a corresponding feature to a vector of values;
repeatedly computing similarities between pairs of patches for images being processed, computation of a similarity between a patch P and a patch Q comprises
computing for patch P one or more summations over locations z in the patch P of terms, each term being a product of terms including a term obtained by a lookup in a corresponding kernel table according to the location z and/or an attribute of the patch P at the location z,
computing for patch Q one or more summations over locations z in the patch Q of terms, each term being a product of terms including a term obtained by a lookup in a corresponding kernel table according to the location z and/or an attribute of the patch Q at the location z, and
combining the sums of the one or more summations for P and one or more summations for Q to determine a kernel descriptor similarity between P and Q; and
providing a video processor output comprising a result of processing the images using the computed similarities between the patches.
Dependent claims: 7
8. A video processing system comprising:
a kernel preprocessor to provide a plurality of stored kernel tables, each kernel table representing a mapping from a corresponding feature to a vector of values;
an input to accept an input video having a series of images acquired by a camera for processing, and identifying a plurality of patches within said images;
a similarity computation module to compute a feature vector F(P) for each patch P of the plurality of patches, including computing one or more summations over locations z in the patch P of terms, each term being a product of terms including a term obtained by a lookup in a corresponding kernel table according to the location z and/or an attribute of the patch P at the location z; and process the images according to the computed feature vectors for the plurality of patches; and
an output to provide a video processor output resulting from processing the images.
9. Software stored on a non-transitory computer-readable medium comprising instructions for causing a processor to:
accept a video input having a series of images acquired by a camera for processing, and identifying patches within said images;
read a plurality of stored kernel tables, each kernel table representing a mapping from a corresponding feature to a vector of values;
repeatedly compute similarities between pairs of patches for images being processed, computation of a similarity between a patch P and a patch Q comprises
computing for patch P one or more summations over locations z in the patch P of terms, each term being a product of terms including a term obtained by a lookup in a corresponding kernel table according to the location z and/or an attribute of the patch P at the location z,
computing for patch Q one or more summations over locations z in the patch Q of terms, each term being a product of terms including a term obtained by a lookup in a corresponding kernel table according to the location z and/or an attribute of the patch Q at the location z, and
combining the sums of the one or more summations for P and one or more summations for Q to determine a kernel descriptor similarity between P and Q; and
provide a video processor output comprising a result of processing the images using the computed similarities between the patches.
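The similarity computation recited in claims 6 and 9 can be pictured with a short sketch. It assumes a single kernel table indexed by a quantized patch attribute, a Gaussian kernel, eight quantization levels, and an inner product as the way the per-patch sums are combined. None of these choices, nor the names build_table, patch_sums and similarity, come from the claims, which leave the tables and the combining step general.

```python
import math

LEVELS = 8  # assumed number of quantized attribute levels


def build_table(n_basis=3, gamma=4.0):
    """Row v holds Gaussian kernel values between level v and each basis level."""
    centers = [j / (n_basis - 1) for j in range(n_basis)]
    return [[math.exp(-gamma * (v / (LEVELS - 1) - c) ** 2) for c in centers]
            for v in range(LEVELS)]


def patch_sums(patch, table):
    """Sum over locations z of weight(z) times the table row looked up at z."""
    sums = [0.0] * len(table[0])
    for row in patch:
        for weight, level in row:
            for i, value in enumerate(table[level]):
                sums[i] += weight * value
    return sums


def similarity(p, q, table):
    """Combine the per-patch sums for P and Q with an inner product."""
    sp, sq = patch_sums(p, table), patch_sums(q, table)
    return sum(a * b for a, b in zip(sp, sq))


# Usage: each patch entry is (weight, quantized attribute level).
table = build_table()
p = [[(1.0, 0), (1.0, 0)], [(1.0, 1), (1.0, 0)]]
q = [[(1.0, 7), (1.0, 7)], [(1.0, 6), (1.0, 7)]]
k_pp = similarity(p, p, table)
k_pq = similarity(p, q, table)
```

In this sketch the sums for a patch depend only on that patch, so they could be computed once and reused across every pair the patch takes part in.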
Specification