Background Detection As An Optimization For Gesture Recognition
First Claim
1. A computer-implemented image processing method for recognizing a gesture made by an object, the method comprising:
receiving, using at least one processing circuit, a plurality of image frames of a video, wherein each pixel of each of the plurality of image frames has a blue channel, a green channel, a red channel, and an alpha channel;
constructing, using at least one processing circuit, a plurality of statistical models of the plurality of image frames at a plurality of pixel granularity levels, the plurality of statistical models including:
at a first pixel granularity level, a spatio-temporal (S-T) histogram for each of the pixels from the plurality of image frames, wherein a first axis of the S-T histogram represents channel value bins, and wherein a second axis of the S-T histogram represents counts of image frames per bin;
at a second pixel granularity level higher than the first pixel granularity level, aggregate histograms for the blue, green, and red channels, respectively, based on aggregated pixel values at the second pixel granularity level;
constructing, using at least one processing circuit, a plurality of probabilistic models of an input image frame at a plurality of channel granularity levels based on the plurality of statistical models, the plurality of probabilistic models including:
at a first channel granularity level, a probability image from each of the S-T histogram and the aggregate histograms, wherein each of the probability images comprises a plurality of pixels each indicating a probability of a corresponding pixel in the input image being a background pixel;
at a second channel granularity level higher than the first channel granularity level, compact probability images from the probability images at the first channel granularity level;
merging the compact probability images based on a weighted average to form a single probability image;
subsampling pixels in the single probability image;
determining background pixels, based on a probability threshold value, from the subsampled single probability image; and
determining whether the plurality of image frames, when examined in a particular sequence, conveys a gesture by the object.
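The first-level statistical model and its probability image can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes a single 8-bit channel, 16 equal-width bins, and a background probability equal to the relative frequency of the bin an input value falls in. The names `build_st_histograms`, `probability_image`, `NUM_BINS`, and `BIN_WIDTH` are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # one channel of one frame: rows of 8-bit values

NUM_BINS = 16                # assumed bin count; the claim does not fix one
BIN_WIDTH = 256 // NUM_BINS  # width of each channel value bin


def build_st_histograms(frames: List[Frame]) -> List[List[List[int]]]:
    """Return hist[y][x][b]: number of frames whose pixel (x, y) fell in bin b."""
    height, width = len(frames[0]), len(frames[0][0])
    hist = [[[0] * NUM_BINS for _ in range(width)] for _ in range(height)]
    for frame in frames:
        for y in range(height):
            for x in range(width):
                hist[y][x][frame[y][x] // BIN_WIDTH] += 1
    return hist


def probability_image(hist, frame: Frame, num_frames: int) -> List[List[float]]:
    """Assumed estimate: P(background) is the relative frequency of the pixel's bin."""
    return [
        [hist[y][x][value // BIN_WIDTH] / num_frames for x, value in enumerate(row)]
        for y, row in enumerate(frame)
    ]


# Three 2x2 history frames, then one input frame to score.
history = [[[10, 200], [12, 90]], [[11, 201], [13, 95]], [[9, 40], [14, 92]]]
hist = build_st_histograms(history)
probs = probability_image(hist, [[10, 45], [250, 93]], len(history))
```

Under these assumptions, a pixel whose input value lands in a bin seen in every history frame scores 1.0, and a value in a bin never seen at that location scores 0.0.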
2 Assignments
0 Petitions
Abstract
Methods and systems are provided allowing for background identification and gesture recognition in video images. A computer-implemented image processing method includes: receiving, using at least one processing circuit, a plurality of image frames of a video; constructing, using at least one processing circuit, a plurality of statistical models of the plurality of image frames at a plurality of pixel granularity levels; constructing, using at least one processing circuit, a plurality of probabilistic models of an input image frame at a plurality of channel granularity levels based on the plurality of statistical models; merging at least some of the plurality of probabilistic models based on a weighted average to form a single probability image; determining background pixels, based on a probability threshold value, from the single probability image; and determining whether the plurality of image frames, when examined in a particular sequence, conveys a gesture by the object.
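The merging and thresholding steps named in the abstract might be sketched as follows. This is an illustration under stated assumptions, not the patented method: the weights, the 0.5 threshold, and the rule "probability at or above the threshold means background" are chosen for the example, and `merge_weighted` and `background_mask` are hypothetical names.

```python
from typing import List, Sequence

ProbImage = List[List[float]]


def merge_weighted(images: Sequence[ProbImage], weights: Sequence[float]) -> ProbImage:
    """Pixel-wise weighted average of same-sized probability images."""
    if not images or len(images) != len(weights):
        raise ValueError("need one weight per probability image")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    height, width = len(images[0]), len(images[0][0])
    return [
        [sum(w * img[y][x] for img, w in zip(images, weights)) / total
         for x in range(width)]
        for y in range(height)
    ]


def background_mask(prob: ProbImage, threshold: float) -> List[List[bool]]:
    """Assumed rule: a pixel is background when its probability meets the threshold."""
    return [[p >= threshold for p in row] for row in prob]


# Two probability images of the same input frame (values are made up).
pixel_level = [[0.9, 0.2], [0.1, 0.8]]
block_level = [[0.7, 0.6], [0.3, 1.0]]
merged = merge_weighted([pixel_level, block_level], weights=[3.0, 1.0])
mask = background_mask(merged, threshold=0.5)
```

Dividing by the sum of the weights keeps the merged values in the range 0 to 1 even when the weights are not normalized.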
32 Citations
27 Claims
1. A computer-implemented image processing method for recognizing a gesture made by an object, the method comprising:
receiving, using at least one processing circuit, a plurality of image frames of a video, wherein each pixel of each of the plurality of image frames has a blue channel, a green channel, a red channel, and an alpha channel;
constructing, using at least one processing circuit, a plurality of statistical models of the plurality of image frames at a plurality of pixel granularity levels, the plurality of statistical models including:
at a first pixel granularity level, a spatio-temporal (S-T) histogram for each of the pixels from the plurality of image frames, wherein a first axis of the S-T histogram represents channel value bins, and wherein a second axis of the S-T histogram represents counts of image frames per bin;
at a second pixel granularity level higher than the first pixel granularity level, aggregate histograms for the blue, green, and red channels, respectively, based on aggregated pixel values at the second pixel granularity level;
constructing, using at least one processing circuit, a plurality of probabilistic models of an input image frame at a plurality of channel granularity levels based on the plurality of statistical models, the plurality of probabilistic models including:
at a first channel granularity level, a probability image from each of the S-T histogram and the aggregate histograms, wherein each of the probability images comprises a plurality of pixels each indicating a probability of a corresponding pixel in the input image being a background pixel;
at a second channel granularity level higher than the first channel granularity level, compact probability images from the probability images at the first channel granularity level;
merging the compact probability images based on a weighted average to form a single probability image;
subsampling pixels in the single probability image;
determining background pixels, based on a probability threshold value, from the subsampled single probability image; and
determining whether the plurality of image frames, when examined in a particular sequence, conveys a gesture by the object.
Dependent claims: 2, 3, 4, 5
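Claim 1 adds a subsampling step before background pixels are determined. One way to read that step is sketched below, assuming a regular grid that keeps every `step`-th pixel along both axes; the claim does not specify the sampling scheme, and `subsample` and `background_pixels` are hypothetical names.

```python
from typing import List, Tuple

ProbImage = List[List[float]]


def subsample(prob: ProbImage, step: int) -> ProbImage:
    """Keep every step-th pixel along both axes (assumed regular-grid scheme)."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return [row[::step] for row in prob[::step]]


def background_pixels(prob: ProbImage, threshold: float, step: int) -> List[Tuple[int, int]]:
    """Return full-resolution (x, y) coordinates of sampled pixels whose
    background probability meets the threshold."""
    small = subsample(prob, step)
    return [
        (x * step, y * step)
        for y, row in enumerate(small)
        for x, p in enumerate(row)
        if p >= threshold
    ]


# A made-up 4x4 single probability image, sampled on a 2-pixel grid.
single = [
    [0.9, 0.8, 0.2, 0.1],
    [0.9, 0.7, 0.3, 0.1],
    [0.6, 0.4, 0.2, 0.9],
    [0.5, 0.5, 0.1, 0.9],
]
coords = background_pixels(single, threshold=0.5, step=2)
```

With `step=2` only a quarter of the pixels are tested against the threshold, which is the kind of saving a subsampling step would offer.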
6. A computer-implemented image processing method for recognizing a gesture made by an object, the method comprising:
receiving, using at least one processing circuit, a plurality of image frames of a video;
constructing, using at least one processing circuit, a plurality of statistical models of the plurality of image frames at a plurality of pixel granularity levels;
constructing, using at least one processing circuit, a plurality of probabilistic models of an input image frame at a plurality of channel granularity levels based on the plurality of statistical models;
merging at least some of the plurality of probabilistic models based on a weighted average to form a single probability image;
determining background pixels, based on a probability threshold value, from the single probability image; and
determining whether the plurality of image frames, when examined in a particular sequence, conveys a gesture by the object.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
19. An image processing system comprising at least one processing circuit configured to:
receive a plurality of image frames of a video;
construct a plurality of statistical models of the plurality of image frames at a plurality of pixel granularity levels;
construct a plurality of probabilistic models of an input image frame at a plurality of channel granularity levels based on the plurality of statistical models;
merge at least some of the plurality of probabilistic models based on a weighted average to form a single probability image;
determine background pixels, based on a probability threshold value, from the single probability image; and
determine whether the plurality of image frames, when examined in a particular sequence, conveys a gesture by an object within the frames.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27
Specification