System and method for detecting a human face in uncontrolled environments
Abstract
The present invention provides for the detection of human heads, faces and eyes in real-time and in uncontrolled environments. The present invention may be implemented with commercially available components, such as a standard video camera and a frame grabber, on a personal computer (PC) platform. The approach used by the present invention is based on a probabilistic framework that uses a deformable template model to describe the human face. The present invention works with simple head-and-shoulder video sequences as well as with complex video scenes with multiple people and random motion. The present invention is able to locate the eyes from different head poses (rotations in the image plane as well as in depth). The information provided by the location of the eyes may be used to extract faces in a frontal pose from a video sequence. The extracted frontal frames can be passed to recognition and classification systems (or the like) for further processing.
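As a rough illustration of the pipeline the abstract describes (background subtraction, high-intensity thresholding, ellipse fitting), the following NumPy sketch shows the first stages. The threshold value and the bounding-box ellipse estimate are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def detect_face_region(background, frame, high_thresh=30):
    """Foreground extraction by background subtraction, as in claim steps
    (i)-(iv): pixels whose absolute difference from the stored background
    exceeds a high-intensity threshold are kept.  The threshold value is
    an illustrative assumption, not taken from the patent."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > high_thresh

def fit_ellipse_bbox(mask):
    """Crude ellipse parameters (centroid, semi-axes) from the bounding
    box of the foreground mask -- a simple stand-in for the patent's
    iterative deformable-template fit."""
    ys, xs = np.nonzero(mask)
    cx, cy = xs.mean(), ys.mean()
    a = (xs.max() - xs.min()) / 2.0   # semi-axis along x
    b = (ys.max() - ys.min()) / 2.0   # semi-axis along y
    return cx, cy, a, b
```

In practice the subtraction would run per frame against the stored background image, with the ellipse parameters refined iteratively as the dependent claims describe.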
8 Claims
1. A system for detecting a face within a video image, comprising:
(a) a video camera;
(b) means for storing an image from the video camera; and
(c) processing means coupled to the video camera and the storing means for performing the steps of:
(i) storing a background image from the video camera in the storing means;
(ii) storing a video image from the video camera in the storing means;
(iii) subtracting the background image from the video image stored in the storing means;
(iv) identifying a region within the video image that surpasses a selected high intensity threshold;
(v) comparing the identified region to at least one model of a face;
(vi) selecting one of the at least one model that best describes the identified region;
(vii) generating parameters associated with an ellipse that corresponds to the identified region, responsive to step (vi);
(viii) identifying sub-regions within the identified region that are below a selected low-intensity threshold;
(ix) filtering out sub-regions below a selected small size or above a selected large size;
(x) comparing the remaining sub-regions to at least one anthropological model defining eyes; and
(xi) generating parameters corresponding to the remaining sub-regions, responsive to step (x).
Dependent claims 2, 3, and 7 further recite, in part:
(1) computing the vertical skeleton of the video image;
(2) estimating an initial ellipse centroid from the highest and lowest point of the vertical skeleton;
(3) measuring the width between the left and the right edges of the video image;
(4) measuring the length between the highest point of the vertical skeleton and the y coordinate of the ellipse centroid at the current iteration;
(5) computing the error e(k) associated with the currently determined ellipse parameters according to the expression:
[expression not reproduced in this listing]
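The ellipse-fitting substeps above can be sketched as follows. The vertical-skeleton, centroid, width, and length computations follow substeps (1)-(4); the error measure e(k) is an assumed stand-in (the claim's actual expression does not appear in this listing) that compares the measured length-to-half-width ratio against an assumed ideal face aspect ratio.

```python
import numpy as np

IDEAL_ASPECT = 1.4  # assumed height/width ratio of a face ellipse (not from the patent)

def vertical_skeleton(mask):
    """Substep (1): midpoint of the left and right region edges in every
    occupied row.  Returns the row indices and the midpoint columns."""
    rows = [r for r in range(mask.shape[0]) if mask[r].any()]
    mids = []
    for r in rows:
        cols = np.nonzero(mask[r])[0]
        mids.append((cols[0] + cols[-1]) / 2.0)
    return np.array(rows), np.array(mids)

def estimate_ellipse(mask):
    """One pass of the claimed fit: centroid from the skeleton extremes
    (substep (2)), width between the left and right edges (3), length
    from the top of the skeleton to the centroid's y coordinate (4), and
    an assumed error e(k) (5) -- the patent's actual error expression is
    not reproduced in this listing."""
    rows, mids = vertical_skeleton(mask)
    cy = (rows[0] + rows[-1]) / 2.0           # centroid y from skeleton extremes
    cx = mids.mean()                          # centroid x along the skeleton
    width = max(
        np.nonzero(mask[r])[0][-1] - np.nonzero(mask[r])[0][0] + 1 for r in rows
    )
    length = cy - rows[0]                     # skeleton top to centroid y
    e_k = abs(length * 2.0 / width - IDEAL_ASPECT)  # assumed error measure
    return (cx, cy), width, length, e_k
```

In the patent this estimate would be refined over several iterations, with e(k) driving convergence; the single pass here only illustrates the measurements involved.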
4. A process for detecting a face within a video image, wherein the video image is generated by a video camera and stored with a storage device, comprising the steps of:
(a) storing a background image from the video camera in the storage device;
(b) storing a video image from the video camera in the storage device;
(c) subtracting the background image from the video image stored in the storage device;
(d) identifying a region within the video image that surpasses a selected high intensity threshold;
(e) comparing the identified region to at least one model of a face;
(f) selecting one of the at least one model that best describes the identified region; and
(g) generating parameters associated with an ellipse that corresponds to the identified region, responsive to step (f);
(h) identifying sub-regions within the identified region that are below a selected low-intensity threshold;
(i) filtering out sub-regions below a selected small size or above a selected large size;
(j) comparing the remaining sub-regions to at least one anthropological model defining eyes; and
(k) generating parameters corresponding to the remaining sub-regions, responsive to step (j).
Dependent claims 5, 6, and 8 further recite, in part:
(1) computing the vertical skeleton of the video image;
(2) estimating an initial ellipse centroid from the highest and lowest point of the vertical skeleton;
(3) measuring the width between the left and the right edges of the video image;
(4) measuring the length between the highest point of the vertical skeleton and the y coordinate of the ellipse centroid at the current iteration;
(5) computing the error e(k) associated with the currently determined ellipse parameters according to the expression:
[expression not reproduced in this listing]
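The eye-detection steps, (h)-(j) above, can be sketched as follows: dark sub-regions inside the face region are collected as eye candidates and then filtered by size. The threshold values, the size band, and the simple 4-connected flood-fill labelling are illustrative assumptions rather than the patent's implementation; a further anthropological check (e.g., two candidates at roughly the same height) would follow in step (j).

```python
import numpy as np

def find_eye_candidates(gray, face_mask, low_thresh=60,
                        min_size=4, max_size=200):
    """Steps (h)-(i): pixels inside the face region that fall below a
    low-intensity threshold are grouped into connected sub-regions, and
    sub-regions outside an admissible size band are filtered out.  All
    numeric parameters here are illustrative assumptions."""
    dark = (gray < low_thresh) & face_mask
    labels = np.zeros(dark.shape, dtype=int)
    regions = []
    for r in range(dark.shape[0]):
        for c in range(dark.shape[1]):
            if dark[r, c] and labels[r, c] == 0:
                # 4-connected flood fill to collect one sub-region
                stack, pix = [(r, c)], []
                labels[r, c] = len(regions) + 1
                while stack:
                    y, x = stack.pop()
                    pix.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < dark.shape[0] and 0 <= nx < dark.shape[1]
                                and dark[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = len(regions) + 1
                            stack.append((ny, nx))
                regions.append(pix)
    # step (i): drop sub-regions outside the admissible size band
    return [p for p in regions if min_size <= len(p) <= max_size]
```

The surviving candidates would then be compared against an anthropological eye model, and parameters generated for the best matches, as steps (j)-(k) recite.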
Specification