Object image detecting method and system
Abstract
In a dictionary image generating section, images of objects each belonging to any one of L categories are taken by a camera from predetermined M directions, a matching region of the object to be recognized is extracted from each of the object images, and dictionary images (N=L×M, n=1, 2, . . . , N) which are the representatives of combinations of direction and category are generated and stored together with the directions of the objects. When a test image is given, the degree of similarity r(n, X, Y) between an n-th dictionary image and the image region at the matching position (X, Y) in the test image is computed by a matching section. This matching process is repeated with the matching position (X, Y) and the number n of the dictionary image being varied, and the matching position (Xmax, Ymax) where the degree of similarity r(n, X, Y) becomes the highest is detected.
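The matching loop summarized in the abstract can be sketched as a brute-force search over positions and dictionary images. The following is a minimal illustration, assuming grayscale NumPy arrays and normalized cross-correlation as the similarity measure (the patent does not fix a particular measure); the function and variable names are illustrative, not taken from the patent.

```python
import numpy as np

def match_dictionary(test_image, dictionary, region_h, region_w):
    """Slide a matching window over the test image and score it against
    every dictionary image IDn, returning the best (r, n, Xmax, Ymax).
    Normalized cross-correlation is an assumed choice of similarity."""
    H, W = test_image.shape
    best = (-np.inf, None, None, None)  # (score, n, X, Y)
    for Y in range(H - region_h + 1):            # vary matching position
        for X in range(W - region_w + 1):
            region = test_image[Y:Y + region_h, X:X + region_w]
            for n, dic in enumerate(dictionary):  # vary dictionary number n
                r = np.corrcoef(region.ravel(), dic.ravel())[0, 1]
                if r > best[0]:
                    best = (r, n, X, Y)
    return best
```

An exhaustive scan like this is quadratic in image size per dictionary image; the block-image representation described in the claims below is one way the patent reduces that cost.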
16 Claims
1. A method for detecting a face image in an input test image as a face region by matching each of successive regions of said input test image with dictionary images IDn which are produced from images obtained by taking images of reference faces belonging to L categories from predetermined M directions, where L is an integer equal to one or greater, M is an integer equal to two or greater and is so selected that an angle between two adjacent directions from which images of said reference faces are taken is equal to or greater than 10° and no greater than 40°, and n=1, 2, . . . , N, and N=L×M, said method comprising the steps of:
(a) updating a matching position (X,Y) of said input test image and outputting said updated matching position (X,Y);
(b) cutting out, as a matching region image, an image of a region of a predetermined size on the basis of said matching position (X,Y) in said input test image;
(c) computing a degree of similarity r(n) between said matching region image and an n-th dictionary image;
(d) repeating the computation of said degree of similarity r(n) with said matching position (X,Y) and said n varied, thereby obtaining the degree of similarity r(n) between said matching region image and each of respective dictionary images of said L categories and M directions;
(e) detecting said face region in said test image by obtaining the matching position where said degree of similarity obtained by said step (d) becomes the maximum as a face position (Xmax, Ymax); and
(f) comparing said degree of similarity r(n) with a predetermined threshold value to determine, based on the comparison, whether or not the face image in said test image belongs to at least one of said categories;
wherein each said dictionary image is generated as a series of dictionary block image information consisting of q pieces of block image information, each composed of a representative pixel value of a corresponding one of q blocks obtained by dividing the matching region in the image of said reference face into w pieces in a longitudinal direction and v pieces in a lateral direction, where q=v×w, and wherein said step of computing the degree of similarity is a step of generating a series of test block image information consisting of q pieces of block information, each composed of a representative pixel value of a corresponding one of q blocks obtained by dividing said matching region at the matching position (X,Y) in said test image into w pieces in the longitudinal direction and v pieces in the lateral direction and computing the degree of similarity between the series of test block image information and the n-th series of dictionary block image information as the degree of said similarity r(n).
View Dependent Claims (2, 3, 6, 7, 8)
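The block-image representation in the wherein clause above can be illustrated as follows. Taking the block mean as the "representative pixel value" and normalized cross-correlation as the similarity r(n) are assumptions of this sketch; the claim requires only some representative value per block and does not name a similarity measure.

```python
import numpy as np

def block_features(region, v, w):
    """Divide `region` into w pieces longitudinally (vertically) and
    v pieces laterally (horizontally) and return the q = v*w block
    means as the series of block image information."""
    H, W = region.shape
    assert H % w == 0 and W % v == 0, "region must divide evenly into blocks"
    bh, bw = H // w, W // v
    # reshape so axes 1 and 3 run within each block, then average them
    return region.reshape(w, bh, v, bw).mean(axis=(1, 3)).ravel()

def block_similarity(test_region, dict_region, v, w):
    """Similarity r(n) between the test and dictionary block series
    (normalized cross-correlation, an illustrative choice)."""
    a = block_features(test_region, v, w)
    b = block_features(dict_region, v, w)
    return float(np.corrcoef(a, b)[0, 1])
```

Comparing q block values instead of every pixel is what makes the repeated matching over all positions and all N dictionary images tractable.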
a step of holding, as a background image, an input image taken a fixed time before said input test image is taken in the same direction;
a step of producing a difference image between a present input test image and said background image;
a step of obtaining a size of a face region based on said difference image;
a step of determining a range of block size based on said size of the face region; and
a step of repeating the processing for computing the degree of similarity for each block size varied a fixed width step by step from an initial value within said range of the block size.
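The background-difference steps above (hold a background frame, difference it against the present frame, and derive the face-region size) might be sketched as below. The change threshold and the use of a simple bounding box over changed pixels are assumptions of this illustration.

```python
import numpy as np

def face_region_size(test_image, background, thresh=0.1):
    """Estimate the size of the face region from the difference image
    between the present input image and a background image held a fixed
    time earlier. Returns the bounding-box (height, width) of pixels
    whose absolute difference exceeds `thresh` (an assumed threshold)."""
    diff = np.abs(test_image.astype(float) - background.astype(float))
    ys, xs = np.nonzero(diff > thresh)
    if ys.size == 0:
        return (0, 0)  # no change detected
    return (int(ys.max() - ys.min() + 1), int(xs.max() - xs.min() + 1))
```

The returned size would then bound the range of block sizes tried in the repeated similarity computation.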
6. The method according to claim 1, 4 or 5, further including the step of selecting one of the categories which provides a maximum of said degree of similarity r(n), thereby determining the one of the categories to which the face image in said test image belongs.
7. The method according to claim 1, 4 or 5, further including the steps of:
generating a partial dictionary image from a predetermined region including said feature point Pf in each of said dictionary images IDn, where n=1, 2, . . . , N, and storing said partial dictionary image together with said number n and said position (xf,yf) in said dictionary image in partial dictionary image storage means;
computing, as a partial matching processing, the degree of similarity rf between a region around the position (Xf, Yf) of said feature point Pf in said test image and said partial dictionary image generated from said n-th dictionary image when said test image is given; and
performing said partial matching processing with the matching position (X, Y) varied, defining the position where the degree of similarity rf becomes the highest as the optimum position (Xfmax, Yfmax) of said feature point Pf, and outputting said optimum position for each feature point.
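The partial-matching steps of claim 7 refine each feature point by scoring a small window around its projected position (Xf, Yf) against a partial dictionary image. In the sketch below, the search radius and the use of a sum-of-squared-differences score (negated so that higher means more similar) are illustrative assumptions; the claim does not fix either.

```python
import numpy as np

def refine_feature_point(test_image, partial_dict, Xf, Yf, search=2):
    """Partial matching: compare regions around (Xf, Yf) in the test
    image with the partial dictionary image and return the position
    (Xfmax, Yfmax) where the similarity rf is highest."""
    h, w = partial_dict.shape
    best = (-np.inf, Xf, Yf)  # (rf, x, y)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = Yf + dy, Xf + dx
            patch = test_image[y:y + h, x:x + w]
            if patch.shape != (h, w):
                continue  # window falls outside the test image
            rf = -float(np.sum((patch - partial_dict) ** 2))
            if rf > best[0]:
                best = (rf, x, y)
    return best[1], best[2]  # (Xfmax, Yfmax)
```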
8. The method according to claim 7, further including the step of determining to which one of the categories the face image in said test image belongs based on the degree of similarity r(n) between said face image in said test image and the dictionary image and the degree of similarity rf between said face image in said test image and the partial dictionary image.
4. A method for detecting a face image in an input test image as a face region by matching each of successive regions of said input test image with dictionary images IDn which are produced from images obtained by taking images of reference faces belonging to L categories from predetermined M directions, where L is an integer equal to one or greater, M is an integer equal to two or greater and is so selected that an angle between two adjacent directions from which images of said reference faces are taken is equal to or greater than 10° and no greater than 40°, and n=1, 2, . . . , N, and N=L×M, said method comprising the steps of:
(a) updating a matching position (X,Y) of said input test image and outputting said updated matching position (X,Y);
(b) cutting out, as a matching region image, an image of a region of a predetermined size on the basis of said matching position (X,Y) in said input test image;
(c) computing a degree of similarity r(n) between said matching region image and an n-th dictionary image;
(d) repeating the computation of said degree of similarity r(n) with said matching position (X,Y) and said n varied, thereby obtaining the degree of similarity r(n) between said matching region image and each of respective dictionary images of said L categories and M directions;
(e) detecting said face region in said test image by obtaining the matching position where said degree of similarity obtained by said step (d) becomes the maximum as a face position (Xmax, Ymax); and
(f) comparing said degree of similarity r(n) with a predetermined threshold value to determine, based on the comparison, whether or not the face image in said test image belongs to at least one of said categories;
wherein said step of obtaining the degree of similarity includes transforming the intensity of each pixel so that mean and variance of the intensity of pixels within a matching region of each matching region image and dictionary image become predetermined values, respectively, and obtaining the degree of similarity of the transformed matching region image and the transformed dictionary image.
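The brightness normalization in claim 4's wherein clause fixes the mean and variance of each matching region and dictionary image before the similarity is computed. A minimal sketch, with arbitrary illustrative target values:

```python
import numpy as np

def normalize_brightness(region, target_mean=0.5, target_std=0.2):
    """Transform each pixel intensity so that the mean and standard
    deviation of the region become predetermined values (the targets
    here are assumed, not taken from the patent)."""
    mu, sigma = region.mean(), region.std()
    if sigma == 0:
        # a flat region carries no contrast; map it to the target mean
        return np.full_like(region, target_mean, dtype=float)
    return (region - mu) / sigma * target_std + target_mean
```

Normalizing both images this way makes the similarity score insensitive to overall lighting differences between the test scene and the conditions under which the reference faces were photographed.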
5. A method for detecting a face image in an input test image as a face region by matching each of successive regions of said input test image with dictionary images IDn which are produced from images obtained by taking images of reference faces belonging to L categories from predetermined M directions, where L is an integer equal to one or greater, M is an integer equal to two or greater and is so selected that an angle between two adjacent directions from which images of said reference faces are taken is equal to or greater than 10° and no greater than 40°, and n=1, 2, . . . , N, and N=L×M, said method comprising the steps of:
(a) updating a matching position (X,Y) of said input test image and outputting said updated matching position (X,Y);
(b) cutting out, as a matching region image, an image of a region of a predetermined size on the basis of said matching position (X,Y) in said input test image;
(c) computing a degree of similarity r(n) between said matching region image and an n-th dictionary image;
(d) repeating the computation of said degree of similarity r(n) with said matching position (X,Y) and said n varied, thereby obtaining the degree of similarity r(n) between said matching region image and each of respective dictionary images of said L categories and M directions;
(e) detecting said face region in said test image by obtaining the matching position where said degree of similarity obtained by said step (d) becomes the maximum as a face position (Xmax, Ymax);
(f) comparing said degree of similarity r(n) with a predetermined threshold value to determine, based on the comparison, whether or not the face image in said test image belongs to at least one of said categories;
(g) cutting out, as one of said dictionary images, an image of a predetermined matching region from each of the images of said reference faces belonging to said L categories taken from said M directions;
(h) storing the positions (xf, yf) of F feature points Pf, F being an integer of value one or greater, in each of said dictionary images, in correspondence with said dictionary images, into a dictionary image storage means;
(i) superimposing the dictionary image having a highest degree of similarity on said face position (Xmax, Ymax) in said test image, converting the position of said feature point corresponding to said dictionary image into the position (Xf, Yf) in said test image and outputting the converted position;
(j) cutting out a region of the face image to be recognized in said test image as the matching region;
(k) storing in positional relation storage means the positional relation between feature points extracted from an image obtained when said reference face is taken from a basic direction;
(l) extracting feature points from matching regions of said dictionary image and said test image, respectively;
(m) geometrically transforming the matching regions of an n-th dictionary image and said test image so that the positional relation between the feature points of said test image coincides with the positional relation stored in said positional relation storage means; and
(n) computing the degree of similarity r(n) between said geometrically transformed matching regions of said test image and said n-th dictionary image.
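The geometric normalization in steps (k) through (m) amounts to transforming the matching region so that its detected feature points coincide with the positional relation stored for the basic direction. One common way to obtain such a transform is a least-squares affine fit; the claim does not name a particular transform, so this is an assumed choice.

```python
import numpy as np

def affine_from_points(src, dst):
    """Solve for the 2x3 affine transform mapping feature points `src`
    (as detected in the test image) onto `dst` (the stored positional
    relation). At least three non-collinear point pairs are assumed."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # homogeneous coordinates: each row is [x, y, 1]
    G = np.hstack([src, np.ones((len(src), 1))])
    A, *_ = np.linalg.lstsq(G, dst, rcond=None)
    return A.T  # maps [x, y, 1] -> [x', y']
```

Warping the matching region with this transform before computing r(n) removes in-plane rotation, scale, and translation differences between the test face and the dictionary face.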
9. A system for detecting a face image in an input test image as a face region comprising:
dictionary image storage means for storing dictionary images IDn which are produced from images obtained by taking images of reference faces belonging to L categories from predetermined M directions, where L is an integer equal to one or greater, M is an integer equal to two or greater and is so selected that an angle between two adjacent directions from which images of said reference faces are taken is equal to or greater than 10° and no greater than 40°, and n=1, 2, . . . , N, and N=L×M;
position shift means for updating a matching position (X,Y) of said input test image and outputting said updated matching position (X,Y);
matching region cut-out means for cutting out, as a matching region image, an image of a region of a predetermined size on the basis of said matching position (X,Y) in said input test image;
similarity computing means for computing the degree of similarity r(n) between said matching region image and an n-th one of said dictionary images;
face region detecting means for repeating the computation of said degree of similarity r(n) with said matching position (X,Y) and said n varied, respectively, thereby obtaining the degrees of similarity r(n) between said matching region image and each of respective dictionary images of L categories and M directions and for detecting the face region in said test image by obtaining the matching position where said degree of similarity becomes maximum as a face position (Xmax, Ymax); and
decision means for determining whether or not the face image in said test image belongs to at least one of said categories based on said degree of similarity r(n) between said face image in said test image and the dictionary images representative of at least one of the categories, said decision means including means for comparing the degrees of said similarity r(n) with a predetermined threshold value and determining, based on the comparison, whether the face image in said test image belongs to said one of the categories;
wherein each said dictionary image is stored in said dictionary image storage means as a series of dictionary block image information consisting of q pieces of block image information, each composed of a representative pixel value of a corresponding one of q blocks obtained by dividing the matching region in the image of said reference face into w pieces in a longitudinal direction and v pieces in a lateral direction, where q=v×w, and wherein said similarity computing means includes means for generating a series of test block image information consisting of q pieces of block information, each composed of a representative pixel value of a corresponding one of q blocks obtained by dividing said matching region at the matching position (X,Y) in said test image into w pieces in said longitudinal direction and v pieces in said lateral direction and computing the degree of similarity between the series of test block image information and the n-th series of dictionary block image information as the degree of similarity r(n).
View Dependent Claims (10, 12)
11. A system for detecting a face image in an input test image as a face region comprising:
dictionary image storage means for storing dictionary images IDn which are produced from images obtained by taking images of reference faces belonging to L categories from predetermined M directions, where L is an integer equal to one or greater, M is an integer equal to two or greater and is so selected that an angle between two adjacent directions from which images of said reference faces are taken is equal to or greater than 10° and no greater than 40°, and n=1, 2, . . . , N, and N=L×M;
position shift means for updating a matching position (X,Y) of said input test image and outputting said updated matching position (X,Y);
matching region cut-out means for cutting out, as a matching region image, an image of a region of a predetermined size on the basis of said matching position (X,Y) in said input test image;
similarity computing means for computing the degree of similarity r(n) between said matching region image and an n-th one of said dictionary images;
face region detecting means for repeating the computation of said degree of similarity r(n) with said matching position (X,Y) and said n varied, respectively, thereby obtaining the degrees of similarity r(n) between said matching region image and each of respective dictionary images of L categories and M directions and for detecting the face region in said test image by obtaining the matching position where said degree of similarity becomes maximum as a face position (Xmax, Ymax); and
decision means for determining whether or not the face image in said test image belongs to at least one of said categories based on said degree of similarity r(n) between said face image in said test image and the dictionary images representative of at least one of the categories, said decision means including means for comparing the degrees of said similarity r(n) with a predetermined threshold value and determining, based on the comparison, whether the face image in said test image belongs to said one of the categories;
said system further including brightness normalization means for transforming the intensity of each pixel so that mean and variance of the intensity of pixels within the matching region of each of said matching region image and said dictionary image become predetermined values, respectively, and obtaining the degree of similarity of the transformed matching region image and the transformed dictionary image.
View Dependent Claims (16)
13. A system for detecting a face image in an input test image as a face region comprising:
dictionary image storage means for storing dictionary images IDn which are produced from images obtained by taking images of reference faces belonging to L categories from predetermined M directions, where L is an integer equal to one or greater, M is an integer equal to two or greater and is so selected that an angle between two adjacent directions from which images of said reference faces are taken is equal to or greater than 10° and no greater than 40°, and n=1, 2, . . . , N, and N=L×M;
position shift means for updating a matching position (X,Y) of said input test image and outputting said updated matching position (X,Y);
matching region cut-out means for cutting out, as a matching region image, an image of a region of a predetermined size on the basis of said matching position (X,Y) in said input test image;
similarity computing means for computing the degree of similarity r(n) between said matching region image and an n-th one of said dictionary images;
face region detecting means for repeating the computation of said degree of similarity r(n) with said matching position (X,Y) and said n varied, respectively, thereby obtaining the degrees of similarity r(n) between said matching region image and each of respective dictionary images of L categories and M directions and for detecting the face region in said test image by obtaining the matching position where said degree of similarity becomes maximum as a face position (Xmax, Ymax); and
decision means for determining whether or not the face image in said test image belongs to at least one of said categories based on said degree of similarity r(n) between said face image in said test image and the dictionary images representative of at least one of the categories, said decision means including means for comparing the degrees of said similarity r(n) with a predetermined threshold value and determining, based on the comparison, whether the face image in said test image belongs to said one of the categories;
wherein said decision means determines whether the face image in said test image belongs to either one of two categories on the basis of the degree of similarity r(n) between the face image in said test image and dictionary images representative of said two categories; and
wherein said dictionary image storage means stores therein the positions (xf, yf) of F feature points Pf in each of said dictionary images in correspondence therewith, F being an integer equal to or greater than one, said system further including:
feature point position detecting means for superimposing the dictionary image having the highest degree of similarity on said face position (Xmax, Ymax) in said test image, converting the position of said feature point corresponding to said dictionary image into the position (Xf, Yf) in said test image and outputting the converted position;
matching region cut-out means for cutting out a region of the face image to be recognized in said test image as the matching region;
positional relation storage means for storing therein the positional relation between feature points extracted from an image obtained when said reference object is taken from a basic direction;
feature point detecting means for extracting feature points from the matching regions of said dictionary image and said test image, respectively; and
geometrical normalization means for geometrically transforming the matching regions of an n-th dictionary image and said test image so that the positional relation between the feature points of said test image coincides with the positional relation stored in said positional relation storage means.
View Dependent Claims (14, 15)
partial dictionary means for generating a partial dictionary image from a predetermined region including said feature point Pf in each of said dictionary images IDn, where n=1, 2, . . . , N, and storing therein said partial dictionary image together with said number n and said position (xf, yf) in partial dictionary image storage means;
partial matching means for computing, as partial matching processing, the degree of similarity rf between a region around the position (Xf, Yf) of said feature point Pf in said test image and said partial dictionary image generated from the n-th dictionary image when said test image is given; and
high precision feature point position determining means for performing said partial matching processing with the matching position (X, Y) varied, defining the position where the degree of similarity rf becomes the highest as the optimum position (Xfmax, Yfmax) of said feature point Pf, and outputting said optimum position for each feature point.
15. The system according to claim 14, further including decision means for determining to which one of the categories the face image in said test image belongs on the basis of said degree of similarity r(n) between said face image in said test image and the dictionary image and said degree of similarity rf between said face image in said test image and the partial dictionary image.