Object image detecting method and system
Abstract
In a dictionary image generating section, images of objects each belonging to any one of L categories are taken by a camera from predetermined M directions, a matching region of the object to be recognized is extracted from each of the object images, and dictionary images (N=L×M, n=1, 2, . . . , N) which are the representatives of combinations of direction and category are generated and stored together with the directions of the objects. When a test image is given, the degree of similarity r(n, X, Y) between an n-th dictionary image and the image region at the matching position (X, Y) in the test image is computed by a matching section. This matching process is repeated with the matching position (X, Y) and the number n of the dictionary image being varied, and the matching position (Xmax, Ymax) where the degree of similarity r(n, X, Y) becomes the highest is detected.
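The matching loop summarized in the abstract can be sketched as a brute-force search over positions and dictionary images. The following is a minimal illustration, assuming grayscale NumPy arrays and normalized cross-correlation as the similarity measure (the patent does not fix a particular measure); the function and variable names are illustrative, not taken from the patent.

```python
import numpy as np

def match_dictionary(test_image, dictionary, region_h, region_w):
    """Slide a matching window over the test image and score it against
    every dictionary image IDn, returning the best (r, n, Xmax, Ymax).
    Normalized cross-correlation is an assumed choice of similarity."""
    H, W = test_image.shape
    best = (-np.inf, None, None, None)  # (score, n, X, Y)
    for Y in range(H - region_h + 1):            # vary matching position
        for X in range(W - region_w + 1):
            region = test_image[Y:Y + region_h, X:X + region_w]
            for n, dic in enumerate(dictionary):  # vary dictionary number n
                r = np.corrcoef(region.ravel(), dic.ravel())[0, 1]
                if r > best[0]:
                    best = (r, n, X, Y)
    return best
```

An exhaustive scan like this is quadratic in image size per dictionary image; the block-image representation described in the claims below is one way the patent reduces that cost.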
16 Claims
1. A method for detecting a face image in an input test image as a face region by matching each of successive regions of said input test image with dictionary images IDn which are produced from images obtained by taking images of reference faces belonging to L categories from predetermined M directions, where L is an integer equal to one or greater, M is an integer equal to two or greater and is so selected that an angle between two adjacent directions from which images of said reference faces are taken is equal to or greater than 10° and no greater than 40°, and n=1, 2, . . . , N, and N=L×M, said method comprising the steps of:
(a) updating a matching position (X,Y) of said input test image and outputting said updated matching position (X,Y);
(b) cutting out, as a matching region image, an image of a region of a predetermined size on the basis of said matching position (X,Y) in said input test image;
(c) computing a degree of similarity r(n) between said matching region image and an n-th dictionary image;
(d) repeating the computation of said degree of similarity r(n) with said matching position (X,Y) and said n varied, thereby obtaining the degree of similarity r(n) between said matching region image and each of respective dictionary images of said L categories and M directions;
(e) detecting said face region in said test image by obtaining the matching position where said degree of similarity obtained by said step (d) becomes the maximum as a face position (Xmax, Ymax); and
(f) comparing said degree of similarity r(n) with a predetermined threshold value to determine, based on the comparison, whether or not the face image in said test image belongs to at least one of said categories;
wherein each said dictionary image is generated as a series of dictionary block image information consisting of q pieces of block image information, each composed of a representative pixel value of a corresponding one of q blocks obtained by dividing the matching region in the image of said reference face into w pieces in a longitudinal direction and v pieces in a lateral direction, where q=v×w, and wherein said step of computing the degree of similarity is a step of generating a series of test block image information consisting of q pieces of block information, each composed of a representative pixel value of a corresponding one of q blocks obtained by dividing said matching region at the matching position (X,Y) in said test image into w pieces in the longitudinal direction and v pieces in the lateral direction and computing the degree of similarity between the series of test block image information and the n-th series of dictionary block image information as the degree of said similarity r(n).
View Dependent Claims (2, 3, 6, 7, 8)
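The block-image representation in the wherein clause above can be illustrated as follows. Taking the block mean as the "representative pixel value" and normalized cross-correlation as the similarity r(n) are assumptions of this sketch; the claim requires only some representative value per block and does not name a similarity measure.

```python
import numpy as np

def block_features(region, v, w):
    """Divide `region` into w pieces longitudinally (vertically) and
    v pieces laterally (horizontally) and return the q = v*w block
    means as the series of block image information."""
    H, W = region.shape
    assert H % w == 0 and W % v == 0, "region must divide evenly into blocks"
    bh, bw = H // w, W // v
    # reshape so axes 1 and 3 run within each block, then average them
    return region.reshape(w, bh, v, bw).mean(axis=(1, 3)).ravel()

def block_similarity(test_region, dict_region, v, w):
    """Similarity r(n) between the test and dictionary block series
    (normalized cross-correlation, an illustrative choice)."""
    a = block_features(test_region, v, w)
    b = block_features(dict_region, v, w)
    return float(np.corrcoef(a, b)[0, 1])
```

Comparing q block values instead of every pixel is what makes the repeated matching over all positions and all N dictionary images tractable.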
a step of holding, as a background image, an input image taken a fixed time before said input test image is taken in the same direction;
a step of producing a difference image between a present input test image and said background image;
a step of obtaining a size of a face region based on said difference image;
a step of determining a range of block size based on said size of the face region; and
a step of repeating the processing for computing the degree of similarity for each block size varied a fixed width step by step from an initial value within said range of the block size.
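The background-difference steps above (hold a background frame, difference it against the present frame, and derive the face-region size) might be sketched as below. The change threshold and the use of a simple bounding box over changed pixels are assumptions of this illustration.

```python
import numpy as np

def face_region_size(test_image, background, thresh=0.1):
    """Estimate the size of the face region from the difference image
    between the present input image and a background image held a fixed
    time earlier. Returns the bounding-box (height, width) of pixels
    whose absolute difference exceeds `thresh` (an assumed threshold)."""
    diff = np.abs(test_image.astype(float) - background.astype(float))
    ys, xs = np.nonzero(diff > thresh)
    if ys.size == 0:
        return (0, 0)  # no change detected
    return (int(ys.max() - ys.min() + 1), int(xs.max() - xs.min() + 1))
```

The returned size would then bound the range of block sizes tried in the repeated similarity computation.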
6. The method according to claim 1, 4 or 5, further including the step of selecting one of the categories which provides a maximum of said degree of similarity r(n), thereby determining the one of the categories to which the face image in said test image belongs.
7. The method according to claim 1, 4 or 5, further including the steps of:
generating a partial dictionary image from a predetermined region including said feature point Pf in each of said dictionary images IDn, where n=1, 2, . . . , N, and storing said partial dictionary image together with said number n and said position (xf,yf) in said dictionary image in partial dictionary image storage means;
computing, as a partial matching processing, the degree of similarity rf between a region around the position (Xf, Yf) of said feature point Pf in said test image and said partial dictionary image generated from said n-th dictionary image when said test image is given; and
performing said partial matching processing with the matching position (X, Y) varied, defining the position where the degree of similarity rf becomes the highest as the optimum position (Xfmax, Yfmax) of said feature point Pf, and outputting said optimum position for each feature point.
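The partial-matching steps of claim 7 refine each feature point by scoring a small window around its projected position (Xf, Yf) against a partial dictionary image. In the sketch below, the search radius and the use of a sum-of-squared-differences score (negated so that higher means more similar) are illustrative assumptions; the claim does not fix either.

```python
import numpy as np

def refine_feature_point(test_image, partial_dict, Xf, Yf, search=2):
    """Partial matching: compare regions around (Xf, Yf) in the test
    image with the partial dictionary image and return the position
    (Xfmax, Yfmax) where the similarity rf is highest."""
    h, w = partial_dict.shape
    best = (-np.inf, Xf, Yf)  # (rf, x, y)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = Yf + dy, Xf + dx
            patch = test_image[y:y + h, x:x + w]
            if patch.shape != (h, w):
                continue  # window falls outside the test image
            rf = -float(np.sum((patch - partial_dict) ** 2))
            if rf > best[0]:
                best = (rf, x, y)
    return best[1], best[2]  # (Xfmax, Yfmax)
```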
8. The method according to claim 7, further including the step of determining to which one of the categories the face image in said test image belongs based on the degree of similarity r(n) between said face image in said test image and the dictionary image and the degree of similarity rf between said face image in said test image and the partial dictionary image.
4. A method for detecting a face image in an input test image as a face region by matching each of successive regions of said input test image with dictionary images IDn which are produced from images obtained by taking images of reference faces belonging to L categories from predetermined M directions, where L is an integer equal to one or greater, M is an integer equal to two or greater and is so selected that an angle between two adjacent directions from which images of said reference faces are taken is equal to or greater than 10° and no greater than 40°, and n=1, 2, . . . , N, and N=L×M, said method comprising the steps of:
(a) updating a matching position (X,Y) of said input test image and outputting said updated matching position (X,Y);
(b) cutting out, as a matching region image, an image of a region of a predetermined size on the basis of said matching position (X,Y) in said input test image;
(c) computing a degree of similarity r(n) between said matching region image and an n-th dictionary image;
(d) repeating the computation of said degree of similarity r(n) with said matching position (X,Y) and said n varied, thereby obtaining the degree of similarity r(n) between said matching region image and each of respective dictionary images of said L categories and M directions;
(e) detecting said face region in said test image by obtaining the matching position where said degree of similarity obtained by said step (d) becomes the maximum as a face position (Xmax, Ymax); and
(f) comparing said degree of similarity r(n) with a predetermined threshold value to determine, based on the comparison, whether or not the face image in said test image belongs to at least one of said categories;
wherein said step of obtaining the degree of similarity includes transforming the intensity of each pixel so that mean and variance of the intensity of pixels within a matching region of each matching region image and dictionary image become predetermined values, respectively, and obtaining the degree of similarity of the transformed matching region image and the transformed dictionary image.
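The brightness normalization in claim 4's wherein clause fixes the mean and variance of each matching region and dictionary image before the similarity is computed. A minimal sketch, with arbitrary illustrative target values:

```python
import numpy as np

def normalize_brightness(region, target_mean=0.5, target_std=0.2):
    """Transform each pixel intensity so that the mean and standard
    deviation of the region become predetermined values (the targets
    here are assumed, not taken from the patent)."""
    mu, sigma = region.mean(), region.std()
    if sigma == 0:
        # a flat region carries no contrast; map it to the target mean
        return np.full_like(region, target_mean, dtype=float)
    return (region - mu) / sigma * target_std + target_mean
```

Normalizing both images this way makes the similarity score insensitive to overall lighting differences between the test scene and the conditions under which the reference faces were photographed.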
5. A method for detecting a face image in an input test image as a face region by matching each of successive regions of said input test image with dictionary images IDn which are produced from images obtained by taking images of reference faces belonging to L categories from predetermined M directions, where L is an integer equal to one or greater, M is an integer equal to two or greater and is so selected that an angle between two adjacent directions from which images of said reference faces are taken is equal to or greater than 10° and no greater than 40°, and n=1, 2, . . . , N, and N=L×M, said method comprising the steps of:
(a) updating a matching position (X,Y) of said input test image and outputting said updated matching position (X,Y);
(b) cutting out, as a matching region image, an image of a region of a predetermined size on the basis of said matching position (X,Y) in said input test image;
(c) computing a degree of similarity r(n) between said matching region image and an n-th dictionary image;
(d) repeating the computation of said degree of similarity r(n) with said matching position (X,Y) and said n varied, thereby obtaining the degree of similarity r(n) between said matching region image and each of respective dictionary images of said L categories and M directions;
(e) detecting said face region in said test image by obtaining the matching position where said degree of similarity obtained by said step (d) becomes the maximum as a face position (Xmax, Ymax);
(f) comparing said degree of similarity r(n) with a predetermined threshold value to determine, based on the comparison, whether or not the face image in said test image belongs to at least one of said categories;
(g) cutting out, as one of said dictionary images, an image of a predetermined matching region from each of the images of said reference faces belonging to said L categories taken from said M directions;
(h) storing the positions (xf, yf) of F feature points Pf, F being an integer of value one or greater, in each of said dictionary images, in correspondence with said dictionary images, into a dictionary image storage means;
(i) superimposing the dictionary image having a highest degree of similarity on said face position (Xmax, Ymax) in said test image, converting the position of said feature point corresponding to said dictionary image into the position (Xf, Yf) in said test image and outputting the converted position;
(j) cutting out a region of the face image to be recognized in said test image as the matching region;
(k) storing in positional relation storage means the positional relation between feature points extracted from an image obtained when said reference face is taken from a basic direction;
(l) extracting feature points from matching regions of said dictionary image and said test image, respectively;
(m) geometrically transforming the matching regions of an n-th dictionary image and said test image so that the positional relation between the feature points of said test image coincides with the positional relation stored in said positional relation storage means; and
(n) computing the degree of similarity r(n) between said geometrically transformed matching regions of said test image and said n-th dictionary image.
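The geometric normalization in steps (k) through (m) amounts to transforming the matching region so that its detected feature points coincide with the positional relation stored for the basic direction. One common way to obtain such a transform is a least-squares affine fit; the claim does not name a particular transform, so this is an assumed choice.

```python
import numpy as np

def affine_from_points(src, dst):
    """Solve for the 2x3 affine transform mapping feature points `src`
    (as detected in the test image) onto `dst` (the stored positional
    relation). At least three non-collinear point pairs are assumed."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # homogeneous coordinates: each row is [x, y, 1]
    G = np.hstack([src, np.ones((len(src), 1))])
    A, *_ = np.linalg.lstsq(G, dst, rcond=None)
    return A.T  # maps [x, y, 1] -> [x', y']
```

Warping the matching region with this transform before computing r(n) removes in-plane rotation, scale, and translation differences between the test face and the dictionary face.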
9. A system for detecting a face image in an input test image as a face region comprising:
dictionary image storage means for storing dictionary images IDn which are produced from images obtained by taking images of reference faces belonging to L categories from predetermined M directions, where L is an integer equal to one or greater, M is an integer equal to two or greater and is so selected that an angle between two adjacent directions from which images of said reference faces are taken is equal to or greater than 10° and no greater than 40°, and n=1, 2, . . . , N, and N=L×M;
position shift means for updating a matching position (X,Y) of said input test image and outputting said updated matching position (X,Y);
matching region cut-out means for cutting out, as a matching region image, an image of a region of a predetermined size on the basis of said matching position (X,Y) in said input test image;
similarity computing means for computing the degree of similarity r(n) between said matching region image and an n-th one of said dictionary images;
face region detecting means for repeating the computation of said degree of similarity r(n) with said matching position (X,Y) and said n varied, respectively, thereby obtaining the degrees of similarity r(n) between said matching region image and each of respective dictionary images of L categories and M directions and for detecting the face region in said test image by obtaining the matching position where said degree of similarity becomes maximum as a face position (Xmax, Ymax); and
decision means for determining whether or not the face image in said test image belongs to at least one of said categories based on said degree of similarity r(n) between said face image in said test image and the dictionary images representative of at least one of the categories, said decision means including means for comparing the degrees of said similarity r(n) with a predetermined threshold value and determining, based on the comparison, whether the face image in said test image belongs to said one of the categories;
wherein each said dictionary image is stored in said dictionary image storage means as a series of dictionary block image information consisting of q pieces of block image information, each composed of a representative pixel value of a corresponding one of q blocks obtained by dividing the matching region in the image of said reference face into w pieces in a longitudinal direction and v pieces in a lateral direction, where q=v×w, and wherein said similarity computing means includes means for generating a series of test block image information consisting of q pieces of block information, each composed of a representative pixel value of a corresponding one of q blocks obtained by dividing said matching region at the matching position (X,Y) in said test image into w pieces in said longitudinal direction and v pieces in said lateral direction and computing the degree of similarity between the series of test block image information and the n-th series of dictionary block image information as the degree of similarity r(n).
View Dependent Claims (10, 12)
11. A system for detecting a face image in an input test image as a face region comprising:
dictionary image storage means for storing dictionary images IDn which are produced from images obtained by taking images of reference faces belonging to L categories from predetermined M directions, where L is an integer equal to one or greater, M is an integer equal to two or greater and is so selected that an angle between two adjacent directions from which images of said reference faces are taken is equal to or greater than 10° and no greater than 40°, and n=1, 2, . . . , N, and N=L×M;
position shift means for updating a matching position (X,Y) of said input test image and outputting said updated matching position (X,Y);
matching region cut-out means for cutting out, as a matching region image, an image of a region of a predetermined size on the basis of said matching position (X,Y) in said input test image;
similarity computing means for computing the degree of similarity r(n) between said matching region image and an n-th one of said dictionary images;
face region detecting means for repeating the computation of said degree of similarity r(n) with said matching position (X,Y) and said n varied, respectively, thereby obtaining the degrees of similarity r(n) between said matching region image and each of respective dictionary images of L categories and M directions and for detecting the face region in said test image by obtaining the matching position where said degree of similarity becomes maximum as a face position (Xmax, Ymax); and
decision means for determining whether or not the face image in said test image belongs to at least one of said categories based on said degree of similarity r(n) between said face image in said test image and the dictionary images representative of at least one of the categories, said decision means including means for comparing the degrees of said similarity r(n) with a predetermined threshold value and determining, based on the comparison, whether the face image in said test image belongs to said one of the categories;
said system further including brightness normalization means for transforming the intensity of each pixel so that mean and variance of the intensity of pixels within the matching region of each of said matching region image and said dictionary image become predetermined values, respectively, and obtaining the degree of similarity of the transformed matching region image and the transformed dictionary image.
View Dependent Claims (16)
13. A system for detecting a face image in an input test image as a face region comprising:
dictionary image storage means for storing dictionary images IDn which are produced from images obtained by taking images of reference faces belonging to L categories from predetermined M directions, where L is an integer equal to one or greater, M is an integer equal to two or greater and is so selected that an angle between two adjacent directions from which images of said reference faces are taken is equal to or greater than 10° and no greater than 40°, and n=1, 2, . . . , N, and N=L×M;
position shift means for updating a matching position (X,Y) of said input test image and outputting said updated matching position (X,Y);
matching region cut-out means for cutting out, as a matching region image, an image of a region of a predetermined size on the basis of said matching position (X,Y) in said input test image;
similarity computing means for computing the degree of similarity r(n) between said matching region image and an n-th one of said dictionary images;
face region detecting means for repeating the computation of said degree of similarity r(n) with said matching position (X,Y) and said n varied, respectively, thereby obtaining the degrees of similarity r(n) between said matching region image and each of respective dictionary images of L categories and M directions and for detecting the face region in said test image by obtaining the matching position where said degree of similarity becomes maximum as a face position (Xmax, Ymax); and
decision means for determining whether or not the face image in said test image belongs to at least one of said categories based on said degree of similarity r(n) between said face image in said test image and the dictionary images representative of at least one of the categories, said decision means including means for comparing the degrees of said similarity r(n) with a predetermined threshold value and determining, based on the comparison, whether the face image in said test image belongs to said one of the categories;
wherein said decision means determines whether the face image in said test image belongs to either one of two categories on the basis of the degree of similarity r(n) between the face image in said test image and dictionary images representative of said two categories; and
wherein said dictionary image storage means stores therein the positions (xf, yf) of F feature points Pf in each of said dictionary images in correspondence therewith, F being an integer equal to or greater than one, said system further including:
feature point position detecting means for superimposing the dictionary image having the highest degree of similarity on said face position (Xmax, Ymax) in said test image, converting the position of said feature point corresponding to said dictionary image into the position (Xf, Yf) in said test image and outputting the converted position;
matching region cut-out means for cutting out a region of the face image to be recognized in said test image as the matching region;
positional relation storage means for storing therein the positional relation between feature points extracted from an image obtained when said reference object is taken from a basic direction;
feature point detecting means for extracting feature points from the matching regions of said dictionary image and said test image, respectively; and
geometrical normalization means for geometrically transforming the matching regions of an n-th dictionary image and said test image so that the positional relation between the feature points of said test image coincides with the positional relation stored in said positional relation storage means.
View Dependent Claims (14, 15)
partial dictionary means for generating a partial dictionary image from a predetermined region including said feature point Pf in each of said dictionary images IDn, where n=1, 2, . . . , N, and storing therein said partial dictionary image together with said number n and said position (xf, yf) in partial dictionary image storage means;
partial matching means for computing, as partial matching processing, the degree of similarity rf between a region around the position (Xf, Yf) of said feature point Pf in said test image and said partial dictionary image generated from the n-th dictionary image when said test image is given; and
high precision feature point position determining means for performing said partial matching processing with the matching position (X, Y) varied, defining the position where the degree of similarity rf becomes the highest as the optimum position (Xfmax, Yfmax) of said feature point Pf, and outputting said optimum position for each feature point.
15. The system according to claim 14, further including decision means for determining to which one of the categories the face image in said test image belongs on the basis of said degree of similarity r(n) between said face image in said test image and the dictionary image and said degree of similarity rf between said face image in said test image and the partial dictionary image.