Face detection and tracking in a video sequence

US 7,146,028 B2
Filed: 04/10/2003
Issued: 12/05/2006
Est. Priority Date: 04/12/2002
Status: Expired due to Fees

Abstract

A method (100) and apparatus (700) are disclosed for detecting and tracking human faces across a sequence of video frames. Spatiotemporal segmentation is used to segment (115) the sequence of video frames into 3D segments. 2D segments are then formed from the 3D segments, with each 2D segment being associated with one 3D segment. Features are extracted (140) from the 2D segments and grouped into groups of features. For each group of features, a probability that the group of features includes human facial features is calculated (145) based on the similarity of the geometry of the group of features with the geometry of a human face model. Each group of features is also matched with a group of features in a previous 2D segment and an accumulated probability that said group of features includes human facial features is calculated (150). Each 2D segment is classified (155) as a face segment or a non-face segment based on the accumulated probability. Human faces are then tracked by finding 2D segments in subsequent frames associated with 3D segments associated with face segments.
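
The data flow in the abstract (frames stacked into a 3D block, 2D segments obtained by slicing the labelled block at a view plane, tracking by following the same 3D segment through later planes) can be illustrated with a minimal sketch. It assumes a label volume is already available; the spatiotemporal segmentation itself is not implemented here, and the names `form_block`, `slice_segments` and `track` are hypothetical, not taken from the patent.

```python
from typing import Dict, Iterable, List, Set, Tuple

Frame = List[List[int]]   # frame[y][x], one value per pixel
Block = List[Frame]       # block[t][y][x], time is the third axis


def form_block(frames: List[Frame]) -> Block:
    """Stack equally sized frames along the time axis."""
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must share one size")
    return [[row[:] for row in frame] for frame in frames]


def slice_segments(labels: Block, t: int) -> Dict[int, Set[Tuple[int, int]]]:
    """Intersect a labelled 3D block with the view plane at time t.

    Returns {3D segment id: set of (x, y) pixels}, so every 2D segment
    stays associated with exactly one 3D segment.
    """
    segments: Dict[int, Set[Tuple[int, int]]] = {}
    for y, row in enumerate(labels[t]):
        for x, segment_id in enumerate(row):
            segments.setdefault(segment_id, set()).add((x, y))
    return segments


def track(labels: Block, face_ids: Iterable[int], t_start: int):
    """Follow face segments through subsequent view planes by 3D segment id."""
    face_ids = list(face_ids)
    tracks = {sid: [] for sid in face_ids}
    for t in range(t_start, len(labels)):
        plane = slice_segments(labels, t)
        for sid in face_ids:
            if sid in plane:          # the 3D segment still crosses this plane
                tracks[sid].append((t, plane[sid]))
    return tracks


# Hand-made label volume (an assumption standing in for real segmentation):
# 3D segment 1 is visible in frames 0 and 1 and has vanished by frame 2.
labels = form_block([
    [[0, 1, 1], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 1]],
    [[0, 0, 0], [0, 0, 0]],
])
tracks = track(labels, [1], 0)
```

Because a 2D segment is only ever a slice of its 3D segment, tracking in this sketch needs no frame-to-frame search: the segment id is the correspondence.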

21 Claims

1. A method of detecting and tracking human faces across a sequence of video frames, said method comprising the steps of:
- (a) forming a 3D pixel data block from said sequence of video frames;
  
  (b) segmenting said 3D data block into a set of 3D segments using 3D spatiotemporal segmentation;
  
  (c) forming 2D segments from an intersection of said 3D segments with a view plane, each 2D segment being associated with one 3D segment;
  
  (d) in at least one of said 2D segments, extracting features and grouping said features into one or more groups of features;
  
  (e) for each group of features, computing a probability that said group of features represents human facial features based on the similarity of the geometry of said group of features with the geometry of a human face model;
  
  (f) matching at least one group of features with a group of features in a previous 2D segment and computing an accumulated probability that said group of features represents human facial features using probabilities of matched groups of features;
  
  (g) classifying each 2D segment as a face segment or a non-face segment based on said accumulated probability of at least one group of features in each of said 2D segments; and
  
  (h) tracking said human faces by finding an intersection of 3D segments associated with said face segments with at least subsequent view planes.
  - 2. A method according to claim 1, wherein said features are regions in said 2D segment which are darker than the rest of said 2D segment.
  - 3. A method according to claim 1, wherein said features are regions in said 2D segment having edges.
  - 4. A method according to claim 1, wherein said group of features forms a triangle.
  - 5. A method according to claim 1, wherein said method comprises the further steps of:
    - determining, for each said 2D segment, a first measure of said 2D segment having a colour of human skin; and
      
      eliminating 2D segments having said first measure below a first predetermined threshold from further processing.
  - 6. A method according to claim 1, wherein said method comprises the further step of:
    - eliminating 2D segments having a form that is non-elliptical from further processing.
  - 7. A method according to claim 1, wherein said method comprises the further steps of:
    - determining movements of said 2D segments from positions of previous 2D segments associated with the same 3D segments; and
      
      eliminating 2D segments from further processing where said movement is below a second predetermined threshold.
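
The pre-filters of claims 5 and 7 can be sketched as follows. This is a rough illustration under stated assumptions: the claims do not define the skin-colour measure or the movement measure, so the per-pixel RGB rule, the use of centroid displacement, and the default thresholds below are all hypothetical choices.

```python
from math import hypot
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]    # (r, g, b), each 0..255
Coord = Tuple[int, int]         # (x, y) position inside the frame


def looks_like_skin(pixel: Pixel) -> bool:
    """Hypothetical crude RGB rule; NOT taken from the patent."""
    r, g, b = pixel
    return r > 95 and g > 40 and b > 20 and r > g and r > b and r - min(g, b) > 15


def skin_measure(pixels: List[Pixel]) -> float:
    """First measure (claim 5), assumed here to be the fraction of skin pixels."""
    if not pixels:
        return 0.0
    return sum(looks_like_skin(p) for p in pixels) / len(pixels)


def centroid(coords: List[Coord]) -> Tuple[float, float]:
    n = len(coords)
    return (sum(x for x, _ in coords) / n, sum(y for _, y in coords) / n)


def movement(current: List[Coord], previous: List[Coord]) -> float:
    """Movement (claim 7), assumed here to be the centroid displacement between
    two 2D segments that belong to the same 3D segment."""
    (cx, cy), (px, py) = centroid(current), centroid(previous)
    return hypot(cx - px, cy - py)


def keep_segment(pixels: List[Pixel], coords: List[Coord],
                 prev_coords: Optional[List[Coord]],
                 skin_threshold: float = 0.5,
                 move_threshold: float = 1.0) -> bool:
    """Return False when the segment is eliminated from further processing."""
    if skin_measure(pixels) < skin_threshold:
        return False   # claim 5: measure below the first threshold
    if prev_coords is not None and movement(coords, prev_coords) < move_threshold:
        return False   # claim 7: movement below the second threshold
    return True
```

Note the direction of the claim 7 test: segments that move less than the threshold are the ones dropped.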

8. An apparatus for detecting and tracking human faces across a sequence of video frames, said apparatus comprising:
- means for forming a 3D pixel data block from said sequence of video frames;
  
  means for segmenting said 3D data block into a set of 3D segments using 3D spatiotemporal segmentation;
  
  means for forming 2D segments from an intersection of said 3D segments with a view plane, each 2D segment being associated with one 3D segment;
  
  in at least one of said 2D segments, means for extracting features and grouping said features into one or more groups of features;
  
  for each group of features, means for computing a probability that said group of features represents human facial features based on the similarity of the geometry of said group of features with the geometry of a human face model;
  
  means for matching at least one group of features with a group of features in a previous 2D segment and computing an accumulated probability that said group of features represents human facial features using probabilities of matched groups of features;
  
  means for classifying each 2D segment as a face segment or a non-face segment based on said accumulated probability of at least one group of features in each of said 2D segments; and
  
  means for tracking said human faces by finding an intersection of 3D segments associated with said face segments with at least subsequent view planes.
  - 9. An apparatus according to claim 8, wherein said features are regions in said 2D segment which are darker than the rest of said 2D segment.
  - 10. An apparatus according to claim 8, wherein said features are regions in said 2D segment having edges.
  - 11. An apparatus according to claim 8, wherein said group of features forms a triangle.
  - 12. An apparatus according to claim 8, wherein said apparatus further comprises:
    - means for determining, for each said 2D segment, a first measure of said 2D segment having a colour of human skin; and
      
      means for eliminating 2D segments having said first measure below a first predetermined threshold from further processing.
  - 13. An apparatus according to claim 8, wherein said apparatus further comprises:
    - means for eliminating 2D segments having a form that is non-elliptical from further processing.
  - 14. An apparatus according to claim 8, wherein said apparatus further comprises:
    - means for determining movements of said 2D segments from positions of previous 2D segments associated with the same 3D segments; and
      
      means for eliminating 2D segments from further processing where said movement is below a second predetermined threshold.
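
The claims recite a probability based on how closely the geometry of a feature group matches a human face model, and claim 11 specifies a triangular group. As a minimal sketch, assume the three features are two eye candidates and a mouth candidate. The face model used here (an isosceles triangle with a fixed side-to-base ratio), the exponential scoring, and the names `MODEL_RATIO` and `triangle_face_score` are illustrative assumptions and do not come from the patent.

```python
from math import dist, exp
from typing import Tuple

Point = Tuple[float, float]

# Hypothetical face model: mean eye-to-mouth distance divided by the
# inter-eye distance. The value 1.0 is an assumption for illustration.
MODEL_RATIO = 1.0


def triangle_face_score(left_eye: Point, right_eye: Point, mouth: Point,
                        model_ratio: float = MODEL_RATIO,
                        sharpness: float = 4.0) -> float:
    """Score in (0, 1] for how well a feature triangle fits the assumed model.

    A score of 1.0 means a perfect fit; the score decays exponentially with
    left/right asymmetry and with deviation from the model ratio.
    """
    base = dist(left_eye, right_eye)
    if base == 0:
        return 0.0                       # degenerate group, not a triangle
    left_side = dist(left_eye, mouth)
    right_side = dist(right_eye, mouth)
    asymmetry = abs(left_side - right_side) / base
    ratio_error = abs((left_side + right_side) / (2 * base) - model_ratio)
    return exp(-sharpness * (asymmetry + ratio_error))


good = triangle_face_score((0.0, 0.0), (2.0, 0.0), (1.0, 3 ** 0.5))
skewed = triangle_face_score((0.0, 0.0), (2.0, 0.0), (5.0, 0.5))
```

Dividing every distance by the inter-eye distance makes the score independent of face size, which matters because the same face occupies different numbers of pixels in different frames.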

15. A computer-executable program stored on a computer readable storage medium, the program for detecting and tracking human faces across a sequence of video frames, said program comprising:
- code for forming a 3D pixel data block from said sequence of video frames;
  
  code for segmenting said 3D data block into a set of 3D segments using 3D spatiotemporal segmentation;
  
  code for forming 2D segments from an intersection of said 3D segments with a view plane, each 2D segment being associated with one 3D segment;
  
  in at least one of said 2D segments, code for extracting features and grouping said features into one or more groups of features;
  
  for each group of features, code for computing a probability that said group of features represents human facial features based on the similarity of the geometry of said group of features with the geometry of a human face model;
  
  code for matching at least one group of features with a group of features in a previous 2D segment and computing an accumulated probability that said group of features represents human facial features using probabilities of matched groups of features;
  
  code for classifying each 2D segment as a face segment or a non-face segment based on said accumulated probability of at least one group of features in each of said 2D segments; and
  
  code for tracking said human faces by finding an intersection of 3D segments associated with said face segments with at least subsequent view planes.
  - 16. A program according to claim 15, wherein said features are regions in said 2D segment which are darker than the rest of said 2D segment.
  - 17. A program according to claim 15, wherein said features are regions in said 2D segment having edges.
  - 18. A program according to claim 15, wherein said group of features forms a triangle.
  - 19. A program according to claim 15, wherein said program further comprises:
    - code for determining for each said 2D segment, a first measure of said 2D segment having a colour of human skin; and
      
      code for eliminating 2D segments having said first measure below a first predetermined threshold from further processing.
  - 20. A program according to claim 15, wherein said program further comprises:
    - code for eliminating 2D segments having a form that is non-elliptical from further processing.
  - 21. A program according to claim 15, wherein said program further comprises:
    - code for determining movements of said 2D segments from positions of previous 2D segments associated with the same 3D segment; and
      
      code for eliminating 2D segments from further processing where said movement is below a second predetermined threshold.
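
The matching and accumulation element of the independent claims can be sketched as below. The claims say only that the accumulated probability uses the probabilities of matched groups; the nearest-centre matching rule, the running mean, and the classification threshold here are assumptions chosen for illustration, and `Group`, `match`, `accumulate` and `is_face_segment` are hypothetical names.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Optional, Tuple


@dataclass
class Group:
    centre: Tuple[float, float]   # centre of the feature group in the frame
    prob: float                   # per-frame probability of being facial features
    acc_prob: float = 0.0         # accumulated probability over matched frames
    count: int = 1                # number of frames in the matched chain


def match(group: Group, previous: List[Group], max_dist: float = 5.0) -> Optional[Group]:
    """Assumed matching rule: nearest group centre in the previous 2D segment."""
    best, best_d = None, max_dist
    for candidate in previous:
        d = dist(group.centre, candidate.centre)
        if d <= best_d:
            best, best_d = candidate, d
    return best


def accumulate(group: Group, previous: List[Group], max_dist: float = 5.0) -> Group:
    """Assumed accumulation rule: running mean along the chain of matches."""
    prev = match(group, previous, max_dist)
    if prev is None:
        group.count = 1
        group.acc_prob = group.prob
    else:
        group.count = prev.count + 1
        group.acc_prob = prev.acc_prob + (group.prob - prev.acc_prob) / group.count
    return group


def is_face_segment(groups: List[Group], threshold: float = 0.6) -> bool:
    """Classify a 2D segment from the accumulated probability of its groups."""
    return any(g.acc_prob >= threshold for g in groups)


first = accumulate(Group((0.0, 0.0), 0.8), [])
second = accumulate(Group((1.0, 0.0), 0.4), [first])
```

Accumulating over matched groups means a single weak or spurious frame neither confirms nor rejects a face on its own; other rules, such as a weighted or recursive update, would fit the claim language equally well.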

Current Assignee: Canon Kabushiki Kaisha (Canon Inc.)
Original Assignee: Canon Kabushiki Kaisha (Canon Inc.)
Inventors: Lestideau, Fabrice
Primary Examiner(s): DESIRE, GREGORY M
Application Number: US10/410,350
Publication Number: US 20040017933A1
Time in Patent Office: 1,335 Days
Field of Search: 382/103, 382/115, 382/118, 382/154, 382/285, 382/159
US Class Current: 382/118
CPC Class Codes:
G06T 7/20 Analysis of motion motion e...
G06V 40/161 Detection; Localisation; No...
