Method for driving virtual facial expressions by automatically detecting facial expressions of a face image
Abstract
A method for driving virtual facial expressions by automatically detecting facial expressions of a face image is applied to a digital image capturing device. The method includes the steps of detecting a face image captured by the image capturing device and images of a plurality of facial features with different facial expressions to obtain a key point position of each facial feature on the face image; mapping the key point positions to a virtual face as the key point positions of corresponding facial features on the virtual face; dynamically tracking the key point of each facial feature on the face image; estimating the key point positions of each facial feature of the current face image according to the key point positions of each facial feature on a previous face image; and correcting the key point positions of the corresponding facial features on the virtual face.
17 Claims
1. A method for driving virtual facial expressions by automatically detecting facial expressions of a face image, which is applied to a digital image capturing device, comprising:
- learning and correcting a plurality of front face image samples to obtain an average position of each key point of a face, eyes, a nose and a mouth of each of said front face image samples on a standard image for imitating a virtual front face, wherein said standard image is a greyscale figure having a fixed size and using a front face as a center without an inclination;
- using a Gabor wavelet algorithm to sample a series of Gabor Jets from said key points of said front face image samples to form a Gabor Jet Bunch;
- automatically detecting positions of a face, eyes, a nose and a mouth in a target image captured by said digital image capturing device, and converting said positions onto said standard image;
- performing a fitting or regression calculation for said positions of eyes, nose and mouth of said target image and said average positions corresponding thereto to obtain initial positions of key points of eyes, nose and mouth of said target image on said standard image;
- calculating exact positions of said key points of said target image on said standard image by using a point within a neighborhood of each of said initial positions as a selecting point, comparing a Gabor Jet of each of said initial positions with each of said Gabor Jets in said Gabor Jet Bunch, selecting said Gabor Jet in said Gabor Jet Bunch having a highest similarity with said selecting point as said exact position of said key point on said standard image corresponding to said key point of said target image, inversely aligning said exact positions of said key points on said standard image onto said target image, and labeling said exact positions as exact positions of said key points on said target image; and
- automatically tracking positions of key points of other target image captured by said digital image capturing device later, and correcting said exact position of said key point on said standard image through obtaining a motion parameter (dx, dy) between said key points corresponding to said target image and said other target image by using an optical flow technique, calculating an error ε(dx, dy) of said motion parameter (dx, dy) according to the following formula, wherein I(x,y) represents gray scales of said target image, J(x,y) represents gray scales of said other target image, and x, y represent coordinates of each of said key points of said target image or said other target image:

ε(d) = ε(dx, dy) = Σx Σy (I(x,y) − J(x+dx, y+dy))²,

to find dx, dy that minimize said error ε(dx, dy) and obtain an estimated position of said key point of said other target image based on said position of said key point of said target image corresponding thereto, and performing a fitting or regression calculation for said estimated positions and said average positions to calculate estimated positions of key points of said other target image on said standard image.

(Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9)
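The tracking step of claim 1 finds the motion parameter (dx, dy) that minimizes the sum-of-squared-differences error ε(dx, dy) between the gray scales of the two images. The sketch below is an illustrative reading of that formula only, not the patented implementation: the function names, the overlap handling at image borders, and the brute-force search window are assumptions (a practical optical flow technique would refine the estimate iteratively rather than search exhaustively).

```python
import numpy as np

def ssd_error(I, J, dx, dy):
    """eps(dx, dy) = sum over x, y of (I(x,y) - J(x+dx, y+dy))^2,
    computed over the region where the shifted coordinates stay inside J."""
    h, w = I.shape
    xs = np.arange(max(0, -dx), min(w, w - dx))
    ys = np.arange(max(0, -dy), min(h, h - dy))
    diff = I[np.ix_(ys, xs)] - J[np.ix_(ys + dy, xs + dx)]
    return float(np.sum(diff.astype(np.float64) ** 2))

def estimate_motion(I, J, search=3):
    """Brute-force search for the (dx, dy) minimizing eps(dx, dy)."""
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            e = ssd_error(I, J, dx, dy)
            if best is None or e < best[0]:
                best = (e, dx, dy)
    return best[1], best[2]
```

If J is I displaced by a small integer offset, the error vanishes at the true offset, so the minimizer recovers the motion parameter exactly.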
10. A method for driving virtual facial expressions by automatically detecting facial expressions of a face image, which is applied to a digital image capturing device, comprising:
- obtaining a key point distribution model through learning from a plurality of front face image samples, wherein said key point distribution model is used to represent a distribution of statistics and positions of each key point of a face, eyes, a nose and a mouth of each of said front face image samples on a standard image for imitating a virtual front face;
- automatically detecting positions of a face, eyes, a nose and a mouth in a target image captured by said digital image capturing device, and converting said positions onto said standard image;
- determining whether or not said positions of said key points on said standard image deviate from an average position calculated from said statistics and positions of said key point distribution model; if yes, then said position of said key point on said standard image is pulled back to an accurate position close to said average position;
- performing a calculation for said positions of said eyes, nose and mouth of said target image and said distribution of statistics and positions corresponding thereto to obtain initial positions of key points of eyes, nose and mouth of said target image on said standard image;
- using said initial positions as exact positions of said key points of said target image on said standard image, and inversely aligning said exact positions onto said target image, and labeling said exact positions as exact positions of said key points on said target image; and
- automatically tracking positions of key points of other target image captured by said digital image capturing device later, and correcting said exact position of said key point on said standard image.

(Dependent claim: 11)
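Claim 10 does not specify how a deviated key point is "pulled back" toward the average position of the key point distribution model. One minimal reading is to learn a per-keypoint mean and standard deviation from the front face image samples and clamp any detected point to within k standard deviations of its mean. The sketch below assumes that reading; the function names, array shapes, and threshold k are illustrative choices, not taken from the patent.

```python
import numpy as np

def fit_distribution(samples):
    """Per-keypoint mean and standard deviation of positions, from an
    array of shape (num_samples, num_keypoints, 2)."""
    samples = np.asarray(samples, dtype=np.float64)
    return samples.mean(axis=0), samples.std(axis=0)

def pull_back(points, mean, std, k=2.0):
    """Clamp each detected key point to within k standard deviations of
    its average position; points within the band are left unchanged."""
    points = np.asarray(points, dtype=np.float64)
    return np.clip(points, mean - k * std, mean + k * std)
```

A gross outlier (e.g. a misdetected mouth corner) is thereby moved to the edge of the plausible band around the learned average, while well-detected points pass through untouched.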
12. A method for driving virtual facial expressions by automatically detecting facial expressions of a face image, which is applied to a digital image capturing device, comprising:
- obtaining a key point distribution model through learning from a plurality of front face image samples, wherein said key point distribution model is used to represent a distribution of statistics and positions of each key point of a face, eyes, a nose and a mouth of each of said front face image samples on a standard image for imitating a virtual front face, wherein said standard image is a greyscale figure having a fixed size and using a front face as a center without an inclination;
- automatically detecting positions of a face, eyes, a nose and a mouth in a target image captured by said digital image capturing device, and converting said positions onto said standard image;
- determining whether or not said positions of said key points on said standard image deviate from an average position calculated from said statistics and positions of said key point distribution model; if yes, then said position of said key point on said standard image is pulled back to an accurate position close to said average position;
- performing a calculation for said positions of said eyes, nose and mouth of said target image and said distribution of statistics and positions corresponding thereto to obtain initial positions of key points of eyes, nose and mouth of said target image on said standard image;
- using said initial positions as exact positions of said key points of said target image on said standard image, and inversely aligning said exact positions of said key points on said standard image onto said target image, and labeling said exact positions as exact positions of said key points on said target image; and
- automatically tracking positions of key points of other target image captured by said digital image capturing device later, and correcting said exact position of said key point on said standard image.

(Dependent claim: 13)
14. A method for driving virtual facial expressions by automatically detecting facial expressions of a face image, which is applied to a digital image capturing device, comprising:
- learning and correcting a plurality of front face image samples to obtain an average position of each key point of a face, eyes, a nose and a mouth of each of said front face image samples on a standard image for imitating a virtual front face, wherein said standard image is a greyscale figure having a fixed size and using a front face as a center without an inclination;
- obtaining a key point distribution model through learning from said front face image samples, wherein said key point distribution model is used to represent a distribution of statistics and positions of said key points of said front face image samples on said standard image;
- automatically detecting positions of a face, eyes, a nose and a mouth in a target image captured by said digital image capturing device, and converting said positions onto said standard image;
- determining whether or not said positions of said key points on said standard image deviate from an average position calculated from said statistics and positions of said key point distribution model; if yes, then said position of said key point on said standard image is pulled back to an accurate position close to said average position calculated from said statistics and positions of said key point distribution model;
- performing a fitting or regression calculation for said positions of eyes, nose and mouth of said target image and said average positions calculated from said statistics and positions of said key point distribution model and corresponding thereto to obtain initial positions of key points of eyes, nose and mouth of said target image on said standard image;
- using said initial positions as exact positions of said key points of said target image on said standard image, and inversely aligning said exact positions of said key points on said standard image onto said target image, and labeling said exact positions as exact positions of said key points on said target image; and
- automatically tracking positions of key points of other target image captured by said digital image capturing device later, and correcting said exact position of said key point on said standard image.

(Dependent claim: 15)
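The "fitting or regression calculation" between the detected coarse positions and the model's average positions is left unspecified in the claims. One minimal instance is a least-squares fit of a scale and translation that maps the detected positions onto the standard image, which can then be applied to the average key points to seed their initial positions. The names and the scale-plus-translation model below are assumptions for illustration (a fuller fit would also estimate rotation).

```python
import numpy as np

def fit_scale_translation(src, dst):
    """Least-squares scale s and translation t such that dst ~ s * src + t,
    for matched point sets of shape (n, 2)."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    s = (src_c * dst_c).sum() / (src_c ** 2).sum()
    t = dst.mean(axis=0) - s * src.mean(axis=0)
    return s, t

def apply_transform(points, s, t):
    """Map points through the fitted scale and translation."""
    return s * np.asarray(points, dtype=np.float64) + t
```

Fitting the transform on the few coarse detections (eyes, nose, mouth) and applying it to every average key point yields the initial key point positions on the standard image.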
16. A method for driving virtual facial expressions by automatically detecting facial expressions of a face image, which is applied to a digital image capturing device, comprising:
- learning and correcting a plurality of front face image samples to obtain an average position of each key point of a face, eyes, a nose and a mouth of each of said front face image samples on a standard image for imitating a virtual front face, wherein said standard image is a greyscale figure having a fixed size and using a front face as a center without an inclination;
- obtaining a key point distribution model through learning from said front face image samples, wherein said key point distribution model is used to represent a distribution of statistics and positions of said key points of said front face image samples on said standard image;
- using a Gabor wavelet algorithm to sample a series of Gabor Jets from said key points of said front face image samples to form a Gabor Jet Bunch;
- automatically detecting positions of a face, eyes, a nose and a mouth in a target image captured by said digital image capturing device, and converting said positions onto said standard image;
- determining whether or not said positions of said key points on said standard image deviate from an average position calculated from said statistics and positions of said key point distribution model; if yes, then said position of said key point on said standard image is pulled back to an accurate position close to said average position calculated from said statistics and positions of said key point distribution model;
- performing a fitting or regression calculation for said positions of eyes, nose and mouth of said target image and said average positions calculated from said statistics and positions of said key point distribution model and corresponding thereto to obtain initial positions of key points of eyes, nose and mouth of said target image on said standard image;
- calculating exact positions of said key points of said target image on said standard image by using a point within a neighborhood of each of said initial positions as a selecting point, comparing a Gabor Jet of each of said initial positions with each of said Gabor Jets in said Gabor Jet Bunch, selecting said Gabor Jet in said Gabor Jet Bunch having a highest similarity with said selecting point as said exact position of said key point on said standard image corresponding to said key point of said target image, inversely aligning said exact positions of said key points on said standard image onto said target image, and labeling said exact positions as exact positions of said key points on said target image; and
- automatically tracking positions of key points of other target image captured by said digital image capturing device later, and correcting said exact position of said key point on said standard image.

(Dependent claim: 17)
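The Gabor Jet comparison recited above matches the jet at each candidate point against the stored Gabor Jet Bunch and keeps the most similar entry. A Gabor Jet is commonly formed from the magnitudes of a bank of Gabor filter responses at a point, and jets are compared by normalized dot product. The sketch below assumes that common construction; the bank parameters, function names, and use of real-valued (cosine-carrier) kernels are illustrative, not the patent's specification.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Real Gabor kernel: cosine carrier at orientation theta under a
    Gaussian envelope, on a size x size grid."""
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * xr / wavelength)

def gabor_jet(img, x, y, kernels):
    """Jet at (x, y): magnitude of each filter's response to the local patch."""
    r = kernels[0].shape[0] // 2
    patch = img[y - r:y + r + 1, x - r:x + r + 1]
    return np.array([abs(float((patch * k).sum())) for k in kernels])

def jet_similarity(a, b):
    """Normalized dot product of two jets (1.0 = identical up to scale)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```

Scanning the neighborhood of an initial position and keeping the selecting point whose jet is most similar to a bunch entry gives the claim's "exact position" refinement.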
Specification