Efficient and accurate 3D object tracking
Abstract
A method of tracking an object in an input image stream, the method comprising iteratively applying the steps of: (a) rendering a three-dimensional object model according to a previously predicted state vector from a previous tracking loop or the state vector from an initialization step; (b) extracting a series of point features from the rendered object; (c) localizing corresponding point features in the input image stream; (d) deriving a new state vector from the point feature locations in the input image stream.
15 Claims
1. A method of tracking a location of a face in an input image stream, said method comprising iteratively applying the steps of:
(a) performing a computerized three-dimensional (3D) to two-dimensional (2D) rendering of a predefined textured 3D face model indicative of at least a portion of a face to produce a 2D rendered image including said face, the 3D-to-2D rendering being performed according to a previously predicted state vector derived from a previous tracking loop or a state vector from an initialisation step, wherein said state vector comprises the pose of said face;
(b) processing said 2D rendered image to extract a series of point features from the portion of the face in said 2D rendered image;
(c) localising corresponding point features in a current 2D image of said input image stream by comparing said 2D rendered image with said current 2D image;
(d) deriving a new state vector from said localised point features in the input image stream, the new state vector being indicative of a current location of said face within said input image stream.
(Dependent claims: 2, 3, 4, 5, 6)
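Steps (a) through (d) describe one pass of a render-match-update loop. The sketch below (assuming numpy; all function names are invented for illustration) reduces the textured 3D-to-2D rendering to an orthographic projection of a point model and the image-comparison localisation to nearest-neighbour matching, so it shows only the loop structure, not the claimed implementation:

```python
import numpy as np

def project(points3d, pose):
    """Stand-in for step (a): 'render' the 3D model into 2D under the
    current state vector, here an in-plane pose (tx, ty, theta)."""
    tx, ty, theta = pose
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return points3d[:, :2] @ R.T + np.array([tx, ty])

def localise(rendered_pts, observed_pts):
    """Stand-in for step (c): match each rendered feature point to the
    nearest feature point found in the current image."""
    matched = []
    for p in rendered_pts:
        d = np.linalg.norm(observed_pts - p, axis=1)
        matched.append(observed_pts[np.argmin(d)])
    return np.array(matched)

def update_state(pose, rendered_pts, matched_pts):
    """Stand-in for step (d): derive the new state vector; here only the
    translation is updated, from the mean feature displacement."""
    dx, dy = (matched_pts - rendered_pts).mean(axis=0)
    return (pose[0] + dx, pose[1] + dy, pose[2])

def track(model_pts, frames, pose):
    """Iteratively apply steps (a)-(d), one pass per input frame."""
    poses = []
    for observed in frames:                      # feature points of the current image
        rendered = project(model_pts, pose)      # (a)+(b): render, take features
        matched = localise(rendered, observed)   # (c): localise correspondences
        pose = update_state(pose, rendered, matched)  # (d): new state vector
        poses.append(pose)
    return poses
```

Fed frames in which the object shifts by one unit per frame, the recovered pose follows the motion, illustrating how each iteration's output state seeds the next iteration's rendering.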
7. A method of tracking an object in an input image stream, the method comprising the steps of:
(i) creating a three-dimensional model of said object to be tracked;
(ii) localising initial feature points in an initial image of said input image stream;
(iii) calculating an initial state vector indicative of a location of said object within said input image stream, wherein said initial state vector is calculated by minimising the square error between said localised initial feature points and corresponding initial feature points of said three-dimensional model projected into an image plane;
(a) rendering said three-dimensional object model, wherein said object model accords with either a predicted state vector calculated in step (d) of a previous iteration or said initial state vector calculated in step (iii), wherein the rendering includes calculating a mask for said input image stream to distinguish between background and foreground pixels;
(b) calculating a predefined number of point features from said object, wherein a corresponding predefined number of locations having the highest edginess is selected as features from an image of said input image stream corresponding to the previous iteration for the following localisation step;
(c) localising corresponding point features in said input image stream;
(d) calculating a new state vector from said localised feature points in said input image stream; and
(e) iteratively performing steps (a) through (d) to provide at each iteration an updated new state vector from said localised feature points.
(Dependent claims: 8, 9, 10)
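Step (b) above selects the locations of highest "edginess" as features. The claim does not specify the edge measure; one plausible reading is gradient magnitude, sketched here with numpy central differences (both function names are invented for this illustration):

```python
import numpy as np

def edginess(image):
    """Gradient-magnitude edge map via central differences; one plausible
    'edginess' measure (the claim leaves the measure unspecified)."""
    gy, gx = np.gradient(image.astype(float))  # gradients along rows, cols
    return np.hypot(gx, gy)

def select_features(image, n):
    """Pick the n pixel locations with the highest edginess, as in step (b)."""
    e = edginess(image)
    order = np.argsort(e, axis=None)[::-1][:n]   # flat indices, edgiest first
    rows, cols = np.unravel_index(order, e.shape)
    return list(zip(rows.tolist(), cols.tolist()))
```

On an image containing a vertical step edge, all selected locations cluster on the two pixel columns straddling the edge, which is the behaviour the claim relies on: features land where localisation in the next frame is best conditioned.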
11. A computer system including a non-transitory medium programmed with a set of executable instructions for carrying out a method of tracking a location of a face in an input image stream, said method comprising iteratively applying the steps of:
(a) performing a computerized three-dimensional (3D) to two-dimensional (2D) rendering of a predefined textured 3D face model indicative of at least a portion of a face to produce a 2D rendered image including said face, the 3D-to-2D rendering being performed according to a previously predicted state vector derived from a previous tracking loop or a state vector from an initialisation step, wherein said state vector comprises the pose of said face;
(b) processing said 2D rendered image to extract a series of point features from said face in said 2D rendered image;
(c) localising corresponding point features in a current 2D image of said input image stream by comparing said 2D rendered image with said current 2D image;
(d) deriving a new state vector from said localised point features in the input image stream, the new state vector being indicative of a current location of said face within said input image stream.
12. A non-transitory, tangible computer-readable carrier medium carrying a set of instructions that when executed by one or more processors cause the one or more processors to carry out a method of tracking a location of a face in an input image stream, said method comprising iteratively applying the steps of:
(a) performing a computerized three-dimensional (3D) to two-dimensional (2D) rendering of a textured 3D face model indicative of at least a portion of a face to produce a 2D rendered image including said face, the 3D-to-2D rendering being performed according to a previously predicted state vector derived from a previous tracking loop or a state vector from an initialisation step, wherein said state vector comprises the pose of said face;
(b) processing said 2D rendered image to extract a series of point features from said face in said 2D rendered image;
(c) localising corresponding point features in a current 2D image of said input image stream by comparing the 2D rendered image with said current 2D image;
(d) deriving a new state vector from said localised point features in the input image stream, the new state vector being indicative of a current location of said face within said input image stream.
13. A system for tracking an object in an input image stream, the system comprising a processor adapted to receive an input image stream, said processor being further adapted to perform the steps of:
(i) creating a three-dimensional model of said object to be tracked;
(ii) localising initial feature points in an initial image of said input image stream;
(iii) calculating an initial state vector indicative of a location of said object within said input image stream, wherein said initial state vector is calculated by minimising the square error between said localised initial feature points and corresponding initial feature points of said three-dimensional model projected into an image plane;
(a) rendering said three-dimensional object model, wherein said object model accords with either a predicted state vector calculated in step (d) of a previous iteration or said initial state vector calculated in step (iii), wherein the rendering includes calculating a mask for said input image stream to distinguish between background and foreground pixels;
(b) calculating a predefined number of feature points from said object, wherein a corresponding predefined number of locations having the highest edginess is selected as features from an image of said input image stream corresponding to the previous iteration for the following localisation step;
(c) localising corresponding point features in said input image stream;
(d) calculating a new state vector from said localised feature points in said input image stream; and
(e) iteratively performing steps (a) through (d) to provide at each iteration an updated new state vector from said localised feature points.
(Dependent claims: 14, 15)
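Step (iii) of claims 7 and 13 computes the initial state vector by minimising the squared error between the localised initial feature points and the projected model points. For an in-plane rotation and translation that least-squares problem has a closed-form orthogonal Procrustes (Kabsch) solution; the sketch below (assuming numpy, with an invented function name) illustrates the step for 2D points, though the patent does not commit to this solver or pose parameterisation:

```python
import numpy as np

def initial_state(model_pts, observed_pts):
    """Least-squares pose (tx, ty, theta) minimising
    sum ||observed_i - (R @ model_i + t)||^2  (orthogonal Procrustes)."""
    mm = model_pts.mean(axis=0)
    om = observed_pts.mean(axis=0)
    A = model_pts - mm                 # centred model feature points
    B = observed_pts - om              # centred localised feature points
    U, _, Vt = np.linalg.svd(A.T @ B)  # SVD of the cross-covariance
    if np.linalg.det((U @ Vt).T) < 0:  # enforce a proper rotation
        U[:, -1] *= -1
    R = (U @ Vt).T
    t = om - R @ mm                    # translation follows from the centroids
    return t[0], t[1], np.arctan2(R[1, 0], R[0, 0])
```

With exact correspondences the fit recovers the true pose; with noisy localisations it returns the pose of minimum squared reprojection error, which is what step (iii) requires of the initialisation.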
Specification