GENERATION OF A THREE-DIMENSIONAL REPRESENTATION OF A USER
Abstract
Described herein are technologies pertaining to generating a relatively accurate virtual three-dimensional model of a head/face of a user. Depth frames are received from a depth sensor and color frames are received from a camera, wherein such frames capture a head of a user. Based upon the depth frames and the color frames, the three-dimensional model of the head of the user is generated.
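The depth frames described above are per-pixel distance maps. Turning one into 3-D geometry is typically done by back-projecting each pixel through the sensor's pinhole intrinsics; the patent does not specify a camera model, so the following is a minimal sketch assuming a standard pinhole sensor with intrinsics fx, fy, cx, cy:

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (meters) to camera-space 3-D points.

    depth: (H, W) array of distances along the optical axis.
    fx, fy, cx, cy: pinhole intrinsics of the depth sensor.
    Returns an (N, 3) array of (X, Y, Z) points; zero-depth
    pixels (no sensor reading) are dropped.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.ravel()
    valid = z > 0
    x = (u.ravel() - cx) * z / fx
    y = (v.ravel() - cy) * z / fy
    return np.stack([x, y, z], axis=1)[valid]

# Toy 2x2 depth frame: one missing pixel, three readings at 1 m.
d = np.array([[1.0, 1.0], [0.0, 1.0]])
pts = depth_to_points(d, fx=500.0, fy=500.0, cx=0.5, cy=0.5)
print(pts.shape)  # (3, 3)
```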
20 Claims
1. A method that facilitates constructing a computer-implemented three-dimensional representation of a head of a user, the method comprising:
receiving a plurality of RGB frames of the head of the user from a camera, the plurality of RGB frames captured by the camera over a range of time;
receiving a plurality of depth frames from a depth sensor, the depth frames being indicative of distances of respective portions of the head of the user from the depth sensor, the depth frames generated by the depth sensor over the range of time;
identifying at least one feature of the head of the user in the plurality of RGB frames, the at least one feature being one of a center of an eye of the user, a center of a nose of the user, a first nasal alar of the user, or a second nasal alar of the user;
generating a three-dimensional mesh of the head of the user based at least in part upon the plurality of depth frames and the identifying of the at least one feature of the head of the user in the RGB frames; and
texturizing the three-dimensional mesh based at least in part upon the plurality of RGB frames.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.
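The final "texturizing" step of claim 1 maps color from the RGB frames onto the mesh. One plausible reading (the claim does not prescribe a projection model) is to project each mesh vertex into an RGB frame through the camera's intrinsic matrix and use the resulting pixel position as a texture coordinate; a minimal sketch assuming a pinhole RGB camera:

```python
import numpy as np

def texture_coords(vertices, K, image_size):
    """Project camera-space mesh vertices into an RGB frame to get
    per-vertex texture coordinates (one way to 'texturize' a mesh
    from color frames; an illustrative assumption, not the claimed
    method itself).

    vertices: (N, 3) camera-space points with Z > 0.
    K: 3x3 intrinsic matrix of the RGB camera.
    image_size: (width, height) of the RGB frame.
    Returns (N, 2) UV coordinates normalized to [0, 1].
    """
    proj = vertices @ K.T               # homogeneous pixel coords
    px = proj[:, :2] / proj[:, 2:3]     # perspective divide
    w, h = image_size
    return px / np.array([w, h])

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])
verts = np.array([[0.0, 0.0, 1.0]])    # point on the optical axis
uv = texture_coords(verts, K, (640, 480))
print(uv)  # [[0.5 0.5]] -- the image center
```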
12. A system, comprising:
a processor; and
a memory that comprises a plurality of components that are executed by the processor, the plurality of components comprising:
a receiver component that receives:
a plurality of RGB frames captured by an RGB camera over a range of time, each RGB frame in the plurality of RGB frames comprising an image of a head of a user; and
a plurality of depth frames captured by a depth sensor over the range of time, each depth frame in the plurality of depth frames comprising a depth map of the head of the user; and
a model generator component that:
identifies facial features of the user in the plurality of RGB frames;
aligns depth frames in the plurality of depth frames with one another based at least in part upon the facial features of the user identified in the plurality of RGB frames; and
generates an animated three-dimensional model of the head of the user based at least in part upon the depth frames that are in alignment with one another.
Dependent claims: 13, 14, 15, 16, 17, 18, 19.
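Claim 12's model generator aligns depth frames using facial features found in the RGB frames. Given the same landmarks located in two frames as 3-D points, a standard way to bring the frames into alignment is a rigid least-squares fit (the Kabsch algorithm); the patent does not mandate this particular solver, so treat this as an illustrative sketch:

```python
import numpy as np

def rigid_align(src, dst):
    """Kabsch alignment: the rotation R and translation t that best
    map source landmarks onto destination landmarks (least squares).

    src, dst: (N, 3) corresponding 3-D landmark positions from two
    depth frames. Returns (R, t) with dst ~= src @ R.T + t.
    """
    mu_s, mu_d = src.mean(0), dst.mean(0)
    H = (src - mu_s).T @ (dst - mu_d)   # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t

# Landmarks rotated 90 degrees about Z and shifted are recovered exactly.
src = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.], [1., 1., 1.]])
Rz = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
dst = src @ Rz.T + np.array([0.1, 0.2, 0.3])
R, t = rigid_align(src, dst)
print(np.allclose(src @ R.T + t, dst))  # True
```

Once consecutive frames are aligned this way, their depth data can be fused into a single model, matching the claim's "depth frames that are in alignment with one another".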
20. A computer-readable medium comprising instructions that, when executed by a processor, cause the processor to perform acts comprising:
receiving a plurality of RGB frames from an RGB camera over a range of time, the RGB frames capturing a head of a user;
receiving a plurality of depth frames from a depth sensor over the range of time, the depth frames capturing the head of the user;
identifying at least one feature of the user in the plurality of RGB frames, the at least one feature being one of a center of an eye of the user, a center of a nose of the user, a first nasal alar of the user, or a second nasal alar of the user;
generating a three-dimensional point cloud corresponding to the head of the user based at least in part upon the plurality of depth frames and the identifying of the at least one feature of the user in the plurality of RGB frames;
selecting a template head model from a library of template head models based upon the three-dimensional point cloud;
refining the head model based at least in part upon the at least one feature of the user in the plurality of RGB frames; and
subsequent to refining the head model, texturing the head model based at least in part upon the plurality of RGB frames.
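Claim 20 selects a template head model from a library based upon the point cloud. The claim leaves the selection criterion open; a minimal sketch is to compare a coarse shape descriptor of the cloud against each template and pick the nearest. The descriptor below (bounding-box extents) and the template names are illustrative assumptions only:

```python
import numpy as np

def select_template(cloud, library):
    """Pick the library template whose coarse shape descriptor is
    nearest the user's point cloud. Descriptor here: bounding-box
    extents (width, height, depth) -- an assumed stand-in for
    whatever criterion an implementation would use.

    cloud: (N, 3) head point cloud.
    library: dict mapping template name -> (M, 3) template vertices.
    """
    def extents(pts):
        return pts.max(0) - pts.min(0)

    target = extents(cloud)
    return min(library,
               key=lambda k: np.linalg.norm(extents(library[k]) - target))

# Hypothetical two-template library (dimensions in meters).
rng = np.random.default_rng(0)
library = {
    "narrow": rng.uniform(0, 1, (100, 3)) * [0.14, 0.22, 0.20],
    "wide":   rng.uniform(0, 1, (100, 3)) * [0.18, 0.22, 0.20],
}
cloud = rng.uniform(0, 1, (500, 3)) * [0.17, 0.22, 0.20]
print(select_template(cloud, library))  # wide
```

The selected template would then be refined toward the detected facial features and textured from the RGB frames, per the remaining steps of the claim.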
Specification