Periocular and audio synthesis of a full face image
First Claim
1. A wearable system for animating a user's face during speech, the wearable system comprising:
an inward-facing imaging system configured to capture images, the inward-facing imaging system comprising one or more cameras positioned such that, when the wearable system is worn by the user, a periocular region of the user's face is observable by the inward-facing imaging system and the user's lower face is unobservable by the inward-facing imaging system;
an audio sensor configured to receive the user's speech; and
a hardware processor programmed to:
acquire an image, via the inward-facing imaging system when the wearable system is worn by the user, of the periocular region of the user;
generate, based at least partly on the image of the periocular region of the user, periocular face parameters encoding a periocular conformation of at least the periocular region of the user;
acquire, by the audio sensor, an audio stream spoken by the user;
identify a phoneme in the audio stream;
access a base model that was generated using images associated with a group of people not including the user;
customize a mapping based at least in part on the base model and the image of the periocular region of the user, wherein an input of the mapping comprises the phoneme and the image of the periocular region of the user, and wherein an output of the mapping comprises lower face parameters that encode a conformation of the lower face of the user and that are deduced from an analysis of the phoneme and the image of the periocular region of the user;
apply the mapping to the image of the periocular region of the user to generate the lower face parameters;
combine the periocular face parameters and the lower face parameters to generate full face parameters associated with a three-dimensional (3D) face model; and
generate an animation of the user's face based at least in part on the full face parameters.
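Read as a pipeline, the processor steps of claim 1 amount to: observe the periocular region, identify a phoneme in the speech, infer lower-face parameters through the customized mapping, and merge both parameter sets to drive the 3D face model. A minimal sketch of that flow in Python follows; every name, phoneme code, and parameter value is a hypothetical illustration, not the patent's implementation:

```python
from dataclasses import dataclass

# Hypothetical sketch of the claim-1 pipeline. The parameter names,
# phoneme codes, and numeric values below are invented for illustration.

@dataclass
class FaceParams:
    periocular: list[float]  # encodes the observed periocular conformation
    lower: list[float]       # encodes the inferred lower-face conformation

def lower_face_mapping(phoneme: str, periocular_params: list[float]) -> list[float]:
    """Toy stand-in for the customized mapping: phoneme plus periocular
    observation in, lower-face parameters (jaw, lip rounding, cheek) out."""
    # Illustrative lookup: a rounded vowel narrows the lips, an open vowel
    # drops the jaw; the first periocular value nudges the cheek raise.
    jaw = {"aa": 0.8, "uw": 0.3, "m": 0.0}.get(phoneme, 0.4)
    lip_round = {"uw": 0.9}.get(phoneme, 0.1)
    cheek_raise = 0.5 * periocular_params[0]
    return [jaw, lip_round, cheek_raise]

def animate_frame(periocular_params: list[float], phoneme: str) -> FaceParams:
    """Combine observed periocular and inferred lower-face parameters
    into full face parameters for one animation frame."""
    lower = lower_face_mapping(phoneme, periocular_params)
    # A renderer would drive the 3D face model with these parameters.
    return FaceParams(periocular=periocular_params, lower=lower)

frame = animate_frame([0.2, 0.7], "uw")
```

In this toy version the "mapping" is a lookup table; in the claim it is a customized model derived from a base model, but the input/output contract is the same.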
Abstract
Systems and methods for synthesizing an image of the face by a head-mounted device (HMD) are disclosed. The HMD may not be able to observe a portion of the face. The systems and methods described herein can generate a mapping from a conformation of the portion of the face that is observed to a conformation of the portion of the face that is not imaged. The HMD can receive an image of a portion of the face and use the mapping to determine a conformation of the portion of the face that is not observed. The HMD can combine the observed and unobserved portions to synthesize a full face image.
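One common way to realize "full face parameters associated with a three-dimensional (3D) face model" is a blendshape rig, in which the combined periocular and lower-face parameters weight per-vertex deformation bases added to a neutral mesh. The blendshape form below is an assumption for illustration; the patent does not mandate it:

```python
import numpy as np

# Hypothetical blendshape combination: full-face vertices are the neutral
# mesh plus parameter-weighted offsets. The "full face parameters" of the
# claims could be such blendshape weights; this form is an assumption.

def synthesize_face(neutral, blendshapes, periocular_w, lower_w):
    # Combining the two parameter sets yields the full face parameters.
    weights = np.concatenate([periocular_w, lower_w])
    # neutral: (V, 3) vertices; blendshapes: (K, V, 3) per-parameter offsets.
    return neutral + np.tensordot(weights, blendshapes, axes=1)

neutral = np.zeros((4, 3))   # toy 4-vertex neutral mesh
shapes = np.ones((2, 4, 3))  # two toy blendshape bases
mesh = synthesize_face(neutral, shapes, np.array([0.25]), np.array([0.5]))
```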
23 Claims
1. A wearable system for animating a user's face during speech, the wearable system comprising: (text identical to the First Claim recited above). Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
13. A method for animating a user's face during speech, the method comprising:
accessing an image of the periocular region of a user acquired by an inward-facing imaging system configured to capture images, the inward-facing imaging system comprising one or more cameras positioned such that, when the wearable system is worn by the user, a periocular region of the user's face is observable by the inward-facing imaging system and the user's lower face is unobservable by the inward-facing imaging system;
determining, based at least partly on the image, periocular face parameters encoding a periocular conformation of at least the periocular region of the user;
accessing an audio stream spoken by the user acquired by an audio sensor;
identifying a phoneme in the audio stream;
accessing a base model that was generated using images associated with a group of people not including the user;
customizing a mapping based at least in part on the base model and the image of the periocular region of the user, wherein an input of the mapping comprises the phoneme and the image of the periocular region of the user, and wherein an output of the mapping comprises lower face parameters that encode a conformation of the lower face of the user and that are deduced from an analysis of the phoneme and the image of the periocular region of the user;
applying the mapping to the image to generate the lower face parameters;
combining the periocular face parameters and the lower face parameters to generate full face parameters associated with a three-dimensional (3D) face model; and
generating a full face image based at least partly on the full face parameters.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23.
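Both independent claims customize a user-specific mapping from a base model built on other people's images. One lightweight way to picture that customization is fitting a per-user correction on top of a generic model during an enrollment pass where ground-truth lower-face parameters are available. The linear form, the shapes, and the enrollment assumption below are all illustrative guesses, not the patent's method:

```python
import numpy as np

# Base model: a generic linear map from [phoneme one-hot | periocular
# parameters] to lower-face parameters, standing in for a model learned
# from a group of people not including the user. Weights are random
# placeholders for a trained model.
rng = np.random.default_rng(0)
W_base = rng.standard_normal((3, 5))  # 3 lower-face outputs, 5 input features

def apply_mapping(W, bias, phoneme_onehot, periocular):
    """Customized mapping: base model output plus the per-user correction."""
    x = np.concatenate([phoneme_onehot, periocular])
    return W @ x + bias

def customize(W, enroll_inputs, enroll_targets):
    """Per-user bias: mean residual of the base model on enrollment frames
    where the user's true lower-face parameters are known (an assumption)."""
    residuals = [t - W @ x for x, t in zip(enroll_inputs, enroll_targets)]
    return np.mean(residuals, axis=0)

# Toy enrollment with a single calibration frame.
x0 = np.concatenate([np.array([1.0, 0.0, 0.0]), np.array([0.2, 0.7])])
t0 = np.array([0.5, 0.1, 0.3])
user_bias = customize(W_base, [x0], [t0])

# At runtime, base model + user correction gives the customized mapping.
out = apply_mapping(W_base, user_bias, np.array([1.0, 0.0, 0.0]),
                    np.array([0.2, 0.7]))
```

With one enrollment frame the correction reproduces the calibration target exactly; with many frames it averages the base model's per-user error, which is the intuition behind customizing a population-level model to an individual.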
Specification