Avatar-Mediated Telepresence Systems with Enhanced Filtering
Abstract
Methods and systems using photorealistic avatars to provide live interaction. Several groups of innovations are described. In one such group, trajectory information included with the avatar model makes the model 4D rather than 3D. In another group, a fallback representation is provided with deliberately low quality. In another group, avatar fidelity is treated as a security requirement. In another group, avatar representation is driven by both video and audio inputs, and audio output depends on both video and audio input. In another group, the avatar representation is updated while in use, to refine the representation by a training process. In another group, the avatar representation uses the best-quality input to drive avatar animation when more than one input is available, switching to a secondary input while the primary input is insufficient. In another group, the avatar representation can be paused or put into a standby mode.
291 Citations
17 Claims
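The first group of innovations above, and the "wherein" clause of claim 1 below, describe a model that carries time-dependent trajectories for elements of the avatar's appearance, making it 4D rather than 3D. A minimal sketch of that idea, with hypothetical names (`Trajectory`, `AvatarModel4D`) and linear interpolation chosen purely for illustration, not drawn from the patent:

```python
from bisect import bisect_right
from dataclasses import dataclass, field

@dataclass
class Trajectory:
    """Time-dependent path for one model element (e.g. a blendshape weight)."""
    times: list    # monotonically increasing timestamps (seconds)
    values: list   # value of the element at each timestamp

    def sample(self, t: float) -> float:
        """Linearly interpolate the element's value at time t, clamped at the ends."""
        if t <= self.times[0]:
            return self.values[0]
        if t >= self.times[-1]:
            return self.values[-1]
        i = bisect_right(self.times, t)
        t0, t1 = self.times[i - 1], self.times[i]
        v0, v1 = self.values[i - 1], self.values[i]
        return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

@dataclass
class AvatarModel4D:
    """Static 3D appearance parameters plus per-element trajectories
    (the time axis being the fourth dimension of the model)."""
    static_params: dict = field(default_factory=dict)
    trajectories: dict = field(default_factory=dict)  # element name -> Trajectory

    def pose_at(self, t: float) -> dict:
        """Evaluate every animated element at time t."""
        return {name: traj.sample(t) for name, traj in self.trajectories.items()}
```

Because both endpoints share the model, only sparse trajectory samples need to cross the network; the receiver can evaluate `pose_at` at its own display frame rate.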
1. A system, comprising:
input devices which capture audio and video streams from a first user's actual appearance and movements;
a first computing system which receives video and audio data from the input devices, and accordingly generates, according to a known model, an animated photorealistic 3D avatar with trajectories and cues for animation, which substantially replicates the appearance, gestures, and inflections of the first user in real time; and
a second computing system, remote from said first computing system, which uses said trajectories and cues to reconstruct a photorealistic real-time 3D avatar, in accordance with the known model, which varies, in accordance with said trajectories and cues, to match the appearance, gestures, and inflections of the first user, and outputs said avatar to be shown on a display to a second user;
wherein the known model includes time-dependent trajectories for at least some elements of the user's dynamically simulated appearance.
Dependent claims: 2, 3, 4, 5.
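The split claim 1 describes, with the capture-side system extracting animation cues and the remote system reconstructing the avatar from a shared known model, can be sketched as follows. All names, cue fields, and the JSON wire format are illustrative assumptions, not taken from the specification:

```python
import json

def encode_cues(frame_time, head_pose, mouth_open, brow_raise):
    """Capture-side (first computing system): pack per-frame trajectory
    samples into a compact wire message. Only the cues travel over the
    network; the photorealistic model is already known to both endpoints."""
    return json.dumps({
        "t": frame_time,
        "head": head_pose,    # e.g. [yaw, pitch, roll] in degrees (assumed)
        "mouth": mouth_open,  # 0.0 (closed) .. 1.0 (fully open) (assumed)
        "brow": brow_raise,
    })

def decode_and_apply(message, known_model):
    """Display-side (second computing system): reconstruct the avatar pose
    by applying the received cues on top of the shared baseline model."""
    cues = json.loads(message)
    pose = dict(known_model)  # start from the shared known model
    pose.update(head=cues["head"], mouth=cues["mouth"], brow=cues["brow"])
    return cues["t"], pose
```

The design point the claim turns on is that the heavy appearance data never crosses the link per frame; only low-bandwidth trajectories and cues do.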
6. A method, comprising:
capturing audio and video streams from a first user's actual appearance and movements, and accordingly generating, according to a known model, a first animated photorealistic 3D avatar which, with associated trajectories and cues for animation, substantially replicates the gestures, inflections, and general appearance of the first user in real time;
transmitting the trajectories and cues for animation; and
receiving, from a second computing system, trajectories and cues to reconstruct a second photorealistic real-time 3D avatar in accordance with the known model, reconstructing the second avatar, and displaying the reconstructed avatar to the first user;
wherein the known model includes time-dependent trajectories for at least some elements of a user's dynamically simulated appearance.
Dependent claims: 7, 8, 9, 10.
11. A system, comprising:
input devices which capture audio and video streams from a first user's actual appearance and movements;
a first computing system which receives video and audio data from the input devices, and accordingly generates, according to a known model, a data stream which uses a known avatar model to define an animated photorealistic 3D avatar which replicates the gestures, inflections, and general appearance of the first user in real time; and
a second computing system, remote from said first computing system, which uses said data stream and said known model to reconstruct a photorealistic real-time 3D avatar which replicates the gestures, inflections, and general appearance of the first user, and outputs said avatar to be shown on a display to a second user;
wherein, during normal operation, the second computing system outputs said avatar with photorealism which is greater than the maximum of the uncanny valley; and
wherein, if normal operation is impeded, the second computing system either outputs said avatar with photorealism which is less than the minimum of the uncanny valley, or else outputs trajectories and cues that have been predefined in sequence for such purpose.
Dependent claims: 12, 13, 14, 15, 16.
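The fallback logic in the two "wherein" clauses of claim 11 can be sketched as a mode selector. The numeric thresholds bracketing the uncanny valley, the inputs (`link_ok`, `tracking_ok`), and all names are hypothetical; the claim itself specifies only the relationship to the valley, not any particular scores:

```python
# Assumed photorealism scores bracketing the "uncanny valley": renderings
# scoring between VALLEY_MIN and VALLEY_MAX look eerily almost-human and
# are avoided in both operating modes.
VALLEY_MIN = 0.4   # below this: clearly stylized, acceptable as fallback
VALLEY_MAX = 0.8   # above this: convincingly photorealistic

def choose_representation(link_ok: bool, tracking_ok: bool,
                          predefined_sequence=None) -> dict:
    """Select the avatar output per claim 11: photorealism above the valley
    during normal operation; below it, or a predefined cue sequence,
    when normal operation is impeded."""
    if link_ok and tracking_ok:
        return {"mode": "live", "photorealism": 0.95}      # > VALLEY_MAX
    if predefined_sequence is not None:
        return {"mode": "standby", "cues": predefined_sequence}
    return {"mode": "fallback", "photorealism": 0.25}      # < VALLEY_MIN
```

The point of skipping the middle range entirely is that a degraded-but-almost-photorealistic avatar is worse, perceptually, than an obviously stylized one.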
17-67. (canceled)
Specification