VOICE EFFECTS BASED ON FACIAL EXPRESSIONS
Abstract
Embodiments of the present disclosure can provide systems, methods, and computer-readable medium for adjusting audio and/or video information of a video clip based at least in part on facial feature and/or voice feature characteristics extracted from hardware components. For example, in response to detecting a request to generate an avatar video clip of a virtual avatar, a video signal associated with a face in a field of view of a camera and an audio signal may be captured. Voice feature characteristics and facial feature characteristics may be extracted from the audio signal and the video signal, respectively. In some examples, in response to detecting a request to preview the avatar video clip, an adjusted audio signal may be generated based at least in part on the facial feature characteristics and the voice feature characteristics, and a preview of the video clip of the virtual avatar using the adjusted audio signal may be displayed.
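As a rough illustration of the flow the abstract describes (extract voice and facial feature characteristics, then generate an adjusted audio signal from both), the following Python sketch may help. The text above does not say how features are extracted or how the audio is adjusted, so everything here is an assumption made for illustration: the names `FacialFeatures`, `VoiceFeatures`, `extract_voice_features`, and `adjust_audio` are hypothetical, as are the zero-crossing pitch estimate and the specific pitch and gain rules.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class FacialFeatures:
    """Hypothetical per-clip summary of the face; both values lie in [0, 1]."""
    mouth_open: float
    brow_raise: float


@dataclass
class VoiceFeatures:
    """Hypothetical per-clip summary of the voice."""
    pitch_hz: float
    loudness: float  # root-mean-square amplitude


def extract_voice_features(samples: List[float], sample_rate: int) -> VoiceFeatures:
    """Estimate pitch from zero crossings and loudness from RMS (assumed method)."""
    if not samples:
        return VoiceFeatures(pitch_hz=0.0, loudness=0.0)
    # Count sign changes between neighboring samples; a tone crosses zero twice per cycle.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    duration = len(samples) / sample_rate
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return VoiceFeatures(pitch_hz=crossings / (2 * duration), loudness=rms)


def adjust_audio(samples: List[float], face: FacialFeatures,
                 voice: VoiceFeatures) -> List[float]:
    """Return an adjusted signal that depends on both feature sets.

    Assumed rules, not taken from the source: raised brows raise the pitch
    by up to 20%, and an open mouth boosts a quiet recording by up to 50%.
    """
    brow = min(max(face.brow_raise, 0.0), 1.0)
    mouth = min(max(face.mouth_open, 0.0), 1.0)
    factor = 1.0 + 0.2 * brow
    gain = (1.0 + 0.5 * mouth) if voice.loudness < 0.5 else 1.0

    # Naive pitch shift: read the input faster using linear interpolation.
    # This also shortens the clip; a real system would preserve duration.
    n_out = int(len(samples) / factor)
    out = []
    for i in range(n_out):
        pos = i * factor
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(gain * ((1.0 - frac) * samples[lo] + frac * samples[hi]))
    return out
```

In this sketch the feature extraction would run during the recording session, and `adjust_audio` would be called only once a preview is requested, mirroring the two stages named in the abstract.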
20 Claims
1. A method, comprising:
at an electronic device having at least a camera and a microphone:
displaying a virtual avatar generation interface;
displaying first preview content of a virtual avatar in the virtual avatar generation interface, the first preview content of the virtual avatar corresponding to realtime preview video frames of a user headshot in a field of view of the camera and associated headshot changes in an appearance;
while displaying the first preview content of the virtual avatar, detecting an input in the virtual avatar generation interface;
in response to detecting the input in the virtual avatar generation interface:
capturing, via the camera, a video signal associated with the user headshot during a recording session;
capturing, via the microphone, a user audio signal during the recording session;
extracting audio feature characteristics from the captured user audio signal; and
extracting facial feature characteristics associated with the face from the captured video signal; and
in response to detecting expiration of the recording session:
generating an adjusted audio signal from the captured audio signal based at least in part on the facial feature characteristics and the audio feature characteristics;
generating second preview content of the virtual avatar in the virtual avatar generation interface according to the facial feature characteristics and the adjusted audio signal; and
presenting the second preview content in the virtual avatar generation interface.
Dependent claims: 2, 3, 4
5. An electronic device, comprising:
a camera;
a microphone; and
one or more processors in communication with the camera and the microphone, the one or more processors configured to:
while displaying a first preview of a virtual avatar, detecting an input in a virtual avatar generation interface;
in response to detecting the input in the virtual avatar generation interface, initiating a capture session including:
capturing, via the camera, a video signal associated with a face in a field of view of the camera;
capturing, via the microphone, an audio signal associated with the captured video signal;
extracting audio feature characteristics from the captured audio signal; and
extracting facial feature characteristics associated with the face from the captured video signal; and
in response to detecting expiration of the capture session:
generating an adjusted audio signal based at least in part on the audio feature characteristics and the facial feature characteristics; and
displaying a second preview of the virtual avatar in the virtual avatar generation interface according to the facial feature characteristics and the adjusted audio signal.
Dependent claims: 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A computer-readable storage medium storing computer-executable instructions that, when executed by one or more processors, configure the one or more processors to perform operations comprising:
in response to detecting a request to generate an avatar video clip of a virtual avatar:
capturing, via a camera of an electronic device, a video signal associated with a face in a field of view of the camera;
capturing, via a microphone of the electronic device, an audio signal;
extracting voice feature characteristics from the captured audio signal; and
extracting facial feature characteristics associated with the face from the captured video signal; and
in response to detecting a request to preview the avatar video clip:
generating an adjusted audio signal based at least in part on the facial feature characteristics and the voice feature characteristics; and
displaying a preview of the video clip of the virtual avatar using the adjusted audio signal.
Dependent claims: 17, 18, 19, 20
Specification