Method for generating mouth features of an animated or physical character
Abstract
A method and system for determining the mouth features, i.e., the lip position and mouth opening, of an animated character. Lip position is the shape and position of the lips of the animated character. Mouth opening is the amount of opening between the lips of the animated character. A time-domain signal corresponding to the speech of the animated character may be digitally sampled. The sampled voice signal is separated into a number of frames of a specific time length. A Hamming window is applied to each frame to de-emphasize the boundary conditions of each frame. A linear predictive coding (LPC) technique is applied to each of the frames, resulting in a gain for each of the frames and a number of k coefficients, or reflection coefficients, including a voiced/nonvoiced coefficient and a pitch coefficient. The reflection coefficients for each frame are mapped to the Cepstral domain resulting in a number of Cepstral coefficients for each frame. The Cepstral coefficients are vector quantized to achieve a vector quantization result representing the character's lip position. For a predetermined number of frames, a local maximum and a local minimum of gain are found. The gain for each of the frames containing a local minimum is set to a fully closed mouth opening and the gain for each of the frames containing a local maximum is set to a fully open mouth opening. The vector quantization result and gain are applied to an empirically derived mapping function to determine the mouth features of the character.
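The abstract's front end (sampling, framing, Hamming windowing) is straightforward to sketch. Below is a minimal illustration in Python/NumPy; the 8 kHz sample rate and 20 ms frame length are illustrative assumptions, not values taken from the patent.

```python
import numpy as np

def frame_and_window(signal, sample_rate=8000, frame_ms=20):
    """Split a sampled voice signal into fixed-length frames and apply
    a Hamming window to de-emphasize each frame's boundaries."""
    frame_len = int(sample_rate * frame_ms / 1000)            # samples per frame
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    return frames * np.hamming(frame_len)                     # windowed frames
```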
20 Claims
1. A method for determining the mouth features for a speaking character, comprising the steps of:

sampling a time-domain audio signal;
separating the time-domain audio signal into a plurality of frames;
applying a window to each of the plurality of frames; and
applying a linear predictive coding (LPC) technique to each of the plurality of frames to achieve a plurality of LPC coefficients and a gain for each of the plurality of frames, whereby the LPC coefficients and gain for each frame are used to determine the mouth features for the character on a frame-by-frame basis.

Dependent claims: 2-10.
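As a rough illustration of the LPC step in claim 1, the sketch below runs the Levinson-Durbin recursion on a windowed frame's autocorrelation, yielding reflection (k) coefficients, direct-form coefficients, and a gain. The order-10 analysis is an assumed, typical choice; the claim does not bind the method to a particular order or algorithm.

```python
import numpy as np

def lpc_frame(frame, order=10):
    """Levinson-Durbin recursion on the frame autocorrelation.
    Returns the reflection (k) coefficients, the direct-form
    coefficients of A(z) = 1 + sum a[m] z^-m, and the gain."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order]  # lags 0..order
    a = np.zeros(order + 1)
    a[0] = 1.0
    k = np.zeros(order)
    err = r[0]                                 # zero-lag energy
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k[i - 1] = -acc / err                  # i-th reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k[i - 1] * a_prev[i - j]
        a[i] = k[i - 1]
        err *= 1.0 - k[i - 1] ** 2             # residual prediction-error energy
    return k, a, np.sqrt(err)                  # gain taken as sqrt of residual energy
```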
11. A computer-implemented method for generating mouth features of a character, comprising the steps of:

sampling a time-domain voice signal;
separating the time-domain voice signal into a plurality of frames;
applying a windowing technique to each frame;
applying a linear predictive coding (LPC) technique to each of the plurality of frames to generate a plurality of LPC coefficients and a gain for each frame;
mapping the plurality of LPC coefficients to the Cepstral domain to obtain a plurality of Cepstral coefficients for each frame;
vector quantizing the Cepstral coefficients to obtain a lip position for each frame;
determining a local maximum of the gain and a local minimum of the gain within a predetermined number of frames;
adjusting the gain for the frame containing the local minimum to equal a minimum gain level;
adjusting the gain for the frame containing the local maximum to equal a maximum gain level; and
applying the lip position and the gain for each frame to an empirically derived mapping function to obtain the mouth features of the character for each frame.

Dependent claims: 12-15.
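Two of claim 11's steps lend themselves to short sketches: the LPC-to-Cepstral mapping (here via the standard recursion on the direct-form coefficients that lpc_frame above returns) and the gain adjustment that forces a fully closed and fully open mouth within each block of frames. The block size of 8 frames and the [0, 1] gain range are illustrative assumptions, not values from the claim.

```python
import numpy as np

def lpc_to_cepstrum(a, n_ceps):
    """Map direct-form LPC coefficients (A(z) = 1 + sum a[m] z^-m,
    a[0] == 1) to cepstral coefficients with the usual recursion."""
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n] if n < len(a) else 0.0
        for m in range(1, n):
            acc += (m / n) * c[m] * (a[n - m] if n - m < len(a) else 0.0)
        c[n] = -acc
    return c[1:]

def normalize_gain(gains, block=8, g_min=0.0, g_max=1.0):
    """Within each block of frames, pin the local-minimum frame to the
    fully-closed level and the local-maximum frame to the fully-open level."""
    g = np.asarray(gains, dtype=float).copy()
    for start in range(0, len(g), block):
        s = slice(start, min(start + block, len(g)))
        g[start + np.argmin(g[s])] = g_min     # mouth fully closed
        g[start + np.argmax(g[s])] = g_max     # mouth fully open
    return g
```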
16. A computer system for synchronizing the mouth features of a speaking performer to a voice signal transmitted by the performer, comprising:

a processor; and
a memory storage device for storing a program module;
the processor, responsive to instructions from the program module, being operative to:
sample the voice signal;
break the voice signal into a number of frames;
apply a windowing technique to each of the frames;
apply a linear predictive coding technique to each frame to obtain a number of reflection coefficients and a gain coefficient for each frame;
transform the reflection coefficients into Cepstral coefficients;
determine a lip position for each frame that corresponds to the Cepstral coefficients for each frame;
adjust the gain of certain frames of the voice signal so that a mouth of the performer fully opens and fully closes within a predetermined number of frames; and
determine the mouth features corresponding to each frame using the gain and lip position for each frame.

Dependent claims: 17-20.
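The last two operative steps can be sketched under the assumption that the "empirically derived mapping function" is representable as a lookup table: the frame's cepstral vector is matched to its nearest codebook entry (the lip position), and that index plus the adjusted gain select the final mouth features. The codebook and feature table here are hypothetical placeholders, not structures disclosed by the claim.

```python
import numpy as np

def lip_position(cepstral_vec, codebook):
    """Vector quantization: return the index of the nearest code vector;
    the index serves as the frame's lip position."""
    return int(np.argmin(np.linalg.norm(codebook - cepstral_vec, axis=1)))

def mouth_features(lip_idx, gain, feature_table):
    """Select mouth features from a (hypothetical) empirically derived
    table keyed by lip position and quantized mouth opening, where the
    gain (assumed normalized to [0, 1]) indexes the opening axis."""
    opening = int(round(gain * (feature_table.shape[1] - 1)))
    return feature_table[lip_idx, opening]
```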
Specification