Coding of facial animation parameters (FAPs) for transmission of synthetic talking head video over band limited channels
First Claim

1. A method of coding facial animation parameters (FAPs) for synthetic video, comprising:
- generating a temporal sequence of FAP frames for a synthetic video signal, each said frame representing a time sample of spatially correlated parameters in an n-dimensional space;
- transforming each said FAP frame from the n-dimensional space into an m-dimensional subspace, where m < n, to reduce the intra-frame spatial correlation; and
- coding the temporal sequence of m-dimensional FAP frames into a bitstream.
Abstract
A FAP coding technique that realizes enough coding gain to transmit multiple synthetic talking heads over a band limited channel without introducing perceptible artifacts into the reconstructed synthetic talking heads. This is accomplished by exploiting the spatial correlation of each FAP frame and/or the temporal correlation of the sequence of FAP frames. To remove intra-frame correlation, each FAP frame is transformed prior to segmentation from the n-dimensional space into an m-dimensional subspace where m<n using an energy compaction transform. To remove inter-frame redundancy, the sequence is segmented and each parameter vector is transform coded to decorrelate the vector.
22 Claims
1. A method of coding facial animation parameters (FAPs) for synthetic video, comprising:
- generating a temporal sequence of FAP frames for a synthetic video signal, each said frame representing a time sample of spatially correlated parameters in an n-dimensional space;
- transforming each said FAP frame from the n-dimensional space into an m-dimensional subspace, where m < n, to reduce the intra-frame spatial correlation; and
- coding the temporal sequence of m-dimensional FAP frames into a bitstream.

View Dependent Claims: 2, 3, 4, 5
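The intra-frame stage of claim 1 calls for an energy-compaction transform from n dimensions down to m. A minimal sketch of one such transform, the Karhunen-Loève transform (PCA), assuming a training set of FAP frames is available to fit the basis; the function names and the numpy-based approach are illustrative, not the patent's actual implementation:

```python
import numpy as np

def fit_compaction_transform(frames, m):
    """Fit an energy-compaction (KLT/PCA-style) basis from training FAP frames.

    frames: (T, n) array of FAP frames; m: target subspace dimension, m < n.
    Returns (mean, basis) with basis of shape (n, m). Illustrative sketch only.
    """
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Eigenvectors of the covariance decorrelate the parameters; keeping the
    # m largest-variance directions compacts the frame's energy into m values.
    cov = centered.T @ centered / len(frames)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:m]
    return mean, eigvecs[:, order]

def project(frame, mean, basis):
    """Map an n-dimensional FAP frame to its m-dimensional coefficients."""
    return (frame - mean) @ basis

def reconstruct(coeffs, mean, basis):
    """Approximate the original n-dimensional frame from m coefficients."""
    return coeffs @ basis.T + mean
```

Because the FAPs within a frame are spatially correlated, most of the frame's energy lands in the first few coefficients, which is the source of the claimed coding gain.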
6. A method of coding facial animation parameters (FAPs) for synthetic video, comprising:
- generating a temporal sequence of FAP frames for a synthetic video signal, each said frame representing a time sample of parameters in an n-dimensional space;
- segmenting the temporal sequence into length L blocks of FAP frames to define n length L parameter vectors that exhibit significant inter-frame temporal correlation;
- transform coding each said parameter vector into L transform coefficients to reduce the inter-frame temporal correlation; and
- coding the transform coefficients into a bitstream.

View Dependent Claims: 7, 8, 9, 10
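The segmentation and temporal transform-coding steps of claim 6 can be sketched as follows. An orthonormal DCT-II is written out directly so the example needs nothing beyond numpy; the block length L and function names are illustrative assumptions:

```python
import numpy as np

def dct_ii(x):
    """Orthonormal DCT-II of a length-L vector (naive O(L^2) form)."""
    L = len(x)
    X = np.empty(L)
    for k in range(L):
        s = sum(x[i] * np.cos(np.pi * (i + 0.5) * k / L) for i in range(L))
        X[k] = s * np.sqrt((1.0 if k == 0 else 2.0) / L)
    return X

def transform_code_sequence(sequence, L):
    """Segment a (T, n) FAP sequence into length-L blocks and transform code
    the n per-parameter time vectors of each block (claim 6, steps 2-3)."""
    T, n = sequence.shape
    coded = []
    for start in range(0, T - T % L, L):
        block = sequence[start:start + L]                 # (L, n)
        # Column j is one length-L parameter vector; the DCT concentrates its
        # slowly varying (temporally correlated) energy in few coefficients.
        coded.append(np.stack([dct_ii(block[:, j]) for j in range(n)], axis=1))
    return coded
```

Since a talking-head parameter changes slowly from frame to frame, each length-L parameter vector is smooth in time and its DCT spectrum is dominated by the low-order coefficients.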
11. A method of coding facial animation parameters (FAPs) for synthetic video, comprising:
- generating a temporal sequence of FAP frames for a synthetic video signal, each said frame representing a time sample of parameters in an n-dimensional space;
- segmenting the temporal sequence into length L segments of FAP frames to define n length L parameter vectors that exhibit significant inter-frame temporal correlation;
- using a discrete cosine transform (DCT) to transform each said parameter vector into L transform coefficients to reduce the inter-frame temporal correlation, thereby achieving a measure of coding gain, wherein the first transform coefficient is a DC coefficient and the remaining transform coefficients are AC coefficients;
- for each said parameter vector, using a one-step unweighted predictive code to code and quantize the DC coefficient from block to block;
- quantizing the AC coefficients in the current block;
- run-length coding the zero-valued AC coefficients;
- Huffman coding the quantized DC coefficients, the non-zero quantized AC coefficients and the run-length codes; and
- multiplexing the entropy-coded DC and AC coefficients and run-length codes into a bitstream.

View Dependent Claims: 12, 13
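The coefficient-coding steps above (one-step DC prediction, AC quantization, and run-length coding of zero-valued ACs) can be sketched for a single parameter vector; the quantizer step sizes and function name are hypothetical, and the final Huffman and multiplexing stages are deliberately omitted:

```python
import numpy as np

def code_block(coeffs, prev_dc, q_dc=1.0, q_ac=1.0):
    """Code one parameter vector's DCT coefficients (claim 11 sketch).

    coeffs: length-L DCT coefficients, coeffs[0] being the DC term.
    prev_dc: reconstructed DC of the previous block (one-step predictor).
    q_dc, q_ac: hypothetical uniform quantizer step sizes.
    Returns (dc_residual_index, ac_symbols, new_prev_dc).
    """
    # One-step unweighted prediction: transmit the quantized DC difference.
    dc_idx = int(round((coeffs[0] - prev_dc) / q_dc))
    rec_dc = prev_dc + dc_idx * q_dc
    # Quantize the ACs, then run-length code zero runs as (run, level) pairs.
    ac = [int(round(c / q_ac)) for c in coeffs[1:]]
    symbols, run = [], 0
    for level in ac:
        if level == 0:
            run += 1
        else:
            symbols.append((run, level))
            run = 0
    if run:
        symbols.append((run, 0))  # trailing zero run (end-of-block marker)
    # dc_idx and the (run, level) symbols would then be Huffman coded and
    # multiplexed into the bitstream (entropy stage not shown here).
    return dc_idx, symbols, rec_dc
```

Predicting DC from the previous block's reconstructed (not original) DC keeps encoder and decoder in lockstep, the usual DPCM design choice.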
14. A method of coding facial animation parameters (FAPs) for synthetic video, comprising:
- generating a temporal sequence of FAP frames for a synthetic video signal, each said frame representing a time sample of spatially correlated parameters in an n-dimensional space;
- transforming each said FAP frame from the n-dimensional space into an m-dimensional subspace, where m < n, to reduce the intra-frame spatial correlation and achieve a measure of coding gain;
- segmenting the temporal sequence into length L blocks of FAP frames to define m length L parameter vectors;
- transform coding each said parameter vector into L transform coefficients to reduce the inter-frame temporal correlation and achieve additional coding gain; and
- coding the transform coefficients into a bitstream.

View Dependent Claims: 15, 16, 17
18. A method of coding facial animation parameters (FAPs) for transmitting synthetic video over a band limited channel, comprising:
- generating a plurality of talking head FAP sequences for a single video signal, each said FAP frame representing a time sample of spatially correlated parameters in an n-dimensional space, the uncoded bandwidth of each said sequence being less than the bandwidth of said channel and the total uncoded bandwidth of said plurality of sequences being greater than the channel bandwidth;
- transforming each said FAP frame for each said sequence from the n-dimensional space into an m-dimensional subspace, where m < n, to reduce the intra-frame spatial correlation and achieve a measure of coding gain;
- segmenting each said temporal sequence into length L blocks of FAP frames to define m length L parameter vectors;
- transform coding each said parameter vector into L transform coefficients to reduce the inter-frame temporal correlation and achieve additional coding gain; and
- coding the transform coefficients for all said sequences into a video bitstream having a total coded bandwidth that is less than the channel bandwidth.

View Dependent Claims: 19, 20, 21, 22
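The bandwidth inequalities of claim 18 (each uncoded sequence fits the channel, their sum does not, and the coded total does again) can be made concrete with a small worked example; every number below is hypothetical and not taken from the patent:

```python
# Hypothetical numbers: 68 FAPs per frame, 16 bits per uncoded parameter,
# 30 frames per second, and a 64 kbit/s band limited channel.
uncoded_bps = 68 * 16 * 30     # 32,640 bit/s per talking head, uncoded
channel_bps = 64_000           # channel capacity
heads = 3

# Each head alone fits the channel, but three uncoded heads do not...
assert uncoded_bps < channel_bps
assert heads * uncoded_bps > channel_bps

# ...so a combined spatial + temporal coding gain (assumed 10x here) is
# what lets all three coded sequences share the channel with room to spare.
coding_gain = 10
coded_total_bps = heads * uncoded_bps / coding_gain
assert coded_total_bps < channel_bps
```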
Specification