Coding of facial animation parameters (FAPs) for transmission of synthetic talking head video over band limited channels
First Claim

1. A method of coding facial animation parameters (FAPs) for synthetic video, comprising:
- generating a temporal sequence of FAP frames for a synthetic video signal, each said frame representing a time sample of spatially correlated parameters in an n-dimensional space;
- transforming each said FAP frame from the n-dimensional space into an m-dimensional subspace, where m < n, to reduce the intra-frame spatial correlation; and
- coding the temporal sequence of m-dimensional FAP frames into a bitstream.
Abstract
A FAP coding technique that realizes enough coding gain to transmit multiple synthetic talking heads over a band limited channel without introducing perceptible artifacts into the reconstructed synthetic talking heads. This is accomplished by exploiting the spatial correlation of each FAP frame and/or the temporal correlation of the sequence of FAP frames. To remove intra-frame correlation, each FAP frame is transformed prior to segmentation from the n-dimensional space into an m-dimensional subspace where m<n using an energy compaction transform. To remove inter-frame redundancy, the sequence is segmented and each parameter vector is transform coded to decorrelate the vector.
22 Claims
1. A method of coding facial animation parameters (FAPs) for synthetic video, comprising:
- generating a temporal sequence of FAP frames for a synthetic video signal, each said frame representing a time sample of spatially correlated parameters in an n-dimensional space;
- transforming each said FAP frame from the n-dimensional space into an m-dimensional subspace, where m < n, to reduce the intra-frame spatial correlation; and
- coding the temporal sequence of m-dimensional FAP frames into a bitstream.

View Dependent Claims: 2, 3, 4, 5
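The intra-frame stage of claim 1 calls for an energy-compaction transform from n dimensions down to m. A minimal sketch of one such transform, the Karhunen-Loève transform (PCA), assuming a training set of FAP frames is available to fit the basis; the function names and the numpy-based approach are illustrative, not the patent's actual implementation:

```python
import numpy as np

def fit_compaction_transform(frames, m):
    """Fit an energy-compaction (KLT/PCA-style) basis from training FAP frames.

    frames: (T, n) array of FAP frames; m: target subspace dimension, m < n.
    Returns (mean, basis) with basis of shape (n, m). Illustrative sketch only.
    """
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Eigenvectors of the covariance decorrelate the parameters; keeping the
    # m largest-variance directions compacts the frame's energy into m values.
    cov = centered.T @ centered / len(frames)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:m]
    return mean, eigvecs[:, order]

def project(frame, mean, basis):
    """Map an n-dimensional FAP frame to its m-dimensional coefficients."""
    return (frame - mean) @ basis

def reconstruct(coeffs, mean, basis):
    """Approximate the original n-dimensional frame from m coefficients."""
    return coeffs @ basis.T + mean
```

Because the FAPs within a frame are spatially correlated, most of the frame's energy lands in the first few coefficients, which is the source of the claimed coding gain.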
6. A method of coding facial animation parameters (FAPs) for synthetic video, comprising:
- generating a temporal sequence of FAP frames for a synthetic video signal, each said frame representing a time sample of parameters in an n-dimensional space;
- segmenting the temporal sequence into length L blocks of FAP frames to define n length L parameter vectors that exhibit significant inter-frame temporal correlation;
- transform coding each said parameter vector into L transform coefficients to reduce the inter-frame temporal correlation; and
- coding the transform coefficients into a bitstream.

View Dependent Claims: 7, 8, 9, 10
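The segmentation and temporal transform-coding steps of claim 6 can be sketched as follows. An orthonormal DCT-II is written out directly so the example needs nothing beyond numpy; the block length L and function names are illustrative assumptions:

```python
import numpy as np

def dct_ii(x):
    """Orthonormal DCT-II of a length-L vector (naive O(L^2) form)."""
    L = len(x)
    X = np.empty(L)
    for k in range(L):
        s = sum(x[i] * np.cos(np.pi * (i + 0.5) * k / L) for i in range(L))
        X[k] = s * np.sqrt((1.0 if k == 0 else 2.0) / L)
    return X

def transform_code_sequence(sequence, L):
    """Segment a (T, n) FAP sequence into length-L blocks and transform code
    the n per-parameter time vectors of each block (claim 6, steps 2-3)."""
    T, n = sequence.shape
    coded = []
    for start in range(0, T - T % L, L):
        block = sequence[start:start + L]                 # (L, n)
        # Column j is one length-L parameter vector; the DCT concentrates its
        # slowly varying (temporally correlated) energy in few coefficients.
        coded.append(np.stack([dct_ii(block[:, j]) for j in range(n)], axis=1))
    return coded
```

Since a talking-head parameter changes slowly from frame to frame, each length-L parameter vector is smooth in time and its DCT spectrum is dominated by the low-order coefficients.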
11. A method of coding facial animation parameters (FAPs) for synthetic video, comprising:
- generating a temporal sequence of FAP frames for a synthetic video signal, each said frame representing a time sample of parameters in an n-dimensional space;
- segmenting the temporal sequence into length L segments of FAP frames to define n length L parameter vectors that exhibit significant inter-frame temporal correlation;
- using a discrete cosine transform (DCT) to transform each said parameter vector into L transform coefficients to reduce the inter-frame temporal correlation, thereby achieving a measure of coding gain, wherein the first transform coefficient is a DC coefficient and the remaining transform coefficients are AC coefficients;
- for each said parameter vector, using a one-step unweighted predictive code to code and quantize the DC coefficient from block to block;
- quantizing the AC coefficients in the current block;
- run-length coding the zero-valued AC coefficients;
- Huffman coding the quantized DC coefficients, the non-zero quantized AC coefficients and the run-length codes; and
- multiplexing the entropy-coded DC and AC coefficients and run-length codes into a bitstream.

View Dependent Claims: 12, 13
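The coefficient-coding steps above (one-step DC prediction, AC quantization, and run-length coding of zero-valued ACs) can be sketched for a single parameter vector; the quantizer step sizes and function name are hypothetical, and the final Huffman and multiplexing stages are deliberately omitted:

```python
import numpy as np

def code_block(coeffs, prev_dc, q_dc=1.0, q_ac=1.0):
    """Code one parameter vector's DCT coefficients (claim 11 sketch).

    coeffs: length-L DCT coefficients, coeffs[0] being the DC term.
    prev_dc: reconstructed DC of the previous block (one-step predictor).
    q_dc, q_ac: hypothetical uniform quantizer step sizes.
    Returns (dc_residual_index, ac_symbols, new_prev_dc).
    """
    # One-step unweighted prediction: transmit the quantized DC difference.
    dc_idx = int(round((coeffs[0] - prev_dc) / q_dc))
    rec_dc = prev_dc + dc_idx * q_dc
    # Quantize the ACs, then run-length code zero runs as (run, level) pairs.
    ac = [int(round(c / q_ac)) for c in coeffs[1:]]
    symbols, run = [], 0
    for level in ac:
        if level == 0:
            run += 1
        else:
            symbols.append((run, level))
            run = 0
    if run:
        symbols.append((run, 0))  # trailing zero run (end-of-block marker)
    # dc_idx and the (run, level) symbols would then be Huffman coded and
    # multiplexed into the bitstream (entropy stage not shown here).
    return dc_idx, symbols, rec_dc
```

Predicting DC from the previous block's reconstructed (not original) DC keeps encoder and decoder in lockstep, the usual DPCM design choice.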
14. A method of coding facial animation parameters (FAPs) for synthetic video, comprising:
- generating a temporal sequence of FAP frames for a synthetic video signal, each said frame representing a time sample of spatially correlated parameters in an n-dimensional space;
- transforming each said FAP frame from the n-dimensional space into an m-dimensional subspace, where m < n, to reduce the intra-frame spatial correlation and achieve a measure of coding gain;
- segmenting the temporal sequence into length L blocks of FAP frames to define m length L parameter vectors;
- transform coding each said parameter vector into L transform coefficients to reduce the inter-frame temporal correlation and achieve additional coding gain; and
- coding the transform coefficients into a bitstream.

View Dependent Claims: 15, 16, 17
18. A method of coding facial animation parameters (FAPs) for transmitting synthetic video over a band limited channel, comprising:
- generating a plurality of talking head FAP sequences for a single video signal, each said FAP frame representing a time sample of spatially correlated parameters in an n-dimensional space, the uncoded bandwidth of each said sequence being less than the bandwidth of said channel and the total uncoded bandwidth of said plurality of sequences being greater than the channel bandwidth;
- transforming each said FAP frame for each said sequence from the n-dimensional space into an m-dimensional subspace, where m < n, to reduce the intra-frame spatial correlation and achieve a measure of coding gain;
- segmenting each said temporal sequence into length L blocks of FAP frames to define m length L parameter vectors;
- transform coding each said parameter vector into L transform coefficients to reduce the inter-frame temporal correlation and achieve additional coding gain; and
- coding the transform coefficients for all said sequences into a video bitstream having a total coded bandwidth that is less than the channel bandwidth.

View Dependent Claims: 19, 20, 21, 22
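The bandwidth inequalities of claim 18 (each uncoded sequence fits the channel, their sum does not, and the coded total does again) can be made concrete with a small worked example; every number below is hypothetical and not taken from the patent:

```python
# Hypothetical numbers: 68 FAPs per frame, 16 bits per uncoded parameter,
# 30 frames per second, and a 64 kbit/s band limited channel.
uncoded_bps = 68 * 16 * 30     # 32,640 bit/s per talking head, uncoded
channel_bps = 64_000           # channel capacity
heads = 3

# Each head alone fits the channel, but three uncoded heads do not...
assert uncoded_bps < channel_bps
assert heads * uncoded_bps > channel_bps

# ...so a combined spatial + temporal coding gain (assumed 10x here) is
# what lets all three coded sequences share the channel with room to spare.
coding_gain = 10
coded_total_bps = heads * uncoded_bps / coding_gain
assert coded_total_bps < channel_bps
```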
Specification