Method of and apparatus for animation, driven by an audio signal, of a synthesized model of a human face
Abstract
A method and an apparatus for the animation, driven by an audio signal, of a synthesized human face model are described that allow the animation of any model complying with the ISO/IEC standard 14496 (“MPEG-4 standard”). The concerned phonemes are derived from the audio signal, and the corresponding visemes are identified within a set comprising both visemes defined by the standard and visemes typical of the language. Visemes are split into macroparameters that define the shape and positions of the mouth and jaw of the model and that are associated with values indicating a difference from a neutral position. Such macroparameters are then transformed into facial animation parameters complying with the standard, the values of which define the deformation to be applied to the model in order to achieve animation.
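The pipeline the abstract describes (phonemes → visemes → macroparameters → FAPs) can be sketched in code. This is an illustrative sketch, not the patented implementation: the viseme names, intensity values, and macroparameter-to-FAP weights below are hypothetical placeholders; only the FAP names are drawn from the MPEG-4 (ISO/IEC 14496) facial animation parameter set.

```python
# Illustrative sketch of the patent's pipeline; all tables and numeric
# values are hypothetical placeholders, not taken from the patent.

# Step 1: phoneme -> viseme (mixing MPEG-4 standard visemes with
# language-specific visemes, e.g. for stressed vowels).
PHONEME_TO_VISEME = {
    "p": "viseme_pp",            # bilabial closure (standard viseme)
    "a": "viseme_aa",            # open vowel (standard viseme)
    "'a": "viseme_aa_stressed",  # stressed vowel (language-specific viseme)
}

# Step 2: viseme -> macroparameters describing lip/jaw geometry.
# LOH = vertical lip distance, JY = jaw opening, LOW = mouth width,
# LP = lip protrusion; intensities are displacements from neutral.
VISEME_TO_MACRO = {
    "viseme_pp": {"LOH": 0.0, "JY": 0.1, "LOW": 0.5, "LP": 0.2},
    "viseme_aa": {"LOH": 0.9, "JY": 0.8, "LOW": 0.4, "LP": 0.0},
    "viseme_aa_stressed": {"LOH": 1.0, "JY": 0.9, "LOW": 0.4, "LP": 0.0},
}

# Step 3: macroparameter -> weighted standard FAPs acting on the mouth;
# FAP names exist in MPEG-4, but the weights here are invented.
MACRO_TO_FAPS = {
    "LOH": [("open_jaw", 0.5), ("lower_t_midlip", 0.5)],
    "JY": [("open_jaw", 1.0)],
    "LOW": [("stretch_l_cornerlip", 0.5), ("stretch_r_cornerlip", 0.5)],
    "LP": [("push_b_lip", 1.0)],
}

def phonemes_to_faps(phonemes):
    """Convert a phoneme stream into per-phoneme FAP intensity frames."""
    frames = []
    for ph in phonemes:
        macros = VISEME_TO_MACRO[PHONEME_TO_VISEME[ph]]
        faps = {}
        for macro, intensity in macros.items():
            for fap, weight in MACRO_TO_FAPS[macro]:
                # FAP intensity depends on the macroparameter intensity.
                faps[fap] = faps.get(fap, 0.0) + weight * intensity
        frames.append(faps)
    return frames
```

The resulting frames would then be applied to any MPEG-4-compliant face model to produce the animation.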
6 Claims
1. A method for the animation, driven by an audio signal, of a synthesized model of a human face, wherein a driving signal is converted into phonetic information readable by a machine and such phonetic information comprising individual phonetic information items is directly transformed into a predetermined group of parameters representative of elementary deformations to be directly applied to such a model through the following sequence of operations:
(a) sequentially and directly associating said individual phonetic information items, one by one, with respective information items in the form of visemes representative of a corresponding position of a mouth of the model, the visemes being chosen from within a set that includes visemes independent of a language of the driving audio signal and visemes specific for such a language;
(b) splitting each viseme into a plurality of macroparameters that characterize shapes and positions of the lip region and of the jaw in the model, and associating said plurality of macroparameters of a given viseme with intensity values representative of displacements from a neutral position and chosen within an interval determined in an initialization phase so as to ensure a good naturalness of the animated model; and
(c) splitting said plurality of macroparameters into said predetermined group of parameters representative of deformations to be applied to the model, said predetermined group of parameters being chosen within a group of standard facial animation parameters relating to the mouth movements, each of said parameters being associated with intensity values which depend on the intensity values of the macroparameters and being chosen within an interval designed to guarantee the naturalness of the animated model, said group of visemes independent of the language and said group of standard facial animation parameters being the visemes and the facial animation parameters respectively defined by an ISO/IEC standard.
2. The method according to claim 1 wherein said macroparameters are:
a vertical distance between lips, LOH;
a jaw opening, JY;
a mouth width, LOW; and
a lip protrusion, LP;
the intensities of the macroparameters for the standard visemes being within the following intervals:
3. The method according to claim 2 wherein said visemes specific for the language are visemes associated with phonetic information relating to stressed vowels and the intensities of the macroparameters for the language-specific visemes are chosen within the following intervals:
4. The method according to claim 3 wherein for splitting the macroparameters the following facial animation parameters (FAP) are used:
5. The method according to claim 4 wherein the facial animation parameters are associated with the following intensity values:
6. An apparatus for the animation, driven by an audio signal, of a synthesized model of a human face, including:
means (SY) for generating phonetic information comprising streams of individual phonetic information items representative of the driving audio signal, readable by a machine;
means (CFP) for sequentially converting said streams of individual phonetic information items into a predetermined group of parameters representative of elementary deformations to be directly applied to said model, said conversion means (CFP) being arranged for:
sequentially and directly associating said individual phonetic information items, one by one, with respective information items in the form of visemes representative of a corresponding mouth position in the synthesized model, the visemes being read from a memory containing visemes independent of the language of the driving audio signal and visemes specific for such a language;
splitting each viseme into a plurality of macroparameters that characterize mouth shape and positions of lips and jaw in the model;
associating said plurality of macroparameters of a given viseme with intensity values representative of displacements from a neutral position and chosen within a given interval in an initialization phase, so as to guarantee a good naturalness of the animated model; and
splitting said plurality of macroparameters into said predetermined group of parameters representative of deformations to be applied to such a model, said predetermined group of parameters being chosen within a group of standard facial animation parameters relating to mouth movements, each of said parameters being associated with intensity values which depend on the intensity values of the macroparameters and chosen within an interval so designed as to guarantee the naturalness of the animated model; and
means (AF) for directly applying the parameters to the model, under the control of the means for generating the phonetic information, said group of visemes independent of the language and said group of standard facial animation parameters being the visemes and facial animation parameters, respectively, defined by the ISO/IEC standard 14496.
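A minimal sketch of how the three means named in claim 6 (SY, CFP, AF) could be wired together; the function names, lookup tables, and numeric values below are illustrative assumptions, not taken from the patent.

```python
# Hypothetical wiring of the apparatus stages of claim 6 (SY, CFP, AF);
# all names, tables, and values here are illustrative placeholders.

PHONEME_TO_VISEME = {"p": "viseme_pp"}
VISEME_TO_MACRO = {"viseme_pp": {"JY": 0.1, "LP": 0.2}}  # jaw opening, lip protrusion
MACRO_TO_FAPS = {"JY": [("open_jaw", 1.0)], "LP": [("push_b_lip", 1.0)]}

def sy(audio):
    """SY: phonetic-information generator.
    Stubbed here; a real system would run phonetic recognition on the audio."""
    return ["p"]  # placeholder phoneme stream

def cfp(phonemes):
    """CFP: converts the phoneme stream into FAP intensity frames
    via visemes and macroparameters."""
    frames = []
    for ph in phonemes:
        faps = {}
        for macro, val in VISEME_TO_MACRO[PHONEME_TO_VISEME[ph]].items():
            for fap, w in MACRO_TO_FAPS[macro]:
                faps[fap] = faps.get(fap, 0.0) + w * val
        frames.append(faps)
    return frames

class AF:
    """AF: applies FAP frames directly to the model (here it only records them)."""
    def __init__(self):
        self.applied = []
    def apply(self, frames):
        self.applied.extend(frames)

af = AF()
af.apply(cfp(sy(None)))
```

The same SY → CFP → AF chain works for any MPEG-4-compliant model, since AF consumes only standard FAP intensities.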
Specification