Device and method for prosody generation at visual synthesis

US 6,389,396 B1
Filed: 11/29/1999
Issued: 05/14/2002
Est. Priority Date: 03/25/1997
Status: Expired due to Fees

First Claim

1. A device for prosody generation and visual synthesis, comprising:

capturing means for capturing sounds and face movement patterns of a physiognomy of a first face during a speech, wherein the face movement patterns include a position and displacement of selected points on the first face;

storing means for storing the captured sounds and face movement patterns;

reproducing means for reproducing the stored sounds and face movements patterns of the first face on a second face; and

amplifying means for amplifying the face movement patterns reproduced on the second face, based on stresses of the speech of the first face.

Abstract

A device for prosody generation in visual synthesis. A number of half-syllables are stored together with registered movement patterns of a face. When synthesizing speech, half-syllables are put together into words and sentences. The words and sentences are given a stress and an intonation pattern corresponding to the intended language. A number of points on the face, and their movement patterns, are also registered. As words and sentences are generated, the movement patterns of the different points are amplified depending on the given stress and sentence intonation. The resulting movement patterns are then applied to a model, which is in turn applied to a real face to obtain a life-like animation, for instance when translating a person's speech from a first language to a second language.
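
The flow described in the abstract (stored half-syllables with movement patterns, concatenated and amplified by stress) might be sketched as follows. This is only an illustrative sketch, not the patented implementation: the `HalfSyllable` type, the `synthesize` function, the 2D displacement representation, and the use of a single multiplicative gain per unit are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical representation: displacement of one facial point from neutral.
Displacement = Tuple[float, float]


@dataclass
class HalfSyllable:
    """A stored unit: a name plus per-frame displacements of the tracked points."""
    name: str
    frames: List[List[Displacement]]  # frames[t][p] = displacement of point p at frame t


def synthesize(units: List[HalfSyllable], stress_gains: List[float]) -> List[List[Displacement]]:
    """Concatenate units in order, scaling each unit's displacements by its stress gain."""
    if len(units) != len(stress_gains):
        raise ValueError("one stress gain is required per half-syllable")
    track: List[List[Displacement]] = []
    for unit, gain in zip(units, stress_gains):
        for frame in unit.frames:
            # Amplify every point's displacement by the (assumed) scalar gain.
            track.append([(dx * gain, dy * gain) for dx, dy in frame])
    return track


# Two made-up units, each tracking a single point.
ka = HalfSyllable("ka", [[(1.0, 0.0)], [(2.0, 1.0)]])
at = HalfSyllable("at", [[(0.5, 0.5)]])
track = synthesize([ka, at], [2.0, 1.0])  # first unit stressed, second neutral
```

A gain of 1.0 leaves the neutral recording unchanged, so unstressed units pass through as stored.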

15 Claims

1. A device for prosody generation and visual synthesis, comprising:
- capturing means for capturing sounds and face movement patterns of a physiognomy of a first face during a speech, wherein the face movement patterns include a position and displacement of selected points on the first face;
  
  storing means for storing the captured sounds and face movement patterns;
  
  reproducing means for reproducing the stored sounds and face movements patterns of the first face on a second face; and
  
  amplifying means for amplifying the face movement patterns reproduced on the second face, based on stresses of the speech of the first face.
  - 2. A device according to claim 1, wherein the reproducing means comprises a mechanism configured to produce a physiognomy of the second face by concatenating a number of the recorded face movement patterns, wherein each face movement pattern corresponds to a neutral pronunciation of a half-syllable.
  - 3. A device according to claim 2, wherein the sounds and face movement patterns corresponding to half-syllables are stored in association in the storing means.
  - 4. A device according to claim 3, wherein the amplification means allots a maximum amplification to a vowel in the middle of the half-syllable and a minimum amplification to the ends of the half-syllable.
  - 5. A device according to claim 1 or 2, wherein the reproducing means, based on the amplifying means, is configured to output sounds and face movement patterns on the second face, reproducing the stresses of the speech of the first face.
  - 6. A device according to claim 1 or 2, comprising translation means for translating from a speech in a first language to a speech in a second language.
  - 7. A device according to claim 6, wherein the capturing means captures stresses of the speech in the first language and the reproducing means, based on an input from the translation means, is configured to reproduce the face movement patterns and the stresses of the first face in the speech of the second language on the second face.
  - 8. A device according to claim 1 or 2, wherein the reproducing means is configured to apply the sounds and face movement patterns of the first face to the second face, so that a three dimensional animation of the second face is produced.
  - 9. A device according to claim 8, wherein polygons instead of points are selected on the second face for applying the face movement patterns of the first face.
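
Claim 4 describes an amplification that peaks at the vowel in the middle of a half-syllable and is smallest at its ends. One way such a profile could look is sketched below. The claim does not specify a curve shape, so the raised-cosine form, the function name `amplification_envelope`, the default gain values, and the assumption that the vowel sits at the center frame are all hypothetical choices for illustration.

```python
import math
from typing import List


def amplification_envelope(n_frames: int, min_gain: float = 1.0, max_gain: float = 1.5) -> List[float]:
    """Return one gain per frame: min_gain at both ends, max_gain at the center frame."""
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    if n_frames == 1:
        # With a single frame there are no distinct ends; treat it as the peak.
        return [max_gain]
    gains = []
    for i in range(n_frames):
        # Raised-cosine window: 0 at the first and last frame, 1 in the middle.
        w = 0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n_frames - 1)))
        gains.append(min_gain + (max_gain - min_gain) * w)
    return gains


gains = amplification_envelope(5)
```

Multiplying each frame's point displacements by the corresponding gain would then emphasize the vowel while leaving the unit boundaries close to the neutral recording, which keeps concatenation joins smooth.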

10. A method for prosody generation and visual synthesis using selected polygons on a second face, comprising:
- capturing sounds and face movement patterns corresponding to polyphones of a first face;
  
  recording speaking stresses of polyphones;
  
  amplifying the captured face movement patterns based on the recorded stresses of the polyphones;
  
  selecting points in the selected polygons on the second face;
  
  reproducing captured sounds and amplified face movement patterns of the first face onto the second face, wherein the points in the selected polygons are allocated a weighting which is influenced by the speaking stresses of the polyphones; and
  
  animating the second face by a movement of selected polygons according to the captured face movement patterns of the first face and reproducing the captured sounds so that a three-dimensional picture is created on the second face.
  - 11. A method according to claim 10, wherein the weighting of the points in the selected polygons of the second face changes a displacement of the points of the second face.
  - 12. A method according to claim 10 or 11, comprising:
  - 13. A method according to claim 12, comprising recording the face movement patterns of a group of persons.
  - 14. A method according to claim 13, wherein the recording has a group of persons including men, women, and children.
  - 15. A method according to claim 10 or 11, further comprising producing sounds for polyphones from a text.
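
Claims 10 and 11 state that points in the selected polygons receive a weighting influenced by the speaking stress, and that this weighting changes the displacement of those points. A minimal sketch of that idea follows. The claims give no formula, so the linear rule `1 + sensitivity * stress`, the function name `weighted_positions`, and the 3D tuple representation are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def weighted_positions(
    neutral: List[Vec3],
    displacement: List[Vec3],
    polygon: Sequence[int],
    stress: float,
    sensitivity: float = 0.5,
) -> List[Vec3]:
    """Move only the points of one polygon, scaling their displacement by a stress-based weight."""
    # Assumed rule: no stress gives weight 1.0 (the captured displacement, unchanged).
    weight = 1.0 + sensitivity * stress
    out = list(neutral)
    for idx in polygon:
        x, y, z = neutral[idx]
        dx, dy, dz = displacement[idx]
        out[idx] = (x + weight * dx, y + weight * dy, z + weight * dz)
    return out


# Four points on a hypothetical second face; only the triangle (0, 1, 2) is selected.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
captured = [(0.0, -1.0, 0.0), (0.0, -2.0, 0.0), (0.0, 0.0, 1.0), (9.0, 9.0, 9.0)]
moved = weighted_positions(neutral, captured, polygon=(0, 1, 2), stress=1.0)
```

Points outside the selected polygon keep their neutral positions, so the weighting acts only on the selected region of the second face.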

Current Assignee
Hanger Solutions, LLC (IP Investments Group LLC)
Original Assignee
Telia AB (Government of Sweden)
Inventors
Lyberg, Bertil
Primary Examiner(s)
Dorvil, Richemond

Application Number
US09/381,632
Time in Patent Office
897 Days
Field of Search
704/270, 704/272, 704/276, 704/277, 704/278, 704/260, 704/258, 704/254, 704/257
US Class Current
704/258
CPC Class Codes
G06T 13/205   driven by audio data
G06T 13/40   of characters, e.g. humans,...
G10L 13/04   Details of speech synthesis...
G10L 13/07   Concatenation rules
G10L 2021/105   Synthesis of the lips movem...
