Advanced tools for speech synchronized animation

US 5,689,618 A
Filed: 05/31/1995
Issued: 11/18/1997
Est. Priority Date: 02/19/1991
Status: Expired due to Term

Abstract

A random access animation user interface environment, referred to as interFACE, enables a user to create and control animated lip-synchronized images or objects on a personal computer for use in the user's programs and products. A real-time random-access interface driver (RAVE), together with a descriptive authoring language (RAVEL), is used to provide synthesized actors ("synactors"). The synactors may represent real or imaginary persons or animated characters, objects or scenes. The synactors may be created and programmed to perform actions, including speech, which are not sequentially pre-stored records of previously enacted events. Furthermore, animation and sound synchronization may be produced automatically and in real time. Sounds and visual images of a real or imaginary person or animated character associated with those sounds are input to a system and may be decomposed into constituent parts to produce fragmentary images and sounds. A set of characteristics is utilized to define a digital model of the motions and sounds of a particular synactor. A general purpose system is provided for random access and display of synactor images on a frame-by-frame basis, organized and synchronized with sound. Both synthetic speech and digitized recordings may provide the speech for synactors.
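The frame-by-frame random access described in the abstract can be illustrated with a minimal sketch. It is not the RAVE driver's actual method, which this record does not disclose. It assumes each phoneme has a known duration and a single associated image index; the names `build_timeline`, `image_at`, and `image_for` are hypothetical.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple


def build_timeline(events: List[Tuple[str, float]]) -> Tuple[List[float], List[str]]:
    """Turn (phoneme, duration_seconds) pairs into start times and phonemes."""
    starts: List[float] = []
    phonemes: List[str] = []
    t = 0.0
    for phoneme, duration in events:
        starts.append(t)
        phonemes.append(phoneme)
        t += duration
    return starts, phonemes


def image_at(time_s: float, starts: List[float], phonemes: List[str],
             image_for: Dict[str, int]) -> int:
    """Random-access lookup: which image index is on screen at time_s."""
    # Find the last phoneme whose start time is <= time_s.
    i = max(bisect_right(starts, time_s) - 1, 0)
    return image_for[phonemes[i]]


# Hypothetical data: three phonemes and the image index assumed for each.
image_for = {"M": 0, "AA": 1, "REST": 2}
events = [("M", 0.25), ("AA", 0.5), ("REST", 0.25)]
starts, phonemes = build_timeline(events)
frame = image_at(0.3, starts, phonemes, image_for)  # falls inside "AA"
```

Because the lookup is a binary search over start times, any point in the sound track can be queried directly, without playing the preceding frames in sequence.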

6 Claims

1. For a programmed computer having a memory and a real-time random access animation and vivification engine driver, an apparatus for providing a voice sound for a synactor for sound-animation synchronization, the apparatus comprising:
- a voice synthesizer coupled to the computer for producing synthesizer phonemes, the voice synthesizer including:
  - first means for receiving speech samples derived from input audio data and for providing a sample speech signal representing the speech samples;
  - first segmentation means coupled to the first means for extracting from the sample speech signal the speech samples in accordance with a predetermined speech segmentation plan, the first segmentation means for providing constituent speech segments;
  - second means for receiving speech text and for providing a speech text signal representing the speech text;
  - second segmentation means coupled to the second means for segmenting the speech text signal to provide constituent text segments in accordance with the predetermined speech segmentation plan;
  - encoding means for encoding the constituent speech segments to provide encoded constituent speech segments; and
  - combining means for combining the encoded constituent speech segments to provide a speech signal representative of animated speech corresponding to the speech text where each of the constituent speech segments corresponds to at least one of the constituent text segments;
- means for creating a voice reconciliation phoneme table, the voice reconciliation phoneme table including the synthesizer phonemes;
- means for providing a synactor model phoneme table, the synactor model phoneme table including synactor phonemes for the voice sound of the synactor;
- means for determining which of the synthesizer phonemes are unrecognized as compared to the synactor model phoneme table;
- means for finding substitute phonemes from the synactor model phoneme table for the unrecognized phonemes in the voice reconciliation phoneme table;
- means for creating a generic phoneme table, the generic phoneme table including recognized synthesizer phonemes from the voice reconciliation phoneme table and the substitute phonemes; and
- means for using the generic phoneme table, the voice reconciliation phoneme table and the synactor model phoneme table to provide a runtime reconciled phocode table for using the voice synthesizer to provide voice sound for the synactor without modifying the synactor model phoneme table.

2. The apparatus of claim 1 further comprising:
- input means coupled to the first and second means for providing the input audio data and the speech text;
- storage means coupled for storing the encoded constituent speech segments; and
- at least one predefined voice file for storing the encoded constituent segments in the storage means, the predefined voice file including a language library, recording library and a voice library.

3. The apparatus of claim 1 wherein the first and second means are coupled to input means for providing the input audio data and the speech text to the first and second means.

4. The apparatus of claim 3 further comprising storage means for storing the constituent speech segments.

5. The apparatus of claim 4 further comprising a predefined voice file for storing the constituent speech segments with the storage means.

6. The apparatus of claim 5 wherein the speech samples are input to the first means in a selected voice, the predefined voice file being identified as the speech samples for the selected voice of a selected person.
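The phoneme-table reconciliation recited in claim 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the claim does not say how substitute phonemes are chosen, so the sketch assumes a caller-supplied `fallback` map and a `default_phoneme`. All function, parameter, and phoneme names here are hypothetical.

```python
from typing import Dict, Iterable


def reconcile_phonemes(
    synthesizer_phonemes: Iterable[str],
    synactor_table: Dict[str, int],
    fallback: Dict[str, str],
    default_phoneme: str,
) -> Dict[str, int]:
    """Map every synthesizer phoneme to a phocode of the synactor model."""
    if default_phoneme not in synactor_table:
        raise ValueError("default_phoneme must exist in the synactor table")

    # Voice reconciliation table: the phonemes the synthesizer can emit.
    voice_table = list(dict.fromkeys(synthesizer_phonemes))

    # Synthesizer phonemes the synactor model does not recognize.
    unrecognized = [p for p in voice_table if p not in synactor_table]

    # Generic table: recognized phonemes map to themselves; unrecognized
    # ones map to a substitute taken from the synactor table (assumption:
    # the substitute comes from `fallback`, else `default_phoneme`).
    generic = {p: p for p in voice_table if p in synactor_table}
    for p in unrecognized:
        candidate = fallback.get(p, default_phoneme)
        generic[p] = candidate if candidate in synactor_table else default_phoneme

    # Runtime reconciled table. The synactor table is only read here,
    # so it is left unmodified.
    return {p: synactor_table[generic[p]] for p in voice_table}


# Hypothetical data for illustration only.
synactor_table = {"REST": 0, "AA": 1, "M": 2, "S": 3}
runtime_table = reconcile_phonemes(
    synthesizer_phonemes=["AA", "M", "Z", "Q"],
    synactor_table=synactor_table,
    fallback={"Z": "S"},
    default_phoneme="REST",
)
```

In this sketch `Z` is unrecognized and borrows the phocode of `S`, while `Q` has no listed substitute and falls back to `REST`. The synactor model table is never written to, which mirrors the claim's "without modifying the synactor model phoneme table" limitation.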

Current Assignee: Activision Publishing Incorporated (Microsoft Corporation)
Original Assignee: Brightmail, Inc. (Gen Digital Inc.)
Inventors: Gasper, Elon; Wesley, Richard
Primary Examiner(s): Tung, Kee M.
Application Number: US08/457,023
Time in Patent Office: 902 Days
Field of Search: 395/152, 395/154, 395/2.69, 395/2.85, 395/2, 395/2.1, 395/2.79, 364/410, 434/307 R
US Class Current: 704/276

CPC Class Codes
G06T 13/205   driven by audio data
G10L 13/00   Speech synthesis; Text to s...
G10L 2021/105   Synthesis of the lips movem...
G11B 27/034   on discs G11B27/036, G11B27...
G11B 27/10   Indexing; Addressing; Timin...
G11B 27/34   Indicating arrangements in...
Y10S 345/956   Language driven animation
Y10S 345/957   Actor
