
Speech animation and inflection system

  • US 5,278,943 A
  • Filed: 05/08/1992
  • Issued: 01/11/1994
  • Est. Priority Date: 03/23/1990
  • Status: Expired due to Term
First Claim

1. Apparatus for speech animation of desired text, comprising:

    first input means for receiving speech samples derived from input audio data and for providing a sample speech signal representing said speech samples, said input speech samples being in the voice of a selected person;

    first segmentation means coupled to said input means for extracting constituent speech segments in accordance with a predetermined speech segmentation plan from said sample speech signal;

    encoding means for digitally encoding said constituent speech segments;

    second input means for receiving and encoding desired speech text;

    second segmentation means, coupled to said second input means and responsive to desired speech text for segmenting said desired speech text into a plurality of constituent text segments in accordance with said predetermined segmentation plan;

    combining means for combining a plurality of said encoded constituent speech segments for providing a digital speech signal representative of desired animated speech corresponding to said desired speech text, said digital speech signal being representative of desired animated speech in the voice of said selected person, each of said plurality of encoded constituent speech segments corresponding to at least one of said plurality of constituent text segments; and

    storage means for storing said digitally encoded constituent speech segments in at least one predefined voice reference file, said predefined voice reference file comprises a language library for storing predefined sets of language rules associated with a selected language, a recording library for storing recorded speech sequences in said selected language for said selected person, a voice library for storing said encoded constituent speech segments in said selected language for said selected person, whereby a separate predefined voice reference file is defined and identified for each said selected person;

    one of said language libraries being defined for each of a plurality of selectable languages, each said language library being accessed by each said voice reference file associated with a selected language, each said language file including:

    a set of language segmentation rules defined for said selected language;

    a set of prosody rules defined in accordance with said language segmentation rules for said selected language;

    a set of text segmentation rules defined in accordance with said language segmentation rules for said selected language; and

    a set of resynthesis configuration parameters for configuring said combining means for said selected language.
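
As an illustration only, the storage layout and combining step recited in the claim can be sketched in Python. This is not the patented implementation: the class names, field names, and the `synthesize` method are hypothetical, the segmentation plan is reduced to whitespace-separated words, the separate language and text segmentation rule sets are collapsed into one callable, and "combining" is plain byte concatenation with no prosody or resynthesis processing.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class LanguageLibrary:
    """One per selectable language (hypothetical layout)."""
    language: str
    # Stand-in for the segmentation rules: splits desired text into
    # constituent text segments.
    segment_text: Callable[[str], List[str]]
    # Prosody rules and resynthesis parameters, kept as opaque
    # placeholders; this sketch never applies them.
    prosody_rules: Dict[str, float] = field(default_factory=dict)
    resynthesis_params: Dict[str, float] = field(default_factory=dict)


@dataclass
class VoiceReferenceFile:
    """One per selected person; points at a shared language library."""
    speaker: str
    language_library: LanguageLibrary
    # Recorded speech sequences, keyed by an assumed label.
    recording_library: Dict[str, bytes] = field(default_factory=dict)
    # Encoded constituent speech segments, keyed by text segment.
    voice_library: Dict[str, bytes] = field(default_factory=dict)

    def synthesize(self, text: str) -> bytes:
        """Segment the text, then join the matching encoded segments."""
        segments = self.language_library.segment_text(text)
        missing = [s for s in segments if s not in self.voice_library]
        if missing:
            raise KeyError(f"no encoded speech segment for: {missing}")
        return b"".join(self.voice_library[s] for s in segments)


# Toy data: two "encoded segments" for one speaker.
english = LanguageLibrary(
    language="en",
    segment_text=lambda t: t.lower().split(),
)
voice = VoiceReferenceFile(
    speaker="speaker_a",
    language_library=english,
    voice_library={"hello": b"\x01\x02", "world": b"\x03"},
)
signal = voice.synthesize("Hello world")
```

Keeping the language library as a separate object that each voice reference file points to mirrors the claim's arrangement of one language library per language, accessed by every voice reference file for that language.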
