Speech synthesis system by rule using phonemes as synthesis units

US 4,896,359 A
Filed: 05/17/1988
Issued: 01/23/1990
Est. Priority Date: 05/18/1987
Status: Expired due to Fees

Abstract

A speech synthesizer that synthesizes speech by actuating a voice source and a filter that processes the output of the voice source. The source and filter are driven in each successive short interval of time by speech parameters derived from feature vectors, which include formant frequencies, formant bandwidth, speech rate and so on. Each feature vector, or speech parameter, is defined by two target points (r₁, r₂), a value at each target point, and a connection curve between the target points. The speech rate is defined by a speech rate curve that specifies elongation or shortening of the speech rate through a start point (d₁) of elongation (or shortening), an end point (d₂), and an elongation ratio between d₁ and d₂. The ratios between the relative time of each speech parameter and absolute time are calculated in advance, according to the speech rate table, for each predetermined short interval.
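The relationship between relative time and absolute time described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: it assumes times are in milliseconds, that the rate curve is piecewise linear, and that a ratio greater than 1 means elongation of the span between d₁ and d₂. The function name `build_rate_table` and the `step` parameter are hypothetical.

```python
def build_rate_table(duration, d1, d2, ratio, step=5.0):
    """Return (absolute_time, relative_time) pairs for one phoneme.

    Assumption: outside [d1, d2] relative and absolute time advance
    together; inside it, the span is stretched (ratio > 1) or
    compressed (ratio < 1) by `ratio`.
    """
    if not (0 <= d1 <= d2 <= duration) or ratio <= 0 or step <= 0:
        raise ValueError("need 0 <= d1 <= d2 <= duration, ratio > 0, step > 0")

    stretched_end = d1 + (d2 - d1) * ratio       # absolute time at which d2 is reached
    total = stretched_end + (duration - d2)      # absolute length of the phoneme

    table = []
    k = 0
    while k * step <= total + 1e-9:              # one entry per short time increment
        t_abs = k * step
        if t_abs <= d1:
            rel = t_abs                          # before the adjusted span
        elif t_abs <= stretched_end:
            rel = d1 + (t_abs - d1) / ratio      # inside the adjusted span
        else:
            rel = d2 + (t_abs - stretched_end)   # after the adjusted span
        table.append((t_abs, rel))
        k += 1
    return table


# A 100 ms phoneme whose 20..60 ms span is elongated by a factor of 2.
table = build_rate_table(duration=100.0, d1=20.0, d2=60.0, ratio=2.0, step=20.0)
```

With these example values the phoneme lasts 140 ms in absolute time, and the table lets a later stage look up which relative time to evaluate at each 20 ms step.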

4 Claims

1. A speech synthesis system comprising:

code converter means (22) for accepting at an input terminal (21) text code comprising spelling, accent code and intonation code of a word, and producing therefrom a phonetic symbol for pronunciation (phoneme of speech) including a text string and a prosodic string for each phoneme of speech;

a feature vector table (24) including means for storing feature vector information comprising speech parameters for each phoneme, including a time duration period, pitch frequency pattern, formant frequency, formant bandwidth, strength of a voice source, and speech rate, wherein each of said speech parameters is defined by two target points (r₁ and r₂) during said time duration period, a value at each of the target points, and a connection curve between said two target point values, and wherein said speech rate is defined for each phoneme by parameters of a speech rate adjustment curve including a start point (d₁), an end point (d₂) and a ratio of adjustment, stored in said feature vector table (24);

feature vector selection means (23) for selecting an address of said feature vector table (24) in accordance with each phonetic symbol input thereto from said code converter means (22);

a speech rate table generator means (25) for calculating, in response to speech rate parameters stored in said address selected from said feature vector table (24) by said selection means (23), a relationship between relative time which defines a speech parameter and absolute time, according to said speech rate adjustment curve;

a speech rate table (26) for storing the output of said speech rate table generator means (25) for successive short increments of time defined by said generator means (25);

speech synthesizing parameter calculation means (27) for calculating, from feature vector information stored in said feature vector table (24) and speech rate information stored in said speech rate table (26), an instant value of a speech parameter at each increment of time defined in said speech rate table (26);

speech synthesizer means (28) including voice sources and filters for generating a synthesized voice output by actuating voice source and filter combinations according to said speech parameter values calculated by said speech synthesizer parameter calculation means (27); and

an output terminal (29) coupled with an output of said speech synthesizer means (28) for providing said synthesized speech.

2. A speech synthesis system according to claim 1, wherein said connection curve between said two target point values is linear.

3. A speech synthesis system according to claim 1, wherein target points (r₁, r₂) of a speech parameter differ from target points of other speech parameters in a phoneme.

4. A speech synthesis system according to claim 1, wherein said start point (d₁) and end point (d₂) differ from target points (r₁, r₂) of each speech parameter.
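
The parameter calculation recited in claims 1 to 3 can be illustrated with a short Python sketch. It is a hedged example, not the patent's implementation: the `Track` class, the `frame_values` helper, and the numeric values are hypothetical, and holding the target value constant before r₁ and after r₂ is an assumption made here for simplicity. The connection curve is linear, as in claim 2, and each parameter carries its own target points, as in claim 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """One speech parameter of a phoneme: two target points and their values."""
    r1: float  # first target point (relative time)
    v1: float  # parameter value at r1
    r2: float  # second target point (relative time)
    v2: float  # parameter value at r2

    def value_at(self, rel):
        # Assumption: the value is held constant outside [r1, r2].
        if rel <= self.r1:
            return self.v1
        if rel >= self.r2:
            return self.v2
        # Linear connection curve between the two target values (claim 2).
        frac = (rel - self.r1) / (self.r2 - self.r1)
        return self.v1 + frac * (self.v2 - self.v1)


def frame_values(tracks, relative_times):
    """Evaluate every parameter at each relative time taken from a rate table."""
    return [
        {name: track.value_at(rel) for name, track in tracks.items()}
        for rel in relative_times
    ]


# Hypothetical phoneme: the formant and the pitch use different target points.
phoneme = {
    "f1_hz": Track(r1=10.0, v1=300.0, r2=50.0, v2=700.0),
    "pitch_hz": Track(r1=0.0, v1=120.0, r2=100.0, v2=100.0),
}

frames = frame_values(phoneme, [0.0, 30.0, 100.0])
```

Each entry of `frames` holds the instantaneous parameter values for one time increment, which is the kind of data a source-filter synthesizer stage would consume.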

Current Assignee: Kokusai Denshi Denwa Company Limited (KDDI Corporation)
Original Assignee: Kokusai Denshi Denwa Company Limited (KDDI Corporation)
Inventors: Higuchi, Norio; Yamamoto, Seiichi; Shimizu, Toru
Primary Examiner(s): Clark, David L.
Assistant Examiner(s): Merecki, John A.
Application Number: US07/196,169
Time in Patent Office: 616 Days
Field of Search: 381/51-53, 381/36-40, 364/53.5
US Class Current: 704/260
CPC Class Codes: G10L 13/07 (Concatenation rules)
