Speech synthesis method and speech synthesizer

US 7,251,601 B2
Filed: 03/21/2002
Issued: 07/31/2007
Est. Priority Date: 03/26/2001
Status: Expired due to Fees

Abstract

A speech synthesis method comprises selecting predetermined formant parameters from formant parameters according to a pitch pattern, phoneme duration, and phoneme symbol string, generating a plurality of sine waves based on the formant frequency and formant phase of the selected formant parameters, multiplying the sine waves by the windowing functions of the selected formant parameters, respectively, to generate a plurality of formant waveforms, adding the formant waveforms to generate a plurality of pitch waveforms, and superposing the pitch waveforms according to a pitch period to generate a speech signal.
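
The steps in the abstract can be written compactly as follows. This is only a notational sketch: the indices $k$ (formant number) and $m$ (pitch waveform number), the formant count $K$, and the pitch mark times $t_m$ are assumed symbols that do not appear in the record itself.

```latex
\[
y_{m,k}(t) = w_{m,k}(t)\,\sin\!\left(\omega_{m,k}\,t + \phi_{m,k}\right), \qquad
p_m(t) = \sum_{k=1}^{K} y_{m,k}(t), \qquad
s(t) = \sum_{m} p_m\!\left(t - t_m\right)
\]
```

Here $y_{m,k}$ is one formant waveform, $p_m$ is one pitch waveform, and $s$ is the speech signal obtained by superposing pitch waveforms spaced by the pitch period.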

17 Citations

20 Claims

1. A speech synthesis method comprising:
- storing a plurality of formant parameter groups each including a number of formant parameters in a storage in units of a synthesis unit, the formant parameters representing a formant frequency, a formant phase and a windowing function;
  
  selecting predetermined formant parameters from the formant parameters stored in the storage according to a phoneme symbol string;
  
  generating a plurality of sine waves based on formant frequencies and formant phases corresponding to the formant parameters selected;
  
  multiplying the sine waves by the windowing functions corresponding to the selected formant parameters, respectively, to generate a plurality of formant waveforms each having a characteristic of one formant;
  
  adding the formant waveforms to generate a pitch waveform having characteristics of a plurality of formants; and
  
  superposing pitch waveforms each corresponding to the pitch waveform according to a pitch period to generate a speech signal.
- - 2. A speech synthesis method as defined in claim 1, wherein the formant waveform y(t) is expressed by the following equation:
    - $y(t) = w(t)\,\sin(\omega t + \phi)$
      
      where the formant frequency is $\omega$, the formant phase is $\phi$, and the windowing function is $w(t)$.
  - 3. A speech synthesis method as defined in claim 1, which includes storing weighting factors in the storage and adding basis functions weighted by the weighting factors to generate the windowing functions.
  - 4. A speech synthesis method as defined in claim 1, which includes changing at least one of power of at least one of the formant waveforms, shape of at least one of the windowing functions, position of at least one of the windowing functions and at least one of the formant frequencies according to the pitch period.
  - 5. A speech synthesis method as defined in claim 4, wherein at least one of power of at least one of the formant waveforms, shape of at least one of the windowing functions, position of at least one of the windowing functions and at least one of the formant frequencies is changed every phoneme, every frame or every formant number.
  - 6. A speech synthesis method as defined in claim 1, which includes changing at least one of power of at least one of the formant waveforms, shape of at least one of the windowing functions, position of at least one of the windowing functions and at least one of the formant frequencies according to a kind of at least preceding phoneme or following phoneme.
  - 7. A speech synthesis method as defined in claim 1, which includes changing at least one of power of at least one of the formant waveforms, shape of at least one of the windowing functions, position of at least one of the windowing functions and at least one of the formant frequencies according to information of given voice variety.
  - 8. A speech synthesis method as defined in claim 1, which includes changing at least one of power of at least one of the formant waveforms, at least one of the formant frequencies, shape of at least one of the windowing functions, phase of at least one of the sine waves and position of at least one of the windowing functions according to at least one of power of at least one of the formant waveforms, at least one of the formant frequencies, shape of at least one of the windowing functions, phase of at least one of the sine waves and position of at least one of the windowing functions of a corresponding formant of at least a preceding pitch waveform or a following pitch waveform.
  - 9. A speech synthesis method as defined in claim 1, which includes changing at least one of power of at least one of the formant waveforms, at least one of the formant frequencies, shape of at least one of the windowing functions, phase of at least one of the sine waves and position of at least one of the windowing functions according to presence of a corresponding formant of at least a preceding pitch waveform or a following pitch waveform.
  - 10. A speech synthesis method as defined in claim 1, which includes smoothing selectively the formant frequencies, formant phases, and windowing functions.
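
The steps of claims 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the Hann window standing in for a stored windowing function, the 16 kHz sampling rate, and the formant frequencies are all hypothetical, and the window position is simply taken to start at the pitch mark.

```python
import math

def formant_waveform(omega, phi, window):
    """y[n] = w[n] * sin(omega * n + phi), with omega in radians per sample."""
    return [w * math.sin(omega * n + phi) for n, w in enumerate(window)]

def pitch_waveform(formants):
    """Add the formant waveforms of one frame.

    `formants` is a list of (omega, phi, window) tuples, one per formant.
    """
    length = max(len(window) for _, _, window in formants)
    out = [0.0] * length
    for omega, phi, window in formants:
        for n, y in enumerate(formant_waveform(omega, phi, window)):
            out[n] += y
    return out

def overlap_add(pitch_waveforms, pitch_marks, total_len):
    """Superpose each pitch waveform starting at its pitch mark (sample index)."""
    speech = [0.0] * total_len
    for wave, mark in zip(pitch_waveforms, pitch_marks):
        for n, v in enumerate(wave):
            if 0 <= mark + n < total_len:
                speech[mark + n] += v
    return speech

def hann(length):
    """Hann window, used only as a stand-in for a stored windowing function."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

# Hypothetical frame: three formants at 500, 1500 and 2500 Hz, 16 kHz sampling.
FS = 16000
frame = [(2 * math.pi * f / FS, 0.0, hann(160)) for f in (500.0, 1500.0, 2500.0)]
wave = pitch_waveform(frame)

# Constant pitch period of 80 samples (200 Hz) for ten pitch marks.
marks = list(range(0, 800, 80))
speech = overlap_add([wave] * len(marks), marks, 1000)
```

In a fuller sketch each pitch mark would get its own selected parameters rather than reusing one frame, which is where the dependent claims on transforming and smoothing parameters would apply.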

11. A speech synthesizer supplied with a pitch pattern, phoneme duration and phoneme symbol string, comprising:
- a pitch mark generator configured to generate pitch marks referring to the pitch pattern and phoneme duration;
  
  a pitch waveform generator configured to generate pitch waveforms corresponding to the pitch marks, referring to the phoneme symbol string;
  
  a waveform superposition device configured to superpose the pitch waveforms on the pitch marks according to a pitch period to generate a voiced speech signal;
  
  an unvoiced speech generator configured to generate an unvoiced speech;
  
  an adder configured to add the voiced speech and the unvoiced speech to generate a synthesized speech,
  
  the pitch waveform generator including:
  
  a storage configured to store a plurality of formant parameter groups each including a plurality of formant parameters in units of a synthesis unit, the formant parameters representing a formant frequency, a formant phase and a windowing function,
  
  a parameter selector configured to select the formant parameters for one frame corresponding to the pitch marks from the storage referring to the phoneme symbol string,
  
  a plurality of sine wave generators configured to generate a plurality of sine waves according to formant frequencies and formant phases corresponding to the selected formant parameters,
  
  a multiplier configured to multiply the sine waves by the windowing functions of the selected formant parameters to generate a plurality of formant waveforms each having a characteristic of one formant,
  
  an adder configured to add the formant waveforms to generate a pitch waveform having characteristics of a plurality of formants.
- - 12. A speech synthesizer as defined in claim 11, wherein the windowing functions are stored in the storage.
  - 13. A speech synthesizer as defined in claim 11, wherein the storage stores weighting factors of the windowing functions, and which comprises a windowing function generator configured to generate the windowing functions by adding basis functions weighted by the weighting factors.
  - 14. A speech synthesizer as defined in claim 11, which includes a parameter transformer configured to transform the selected formant parameters according to the pitch period.
  - 15. A speech synthesizer as defined in claim 14, wherein the parameter transformer transforms the selected formant parameters every phoneme, every frame or every formant number.
  - 16. A speech synthesizer as defined in claim 11, which includes a parameter transformer configured to transform the selected formant parameters according to information of a preceding phoneme or a following phoneme.
  - 17. A speech synthesizer as defined in claim 11, which includes a parameter transformer configured to transform the selected formant parameters according to given voice variety.
  - 18. A speech synthesizer as defined in claim 11, which includes a parameter smoothing device configured to smooth the selected formant parameters that vary in time.
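
Claim 11 does not say how the pitch mark generator derives marks from the pitch pattern and phoneme duration. One plausible reading, sketched below purely as an assumption, places successive marks one local pitch period apart. The function names, the one-F0-value-per-phoneme simplification, and the treatment of every phoneme as voiced are hypothetical choices made for illustration.

```python
def pitch_marks(pitch_pattern, durations, fs=16000):
    """Return pitch mark positions in samples.

    pitch_pattern: one F0 value in Hz per phoneme (an assumed simplification).
    durations: phoneme durations in seconds, same length as pitch_pattern.
    """
    marks = []
    pos = 0   # position of the next pitch mark, in samples
    end = 0   # end of the current phoneme, in samples
    for f0, dur in zip(pitch_pattern, durations):
        end += round(dur * fs)
        period = max(1, round(fs / f0))  # local pitch period in samples
        while pos < end:
            marks.append(pos)
            pos += period
    return marks

def add_signals(voiced, unvoiced):
    """Sample-wise sum of the voiced and unvoiced signals (the claimed adder)."""
    n = max(len(voiced), len(unvoiced))
    return [(voiced[i] if i < len(voiced) else 0.0)
            + (unvoiced[i] if i < len(unvoiced) else 0.0)
            for i in range(n)]

# Two hypothetical phonemes: 50 ms at 100 Hz, then 20 ms at 200 Hz.
marks = pitch_marks([100.0, 200.0], [0.05, 0.02])
```

Working in integer sample positions avoids accumulating floating-point error in the mark times.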

19. A speech synthesis program recorded on a computer readable medium, the program comprising:
- means for instructing a computer to store a number of formant parameters in a storage, the formant parameters representing a formant frequency, a formant phase and a windowing function;
  
  means for instructing the computer to select predetermined formant parameters from the formant parameters stored in the storage according to a phoneme symbol string;
  
  means for instructing the computer to generate a plurality of sine waves based on formant frequencies and formant phases corresponding to the formant parameters selected;
  
  means for instructing the computer to multiply the sine waves by the windowing functions corresponding to the selected formant parameters, respectively, to generate a plurality of formant waveforms each having a characteristic of one formant;
  
  means for instructing the computer to add the formant waveforms to generate a pitch waveform having characteristics of a plurality of formants; and
  
  means for instructing the computer to superpose pitch waveforms each corresponding to the pitch waveform according to a pitch period to generate a speech signal.
- - 20. A speech synthesis program as defined in claim 19, which includes means for instructing the computer to add basis functions weighted by the weighting factors to generate the windowing functions.
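
Claims 3, 13 and 20 describe building each windowing function as a weighted sum of basis functions, so that only the weighting factors need to be stored. The claims do not name the basis, so the cosine basis below is an assumption chosen for illustration, as are the function names.

```python
import math

def cosine_basis(num_basis, length):
    """Hypothetical basis: b_j[n] = cos(2*pi*j*n / (length - 1)), j = 0..num_basis-1."""
    return [[math.cos(2 * math.pi * j * n / (length - 1)) for n in range(length)]
            for j in range(num_basis)]

def window_from_weights(weights, basis):
    """w[n] = sum_j weights[j] * basis[j][n]."""
    length = len(basis[0])
    return [sum(c * b[n] for c, b in zip(weights, basis)) for n in range(length)]

# With this basis, the weights (0.5, -0.5) reproduce a Hann window.
basis = cosine_basis(2, 9)
window = window_from_weights([0.5, -0.5], basis)
```

Storing a handful of weights per formant instead of a full sampled window is the storage saving this representation suggests.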

Current Assignee: Kabushiki Kaisha Toshiba (Toshiba Corporation)
Original Assignee: Kabushiki Kaisha Toshiba (Toshiba Corporation)
Inventors: Akamine, Masami; Kagoshima, Takehiko
Primary Examiner(s): Azad; Abul K.
Application Number: US10/101,689
Publication Number: US 20020138253A1
Time in Patent Office: 1,958 Days
Field of Search: None
US Class Current: 704/268
CPC Class Codes: G10L 13/04 Details of speech synthesis...; G10L 25/27 characterised by the analys...
