Speech synthesis apparatus having prosody generator with user-set speech-rate- or adjusted phoneme-duration-dependent selective vowel devoicing
Abstract
The speech synthesis apparatus according to the present invention includes a text analyzer operable to generate a phonetic and prosodic symbol string from text information of an input text; a word dictionary storing a reading and accent of a word; a voice segment dictionary storing a phoneme that is a basic unit of speech; a prosody generator operable to generate synthesizing parameters including at least a phoneme, a duration of the phoneme and a fundamental frequency for the phonetic and prosodic symbol string; and a waveform generator operable to generate a synthesized waveform by performing waveform overlap-add with reference to the synthesizing parameters generated by the prosody generator and to the voice segment dictionary. The prosody generator includes a vowel devoicing determining means operable to determine whether or not a vowel devoicing process is to be performed, and a duration modifying means operable to modify the duration of the phoneme depending on a speech rate set by a user. The vowel devoicing determining means determines that the vowel devoicing process is not to be performed when the set speech rate is slower than a predetermined rate.
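The rate-dependent rule described in the abstract can be illustrated with a minimal Python sketch. It assumes the user-set speech rate is expressed as a multiplier (1.0 as the default rate, smaller values meaning slower speech). The names `should_devoice`, `is_candidate` and `min_rate`, and the threshold value 0.8, are hypothetical and do not come from the patent. How a vowel becomes a devoicing candidate in the first place is not modeled here.

```python
def should_devoice(is_candidate: bool, speech_rate: float, min_rate: float = 0.8) -> bool:
    """Decide whether a vowel devoicing process is performed.

    is_candidate: True if the vowel would otherwise be devoiced
                  (the phonetic criteria are outside this sketch).
    speech_rate:  user-set rate as a multiplier; 1.0 is assumed to be normal.
    min_rate:     hypothetical "predetermined rate"; below it, devoicing is
                  suppressed regardless of the phonetic context.
    """
    if speech_rate <= 0:
        raise ValueError("speech_rate must be positive")
    # Slower than the predetermined rate: never devoice.
    if speech_rate < min_rate:
        return False
    # Otherwise fall back to the ordinary phonetic decision.
    return is_candidate
```

With these assumed values, a candidate vowel is devoiced at a rate of 1.0 but kept voiced at a rate of 0.5.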
7 Claims
1. A speech synthesis apparatus comprising:
a text analyzer operable to generate a phonetic and prosodic symbol string from character information of an input text;
a word dictionary storing a reading and accent of a word;
a voice segment dictionary storing a phoneme that is a basic unit of speech;
a prosody generator operable to generate synthesizing parameters including at least a phoneme, a duration of the phoneme and a fundamental frequency for the phonetic and prosodic symbol string, the prosody generator including a vowel devoicing determining means operable to determine whether or not a vowel devoicing process is to be performed and a duration modifying means operable to modify the duration of the phoneme depending on a speech rate set by a user, the vowel devoicing determining means determining that the vowel devoicing process is not performed when the set speech rate is slower than a predetermined rate; and
a waveform generator operable to generate a synthesized waveform by making waveform-overlap-adding referring to the synthesizing parameters generated by the prosody generator and the voice segment dictionary.
5. A speech synthesis apparatus comprising:
a text analyzer operable to generate a phonetic and prosodic symbol string from character information of an input text;
a word dictionary storing a reading and accent of a word;
a voice segment dictionary storing a phoneme that is a unit of speech;
a prosody generator operable to generate synthesizing parameters including at least a phoneme, a duration of the phoneme and a fundamental frequency for the phonetic and prosodic symbol string, the prosody generator including a vowel devoicing determining means operable to determine whether or not a vowel devoicing process is performed and a duration modifying means operable to modify the duration of the phoneme depending on a speech rate set by a user and a result of the determination by the vowel devoicing determining means, wherein the duration modifying means does not stretch the duration of the phoneme for a voiceless sound beyond a predetermined limitation value; and
a waveform generator operable to generate a synthesized waveform by making waveform-overlap-adding referring to the synthesizing parameters generated by the prosody generator and the voice segment dictionary.
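The duration limit recited in claim 5 can be sketched in the same spirit. This minimal Python example assumes that durations are scaled by dividing by a speech-rate multiplier (1.0 as normal, smaller values slower). The `Phoneme` type, the function `modify_duration`, and the 120 ms limit are hypothetical illustrations, not values or names from the patent. A voiceless phoneme that is already longer than the limit is simply left unstretched, which is one possible reading of the claim.

```python
from dataclasses import dataclass


@dataclass
class Phoneme:
    symbol: str
    duration_ms: float
    voiceless: bool  # True for voiceless sounds, including devoiced vowels


def modify_duration(p: Phoneme, speech_rate: float,
                    voiceless_limit_ms: float = 120.0) -> float:
    """Scale a phoneme duration by the user-set rate, capping voiceless stretch."""
    if speech_rate <= 0:
        raise ValueError("speech_rate must be positive")
    # Slower speech (rate < 1.0) lengthens the phoneme; faster speech shortens it.
    scaled = p.duration_ms / speech_rate
    if p.voiceless:
        # Do not stretch beyond the limit; a phoneme already above the
        # limit keeps its original duration instead of being shortened.
        ceiling = max(voiceless_limit_ms, p.duration_ms)
        scaled = min(scaled, ceiling)
    return scaled
```

Under these assumptions, an 80 ms voiceless phoneme at half speed is stretched to 120 ms rather than 160 ms, while a voiced phoneme of the same length is stretched to the full 160 ms.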
Specification