Device and method for synthesizing speech
First Claim
1. A speech synthesis device comprising:
- speech database storing means for storing sample waveform data in a speech unit and a speech database created by way of associating the sample sound waveform data with their corresponding phonetic information;
speech waveform composing means for dividing phonetic information into speech units upon receiving the phonetic information of speech sound to be synthesized, for obtaining sample speech waveform data corresponding to the each phonetic information in a speech unit from the speech database, and for generating speech waveform data to be composed by means of concatenating the sample speech waveform data in speech units; and
analog converting means for converting the speech waveform data received from the speech waveform composing means into analog signals;
wherein the speech waveform composing means comprises pitch converting means for converting pitch by means of processing a segment of a waveform in which the waveform is converging on a segment just before a minus peak during a periodical unit of speech waveform data,at said segment the speech waveform being depending on vocal tract shape and being attending and converging on the minus peak.
3 Assignments
0 Petitions
Accused Products
Abstract
The present invention provides pitch conversion processing technology capable of minimizing the distortion of speech sound naturalness. A speech waveform in a pitch-unit is considered to be divided into two segments: 1) the segment of β, that starts from the minus peak, where the waveform depending on the shape of vocal tracts appears, and 2) the segment of γ where the waveform depending on the vocal tract shape is attenuating and converging on the next minus peak. In addition, α is the point where a minus peak appears along with the glottal closure. Based on characteristics of speech waveforms, the present invention processes waveform for converting pitch in the segment of γ just before the next minus peak, which is least affected by the minus peak associated with the glottal closure. As such, waveform processing can be performed by keeping the complete contour of waveform at around the peak, and thereby reducing the effects of pitch conversion.
-
Citations
12 Claims
-
1. A speech synthesis device comprising:
-
speech database storing means for storing sample waveform data in a speech unit and a speech database created by way of associating the sample sound waveform data with their corresponding phonetic information; speech waveform composing means for dividing phonetic information into speech units upon receiving the phonetic information of speech sound to be synthesized, for obtaining sample speech waveform data corresponding to the each phonetic information in a speech unit from the speech database, and for generating speech waveform data to be composed by means of concatenating the sample speech waveform data in speech units; and analog converting means for converting the speech waveform data received from the speech waveform composing means into analog signals; wherein the speech waveform composing means comprises pitch converting means for converting pitch by means of processing a segment of a waveform in which the waveform is converging on a segment just before a minus peak during a periodical unit of speech waveform data, at said segment the speech waveform being depending on vocal tract shape and being attending and converging on the minus peak. - View Dependent Claims (2, 3, 4, 5)
-
-
6. A computer-readable storing medium storing a program for executing pitch conversion using a computer having speech database storing means for storing sample waveform data in a speech unit and a speech database created by way of associating the sample sound waveform data with their corresponding phonetic information, the program comprising the step of:
-
dividing phonetic information into speech units upon receiving the phonetic information of speech sound to be synthesized, obtaining sample speech waveform data corresponding to the each phonetic information in a speech unit from the speech database, converting pitch by means of processing a segment of a waveform in which the waveform is converging on a segment just before a minus peak during a periodical unit of speech waveform data, at said segment, the speech waveform being depending on vocal tract shape and being attending and converging on the minus peak, and generating speech waveform data to be composed by means of concatenating the sample speech waveform data in speech units. - View Dependent Claims (7, 8, 9)
-
-
10. A speech synthesis device comprising:
-
speech database storing means for storing a speech database having several sample speech waveform data with various pitch lengths for each speech unit and phonetic information associated with the sample waveform data; speech waveform composing means for dividing phonetic information into speech units upon receiving phonetic information of speech sound to be synthesized, for obtaining a desirable sample speech waveform data from among the sample speech waveform data corresponding to the divided phonetic information in a speech unit in the speech database, and for generating speech waveform data to be composed by means of concatenating the obtained sample speech waveform data in speech units; and analog converting means for converting the speech waveform data received from the speech waveform composing means into analog signals; wherein the speech database is constructed of several sample speech waveform data with various pitch lengths prepared by modifying a contour of a waveform in a segment in which the waveform is converging on the minus peak during a periodical unit of speech waveform data.
-
-
11. A computer-readable storing medium storing a program for executing speech synthesis by means of a computer using a speech database, the program comprising the steps of:
-
receiving phonetic information of speech sound to be synthesized and dividing phonetic information into speech units; obtaining a desirable sample speech waveform data from among sample speech waveform data corresponding to the divided phonetic information in a speech unit in the speech database; and generating speech waveform data to be composed by means of concatenating the obtained sample speech waveform data in speech units; wherein the speech database is constructed of several sample speech waveform data with various pitch lengths prepared by modifying a contour of a waveform in a segment in which the waveform is converging on a minus peak during a periodical unit of speech waveform data.
-
-
12. A method of pitch conversion for speech waveform, the method comprising the steps of:
-
preparing speech database for storing sample waveform data in a speech unit and a speech database created by way of associating the sample sound waveform data with their corresponding phonetic information, dividing phonetic information into speech units upon receiving the phonetic information of speech sound to be synthesized, obtaining sample speech waveform data corresponding to the each phonetic information in a speech unit from the speech database, converting pitch by means of processing a segment of a waveform in which the waveform is converging on a segment just before a minus peak during a periodical unit of speech waveform data, at said segment the speech waveform being depending on vocal tract shape and being attending and converging on the minus peak, and generating speech waveform data to be composed by means of concatenating the sample speech waveform data in speech units.
-
Specification