SPEECH SYNTHESIS APPARATUS AND METHOD THEREOF
First Claim
1. A speech synthesis apparatus that obtains waveform data of synthesis fragments corresponding to a plurality of synthesis units in a prescribed processing unit included in an input synthesis unit string and synthesizes speech by connecting the waveform data, comprising:
an attribute information storage medium that stores the attribute information of said synthesis fragments other than the waveform data;
a plurality of waveform data storage mediums that store waveform data of said synthesis fragments, time required for obtaining said stored waveform data from said waveform data storage mediums being different among one another;
a data positional information storage medium that stores data positional information including the identifier of a waveform data storage medium that stores said waveform data for each said synthesis fragment;
a candidate obtaining device configured to obtain a synthesis fragment candidate corresponding to each said synthesis unit from said attribute information storage medium based on the attribute information of each said synthesis unit in said processing unit;
a synthesis fragment selector configured to obtain a plurality of series each including a combination of a plurality of synthesis fragment candidates obtained for each said synthesis unit and to select one series from said plurality of series based on said data positional information so that the total time required for obtaining the waveform data of said synthesis fragments in said processing unit does not exceed the upper limit for data obtaining time;
a synthesis fragment generator configured to combine synthesis fragments on said selected one series to generate a synthesis fragment string; and
a waveform generator configured to obtain the waveform data of the synthesis fragments included in said synthesis fragment string from each said waveform data storage medium and to connect the waveform data.
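The selection step recited above can be illustrated with a minimal sketch. It assumes two hypothetical storage mediums ("memory" and "disk") with made-up access times, and a hypothetical per-fragment quality cost used to rank the series that fit the limit; the claim itself only requires that the chosen series keep the total data obtaining time within the upper limit. All names and values below are illustrative assumptions, not taken from the patent.

```python
from itertools import product

# Hypothetical per-fragment access time (ms) of each waveform data storage medium.
ACCESS_TIME_MS = {"memory": 1, "disk": 10}

# Data positional information: fragment id -> identifier of the medium holding its waveform.
POSITION = {"a1": "memory", "a2": "disk", "b1": "memory", "b2": "disk"}

# Hypothetical quality cost per fragment (lower is better); an assumption, not part of the claim.
COST = {"a1": 5.0, "a2": 1.0, "b1": 4.0, "b2": 1.5}


def total_fetch_time(series):
    """Total time needed to obtain the waveform data of every fragment in a series."""
    return sum(ACCESS_TIME_MS[POSITION[f]] for f in series)


def select_series(candidates_per_unit, time_limit_ms):
    """Enumerate every series (one candidate per synthesis unit) and return the
    lowest-cost series whose total fetch time does not exceed the limit, or None."""
    best = None
    for series in product(*candidates_per_unit):
        if total_fetch_time(series) > time_limit_ms:
            continue  # violates the upper limit for data obtaining time
        cost = sum(COST[f] for f in series)
        if best is None or cost < best[0]:
            best = (cost, series)
    return None if best is None else best[1]


candidates = [["a1", "a2"], ["b1", "b2"]]
print(select_series(candidates, time_limit_ms=11))
```

Exhaustive enumeration is used only to keep the sketch short; with many candidates per unit, a search that prunes over-budget partial series would be the more practical choice.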
Abstract
A speech synthesis apparatus includes a text obtaining device that obtains text data for speech synthesis from an external source; a language processor that performs morphological analysis and parsing on the text data; a prosodic processor that outputs, to a speech synthesizer, a synthesis unit string based on the prosodic and language-related attributes of the text data, such as accents and word classes; the speech synthesizer, which generates synthesized speech from the synthesis unit string; and a speech waveform output device that reproduces the output synthesized speech either after a prescribed amount of it has accumulated or sequentially as it is output.
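The two reproduction modes of the speech waveform output device (accumulate a prescribed amount first, or play sequentially) can be sketched as follows. This is an illustration only: the function name `reproduce`, the use of lists of integer samples, and the convention that a `prescribed_amount` of 0 means sequential playback are assumptions made for the example.

```python
from typing import Iterable, Iterator, List


def reproduce(chunks: Iterable[List[int]], prescribed_amount: int = 0) -> Iterator[List[int]]:
    """Yield blocks of samples for playback.

    prescribed_amount <= 0: sequential mode, each chunk is released as it arrives.
    prescribed_amount > 0: accumulate at least that many samples before releasing.
    """
    buffer: List[int] = []
    for chunk in chunks:
        if prescribed_amount <= 0:
            yield list(chunk)  # sequential reproduction
            continue
        buffer.extend(chunk)
        if len(buffer) >= prescribed_amount:
            yield buffer  # enough speech has accumulated
            buffer = []
    if buffer:
        yield buffer  # flush whatever remains at the end of the utterance


synth_output = [[1, 2], [3], [4, 5, 6]]
print(list(reproduce(synth_output)))
print(list(reproduce(synth_output, prescribed_amount=3)))
```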
18 Citations
10 Claims
1. A speech synthesis apparatus that obtains waveform data of synthesis fragments corresponding to a plurality of synthesis units in a prescribed processing unit included in an input synthesis unit string and synthesizes speech by connecting the waveform data, comprising:
an attribute information storage medium that stores the attribute information of said synthesis fragments other than the waveform data;
a plurality of waveform data storage mediums that store waveform data of said synthesis fragments, time required for obtaining said stored waveform data from said waveform data storage mediums being different among one another;
a data positional information storage medium that stores data positional information including the identifier of a waveform data storage medium that stores said waveform data for each said synthesis fragment;
a candidate obtaining device configured to obtain a synthesis fragment candidate corresponding to each said synthesis unit from said attribute information storage medium based on the attribute information of each said synthesis unit in said processing unit;
a synthesis fragment selector configured to obtain a plurality of series each including a combination of a plurality of synthesis fragment candidates obtained for each said synthesis unit and to select one series from said plurality of series based on said data positional information so that the total time required for obtaining the waveform data of said synthesis fragments in said processing unit does not exceed the upper limit for data obtaining time;
a synthesis fragment generator configured to combine synthesis fragments on said selected one series to generate a synthesis fragment string; and
a waveform generator configured to obtain the waveform data of the synthesis fragments included in said synthesis fragment string from each said waveform data storage medium and to connect the waveform data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method of synthesizing speech by obtaining waveform data of synthesis fragments corresponding to a plurality of synthesis units within a prescribed processing unit included in an input synthesis unit string from a plurality of waveform data storage mediums, time for obtaining data from said waveform data storage mediums being different among one another, and synthesizing speech by connecting the data, said method comprising:
obtaining synthesis fragment candidates corresponding to each said synthesis unit based on the attribute information of each said synthesis unit in said processing unit from attribute information storage mediums that store the attribute information of said synthesis fragments other than the waveform data;
obtaining a plurality of series made of combinations of a plurality of synthesis fragment candidates obtained for each said synthesis unit and selecting one series among said plurality of series based on data positional information including the identifier of a waveform data storage medium that stores the waveform data so that the total time for obtaining the waveform data of each said synthesis fragment in said processing unit does not exceed the upper limit for data obtaining time;
combining synthesis fragments on said one selected series, thereby producing a synthesis fragment string; and
obtaining the waveform data of the synthesis fragments included in said synthesis fragment string from each said waveform data storage medium, thereby connecting the waveform data.
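The four steps of the method of claim 9 can be walked through end to end in a small sketch that keeps attribute information, data positional information, and waveform data in separate stores. The fragment ids, the "memory" and "disk" mediums, their fetch times, and the sample values are all hypothetical, and the selector here simply returns the first series that fits the limit; the claim does not prescribe how to choose among series that fit.

```python
from itertools import product

# Attribute information only (no waveform data); the "unit" field is a hypothetical attribute.
ATTRIBUTES = {"k1": {"unit": "ka"}, "k2": {"unit": "ka"}, "s1": {"unit": "sa"}}
# Data positional information: fragment id -> identifier of its waveform data storage medium.
POSITION = {"k1": "disk", "k2": "memory", "s1": "disk"}
# Hypothetical per-fragment fetch time (ms) of each medium.
FETCH_MS = {"memory": 1, "disk": 10}
# Waveform data storage mediums, keyed by medium identifier.
MEDIA = {"memory": {"k2": [3, 4]}, "disk": {"k1": [1, 2], "s1": [5, 6]}}


def get_candidates(unit):
    """Step 1: look up candidates using attribute information only."""
    return [fid for fid, attr in ATTRIBUTES.items() if attr["unit"] == unit]


def select_series(candidates_per_unit, limit_ms):
    """Step 2: return the first series whose total fetch time is within the limit."""
    for series in product(*candidates_per_unit):
        if sum(FETCH_MS[POSITION[f]] for f in series) <= limit_ms:
            return series
    return None


def synthesize(units, limit_ms):
    candidates = [get_candidates(u) for u in units]
    series = select_series(candidates, limit_ms)
    if series is None:
        raise ValueError("no series fits the data obtaining time limit")
    fragment_string = list(series)  # Step 3: produce the synthesis fragment string
    waveform = []
    for fid in fragment_string:  # Step 4: fetch from each medium and connect
        waveform.extend(MEDIA[POSITION[fid]][fid])
    return waveform


print(synthesize(["ka", "sa"], limit_ms=11))
```

Note that the waveform stores are touched only in the last step; candidate lookup and selection work from the attribute and positional information alone.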
10. A speech synthesis program product that enables a computer to obtain waveform data of synthesis fragments corresponding to a plurality of synthesis units in a prescribed processing unit included in an input synthesis unit string from a plurality of waveform data storage mediums from which time for obtaining data is different among one another, and synthesize speech by connecting the waveform data, said program product comprising the instructions of:
obtaining synthesis fragment candidates corresponding to each said synthesis unit based on the attribute information of each said synthesis unit in said processing unit from attribute information storage mediums that store the attribute information of said synthesis fragments other than the waveform data;
obtaining a plurality of series made of combinations of a plurality of synthesis fragment candidates obtained for each said synthesis unit, thereby selecting one series among said plurality of series based on the data positional information including the identifier of a waveform data storage medium that stores said waveform data so that the total time for obtaining the waveform data of each said synthesis fragment in said processing unit does not exceed the upper limit for data obtaining time;
producing a synthesis fragment string by combining synthesis fragments on said selected one series; and
obtaining the waveform data of synthesis fragments included in said synthesis fragment string from each said waveform data storage medium and connecting the data.
Specification