Speech synthesis apparatus and speech synthesis method
Abstract
The present invention includes: a characteristic parameter DB 106 that holds, with respect to each speech-unit, speech-unit data indicating a loan word attribute and acoustic characteristics; a language analysis unit 104 and a prosody prediction unit 109 that obtain text data and respectively predict a loan word attribute and acoustic characteristics of each of a plurality of speech-units that form text indicated by the text data; a speech-unit selection unit 108 that selects, from the characteristic parameter DB 106, speech-unit data that represents the loan word attribute and the acoustic characteristics similar to the predicted loan word attribute and acoustic characteristics of each speech-unit; and a speech synthesis unit 110 that generates synthesized speech using a plurality of the selected speech-units and outputs the synthesized speech.
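The selection step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `SpeechUnit` fields, the use of pitch and duration as the acoustic characteristics, the cost weights, and the function names are all assumptions made for illustration. The idea shown is only that a candidate whose loan word attribute differs from the predicted one is penalized, so that a unit with a matching attribute and similar acoustics is preferred.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SpeechUnit:
    label: str          # speech-unit label (e.g. a phoneme); hypothetical field
    is_loan_word: bool  # loan word attribute
    pitch_hz: float     # assumed acoustic characteristic: fundamental frequency
    duration_ms: float  # assumed acoustic characteristic: duration


# Hypothetical penalty applied when the loan word attributes differ.
LOAN_MISMATCH_PENALTY = 50.0


def selection_cost(target: SpeechUnit, candidate: SpeechUnit) -> float:
    """Lower cost means the candidate is more similar to the predicted target."""
    cost = abs(target.pitch_hz - candidate.pitch_hz) / 10.0
    cost += abs(target.duration_ms - candidate.duration_ms) / 10.0
    if target.is_loan_word != candidate.is_loan_word:
        cost += LOAN_MISMATCH_PENALTY
    return cost


def select_unit(target: SpeechUnit, database: Iterable[SpeechUnit]) -> SpeechUnit:
    """Pick the stored unit with the same label and the lowest selection cost."""
    candidates = [u for u in database if u.label == target.label]
    if not candidates:
        raise LookupError(f"no stored unit for label {target.label!r}")
    return min(candidates, key=lambda u: selection_cost(target, u))
```

With these assumed weights, a stored unit recorded in a loan word is chosen over an acoustically closer unit recorded in a native word whenever the acoustic difference is smaller than the mismatch penalty.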
18 Claims
1. A speech synthesis apparatus that obtains text data and converts text indicated by the text data into speech, comprising:
a storage unit operable to previously store, with respect to each speech-unit, speech-unit data that represents (i) a loan word attribute indicating whether or not a speech-unit belongs to a class of loan words and (ii) an acoustic characteristic of the speech-unit;
a characteristic prediction unit operable to obtain text data and predict, with respect to each of a plurality of speech-units that form text indicated by the text data, a loan word attribute and an acoustic characteristic;
a selection unit operable to select speech-unit data that represents a loan word attribute and an acoustic characteristic similar to the loan word attribute and the acoustic characteristic of each speech-unit predicted by the characteristic prediction unit, from among the speech-unit data stored in the storage unit; and
a speech output unit operable to generate synthesized speech using a plurality of the speech-unit data selected by the selection unit and output the synthesized speech.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A speech synthesis method for obtaining text data and converting text indicated by the text data into speech using data stored in a storage unit,
wherein the storage unit previously stores, with respect to each speech-unit, speech-unit data that represents (i) a loan word attribute indicating whether or not a speech-unit belongs to a class of loan words and (ii) an acoustic characteristic of the speech-unit, and the method comprises:
obtaining text data and predicting, with respect to each of a plurality of speech-units that form text indicated by the text data, a loan word attribute and an acoustic characteristic of the speech-unit;
selecting speech-unit data that represents a loan word attribute and an acoustic characteristic similar to the predicted loan word attribute and acoustic characteristic of each speech-unit, from among the speech-unit data stored in the storage unit; and
generating synthesized speech using a plurality of the selected speech-unit data and outputting the synthesized speech.
Dependent claims: 13
14. A program for obtaining text data and converting text indicated by the text data into speech using data stored in a storage unit,
wherein the storage unit previously stores, with respect to each speech-unit, speech-unit data that represents (i) a loan word attribute indicating whether or not a speech-unit belongs to a class of loan words and (ii) an acoustic characteristic of the speech-unit, and the program causes a computer to execute:
obtaining text data and predicting, with respect to each of a plurality of speech-units that form text indicated by the text data, a loan word attribute and an acoustic characteristic of the speech-unit;
selecting speech-unit data that represents a loan word attribute and an acoustic characteristic similar to the predicted loan word attribute and acoustic characteristic of each speech-unit, from among the speech-unit data stored in the storage unit; and
generating synthesized speech using a plurality of the selected speech-unit data and outputting the synthesized speech.
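The three steps recited in the method and program claims (predicting, selecting, generating) can be sketched end to end as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: speech-units are approximated by single characters, the loan word attribute is predicted from a hypothetical lexicon `LOAN_WORDS`, the only acoustic characteristic is a pitch value predicted by a made-up declination rule, and stored units are plain dictionaries holding a list of samples.

```python
LOAN_WORDS = {"tv"}  # hypothetical lexicon of loan words


def predict_units(text):
    """Step 1: split text into units and predict loan attribute and pitch."""
    targets = []
    for word in text.lower().split():
        loan = word in LOAN_WORDS
        for ch in word:
            # Assumed prosody rule: pitch falls 2 Hz per unit from 130 Hz.
            pitch = 130.0 - 2.0 * len(targets)
            targets.append({"label": ch, "loan": loan, "pitch": pitch})
    return targets


def select(target, stored):
    """Step 2: choose the stored unit most similar to the predicted target."""
    same_label = [u for u in stored if u["label"] == target["label"]]

    def cost(unit):
        # Assumed weighting: an attribute mismatch costs as much as 100 Hz.
        mismatch = 0.0 if unit["loan"] == target["loan"] else 100.0
        return mismatch + abs(unit["pitch"] - target["pitch"])

    # min() raises ValueError if no stored unit has the requested label.
    return min(same_label, key=cost)


def synthesize(text, stored):
    """Step 3: concatenate the samples of the selected units."""
    samples = []
    for target in predict_units(text):
        samples.extend(select(target, stored)["samples"])
    return samples
```

Plain concatenation stands in for waveform generation here; a real synthesizer would also smooth the joins between units.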
15. A data creation apparatus that creates speech-unit data to be used for speech synthesis, comprising:
a speech storage unit operable to store a speech waveform signal that represents speech in a waveform;
a text storage unit operable to store text data indicating text that corresponds to the speech represented by the speech waveform signal;
a language analysis unit operable to obtain text data from the text storage unit, divide text indicated by the text data into speech-units, and analyze a loan word attribute of each speech-unit indicating whether or not the speech-unit belongs to a class of loan words;
an acoustic analysis unit operable to obtain a speech waveform signal from the speech storage unit, divide the speech represented by the speech waveform signal into speech-units, and analyze an acoustic characteristic of each speech-unit; and
a creation unit operable to create speech-unit data of each speech-unit so that said speech-unit data indicates the loan word attribute as analyzed by the language analysis unit and the acoustic characteristic as analyzed by the acoustic analysis unit, and store the created speech-unit data into a memory.
Dependent claims: 16, 17
18. A data creation method for creating speech-unit data to be used for speech synthesis using data stored in a storage unit,
wherein the storage unit previously stores a speech waveform signal that represents speech in a waveform and text data indicating text that corresponds to the speech represented by the speech waveform signal, and the method comprises:
obtaining text data from the text storage unit, dividing text indicated by the text data into speech-units, and analyzing a loan word attribute of each speech-unit indicating whether or not the speech-unit belongs to a class of loan words;
obtaining a speech waveform signal from the speech storage unit, dividing the speech represented by the speech waveform signal into speech-units, and analyzing an acoustic characteristic of each speech-unit; and
creating speech-unit data of each speech-unit so that said speech-unit data indicates the analyzed loan word attribute and acoustic characteristic, and storing the created speech-unit data into a memory.
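The data creation steps above can be sketched as follows, purely for illustration. Several simplifications are assumed and are not taken from the claims: speech-units are single characters, the loan word attribute comes from a hypothetical lexicon passed in by the caller, the unit boundaries in the waveform are supplied as precomputed sample ranges (in practice they would have to be found by some alignment procedure), and the acoustic characteristics are reduced to mean absolute amplitude and length in samples.

```python
from dataclasses import dataclass
from typing import List, Sequence, Set, Tuple


@dataclass
class UnitRecord:
    label: str
    is_loan_word: bool      # loan word attribute from the language analysis
    mean_amplitude: float   # assumed acoustic characteristic
    duration_samples: int   # assumed acoustic characteristic


def analyze_language(text: str, loan_lexicon: Set[str]) -> List[Tuple[str, bool]]:
    """Divide text into units and tag each with its word's loan attribute."""
    units = []
    for word in text.lower().split():
        loan = word in loan_lexicon
        units.extend((ch, loan) for ch in word)
    return units


def analyze_acoustics(
    waveform: Sequence[float], boundaries: Sequence[Tuple[int, int]]
) -> List[Tuple[float, int]]:
    """Cut the waveform at the given (start, end) ranges and measure each part."""
    features = []
    for start, end in boundaries:
        segment = waveform[start:end]
        if not segment:
            raise ValueError(f"empty segment for range ({start}, {end})")
        mean_amp = sum(abs(s) for s in segment) / len(segment)
        features.append((mean_amp, len(segment)))
    return features


def create_unit_data(text, waveform, boundaries, loan_lexicon) -> List[UnitRecord]:
    """Combine both analyses into one record per speech-unit."""
    language = analyze_language(text, loan_lexicon)
    acoustics = analyze_acoustics(waveform, boundaries)
    if len(language) != len(acoustics):
        raise ValueError("text units and waveform segments do not line up")
    return [
        UnitRecord(label, loan, amp, dur)
        for (label, loan), (amp, dur) in zip(language, acoustics)
    ]
```

The returned list plays the role of the memory into which the created speech-unit data is stored.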
Specification