Speech signal processing apparatus and method, and storage medium
Abstract
An object of the present invention is to suppress degradation of the quality in speech synthesis by selecting synthesis units so as to minimize a distortion caused by concatenation distortions and modification distortions. For that purpose, speech synthesis is performed by extracting a plurality of synthesis units corresponding to a phoneme environment from a synthesis-unit holding unit for holding a plurality of synthesis units so as to correspond to a predetermined prosody environment, calculating a distortion of each of the plurality of extracted synthesis units, obtaining a minimum distortion within a predetermined interval determined based on the prosody environment, selecting a series of synthesis units providing a minimum-distortion path, and modifying and concatenating the synthesis units.
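The selection described in the abstract can be illustrated with a minimal sketch, which is not the patented implementation. It assumes each candidate unit carries a mean pitch, a duration, and a one-dimensional spectral feature at each boundary. It also assumes the modification distortion is the absolute log ratio between target and stored prosody, and the concatenation distortion is the feature mismatch at the joint. The names `Unit`, `modification_distortion`, `concatenation_distortion`, and `select_units` are hypothetical, and the search covers the whole sequence instead of a predetermined interval derived from the prosody environment.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Unit:
    """A candidate synthesis unit (all fields are illustrative assumptions)."""
    name: str
    f0: float     # mean pitch in Hz
    dur: float    # duration in seconds
    start: float  # 1-D spectral feature at the left boundary
    end: float    # 1-D spectral feature at the right boundary


def modification_distortion(unit, target_f0, target_dur):
    # Assumed measure: how far pitch and duration must be changed to hit the target.
    return abs(math.log(target_f0 / unit.f0)) + abs(math.log(target_dur / unit.dur))


def concatenation_distortion(prev, nxt):
    # Assumed measure: spectral mismatch at the joint between two units.
    return abs(prev.end - nxt.start)


def select_units(candidates, targets):
    """Return (units, total) for the minimum-distortion path.

    candidates[i] lists the units available at position i;
    targets[i] is the (f0, duration) pair requested at position i.
    """
    # best[k]: lowest accumulated distortion of any path ending in candidates[i][k]
    best = [modification_distortion(u, *targets[0]) for u in candidates[0]]
    back = []  # back[i-1][k]: index of the best predecessor of candidates[i][k]
    for i in range(1, len(candidates)):
        new_best, ptr = [], []
        for u in candidates[i]:
            costs = [best[j] + concatenation_distortion(p, u)
                     for j, p in enumerate(candidates[i - 1])]
            j_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[j_min] + modification_distortion(u, *targets[i]))
            ptr.append(j_min)
        best = new_best
        back.append(ptr)

    # Trace back from the cheapest final unit to recover the whole path.
    k = min(range(len(best)), key=best.__getitem__)
    total = best[k]
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    path.reverse()
    return [candidates[i][k] for i, k in enumerate(path)], total


a1 = Unit("a1", 100.0, 0.1, 0.0, 1.0)
a2 = Unit("a2", 200.0, 0.1, 0.0, 5.0)
b1 = Unit("b1", 100.0, 0.1, 1.0, 0.0)
b2 = Unit("b2", 100.0, 0.1, 5.0, 0.0)
units, total = select_units([[a1, a2], [b1, b2]], [(100.0, 0.1), (100.0, 0.1)])
print([u.name for u in units], total)
```

In this toy inventory, `a1` followed by `b1` needs no prosody modification and joins without spectral mismatch, so the search prefers that pair over any path through `a2`. Summing both distortions along the path is what lets a unit that needs some modification still be chosen when it joins its neighbours more smoothly.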
25 Claims
1. A speech signal processing apparatus for performing speech synthesis by concatenating a plurality of selected synthesis units and modifying the synthesis units based on predetermined prosody parameters, said apparatus comprising:
distortion obtaining means for obtaining a distortion which may be generated from selection to synthesis of the synthesis units;
selection means for selecting synthesis units to be used for speech synthesis, based on the distortion obtained by said distortion obtaining means; and
speech synthesis means for performing speech synthesis based on the synthesis units selected by said selection means.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

13. A speech signal processing method comprising:
a distortion obtaining step of obtaining a distortion generated by concatenating a plurality of selected synthesis units and modifying the synthesis units based on predetermined prosody parameters;
a selection step of selecting synthesis units to be used for speech synthesis, based on the distortion obtained in said distortion obtaining step; and
a speech synthesis step of performing speech synthesis based on the synthesis units selected in said selection step.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25
Specification