System and method for speech synthesis using a smoothing filter
Abstract
A speech synthesis system uses a smoothing technique to control the discontinuous distortion that occurs at the transition portion between concatenated phonemes, which are the speech units of synthesized speech. The system comprises a discontinuous distortion processing means adapted to predict, through a predetermined learning process, a discontinuity at the transition portion between concatenated samples of phonemes used for speech synthesis, and to control the discontinuity at the transition portion between the concatenated phonemes of the synthesized speech so that it is smoothed adaptively to correspond to the degree of the predicted discontinuity. A smoothing filter smooths the synthesized speech so that its discontinuity degree follows the predicted discontinuity degree, according to a filter coefficient (α) that is changed adaptively to correspond to the ratio of the predicted discontinuity degree to the real discontinuity degree. Because a discontinuity at a transition portion between concatenated phonemes of the synthesized speech (IN) is thus adaptively smoothed to follow the discontinuity that occurs in actually spoken sound, the synthesized speech (IN) more closely approximates a real human voice.
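The effect of the filter coefficient on a transition can be illustrated with a minimal sketch. It is not the filter disclosed in the specification. It assumes a single scalar parameter track with one value per frame, and the function name `smooth_boundary`, the `width` parameter, and the linear taper are all hypothetical choices made for illustration. Under these assumptions, the jump at the boundary is scaled to α times its original size, so choosing α equal to the ratio of predicted to real discontinuity leaves a jump equal to the predicted one.

```python
def smooth_boundary(track, boundary, alpha, width=3):
    """Shrink the jump at `boundary` to `alpha` times its original size.

    track    : list of floats, one value per frame (assumed representation)
    boundary : index of the first frame of the right-hand phoneme (>= 1)
    alpha    : filter coefficient in [0, 1]; 1 leaves the jump unchanged,
               0 removes it entirely
    width    : number of frames on each side that absorb the correction
               (hypothetical parameter, linear taper)
    """
    out = list(track)
    jump = track[boundary] - track[boundary - 1]
    # Each side of the boundary absorbs half of the unwanted part of the jump.
    correction = (1.0 - alpha) * jump / 2.0
    for k in range(width):
        weight = 1.0 - k / width  # 1 at the boundary, decreasing away from it
        left = boundary - 1 - k
        right = boundary + k
        if left >= 0:
            out[left] += weight * correction
        if right < len(track):
            out[right] -= weight * correction
    return out


if __name__ == "__main__":
    track = [0.0, 0.0, 0.0, 4.0, 4.0, 4.0]  # real jump of 4.0 at index 3
    smoothed = smooth_boundary(track, boundary=3, alpha=0.5, width=2)
    print(smoothed)                   # jump at the boundary is now 2.0
    print(smoothed[3] - smoothed[2])
```

With α = 1 the track is returned unchanged, which matches the case where the real discontinuity already equals the predicted one.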
18 Claims
1. A speech synthesis system for controlling a discontinuous distortion that occurs at a transition portion between concatenated phonemes, which are speech units of synthesized speech, using a smoothing technique, comprising:
a discontinuous distortion processing means for predicting a discontinuity at a transition portion between concatenated samples of phonemes used for speech synthesis through a predetermined learning process, and for controlling speech synthesis so that a discontinuity at the transition portion between the concatenated phonemes of the synthesized speech is smoothed adaptively to correspond to a degree of the predicted discontinuity determined according to a result of the predetermined learning process. - View Dependent Claims (2)
3. A speech synthesis system comprising:
a smoothing filter for smoothing a discontinuity that occurs at a transition portion between concatenated phonemes of synthesized speech employing a filter coefficient α; a filter characteristics controller for comparing a degree of a real discontinuity at the transition portion between the concatenated phonemes of the synthesized speech with a degree of a discontinuity predicted according to a result obtained from a predetermined learning process using phoneme samples employed for speech synthesis, and outputting the comparison result as a coefficient selecting signal R; and filter coefficient determining means for determining the filter coefficient α in response to the coefficient selecting signal R so as to allow the smoothing filter to smooth discontinuous distortion at the transition portion between the concatenated phonemes of the synthesized speech according to the degree of the predicted discontinuity. - View Dependent Claims (4, 5, 6, 7)
8. A speech synthesis method for controlling a discontinuous distortion that occurs at a transition portion between concatenated phonemes of synthesized speech using a smoothing technique, comprising the steps of:
(a) comparing a degree of a real discontinuity at the transition portion between the concatenated phonemes of the synthesized speech with a degree of a discontinuity predicted according to a result obtained from a predetermined learning process using concatenated samples of phonemes employed for speech synthesis; (b) determining a filter coefficient corresponding to the compared result from the step (a) so as to smooth the discontinuity at the transition portion between the concatenated phonemes of the synthesized speech according to the degree of the predicted discontinuity; and (c) smoothing a discontinuity at the transition portion between the concatenated phonemes of the synthesized speech to correspond to the determined filter coefficient. - View Dependent Claims (9)
10. A smoothing filter characteristics control device for adaptively changing, according to the characteristics of a transition portion between concatenated phonemes, which are speech units of synthesized speech, the characteristics of a smoothing filter used in a speech synthesis system for controlling a discontinuous distortion that occurs at the transition portion, the device comprising:
discontinuity measuring means which obtains a degree of a discontinuity at the transition portion between the concatenated phonemes of the synthesized speech as a real discontinuity degree and outputs the obtained real discontinuity degree; discontinuity predicting means which stores a result of a learning process predicting discontinuity at a transition portion between concatenated phonemes in actually spoken sounds using samples of phonemes, predicts a degree of a discontinuity at a transition portion between input concatenated samples of phonemes employed for speech synthesis of the synthesized speech according to the result of the learning, and outputs the degree of the predicted discontinuity; and a comparator which compares the predicted discontinuity degree Dp applied thereto from the discontinuity predicting means with the real discontinuity degree Dr applied thereto from the discontinuity measuring means, and generates the compared result as a coefficient selecting signal for determining a filter coefficient of the smoothing filter. - View Dependent Claims (11, 12, 13, 14, 15)
16. A smoothing filter characteristics control method for adaptively changing, according to characteristics of a transition portion between concatenated phonemes, which are speech units of synthesized speech, characteristics of a smoothing filter used in a speech synthesis system for controlling a discontinuous distortion that occurs at the transition portion, the method comprising the steps of:
(a) storing a result of a learning process predicting a discontinuity at a transition portion between concatenated phonemes in actually spoken sounds using samples of phonemes; (b) obtaining a real degree of the discontinuity at the transition portion between the concatenated phonemes of the synthesized speech and outputting the obtained real discontinuity degree; (c) predicting a degree of a discontinuity at a transition portion between input concatenated samples of phonemes employed for speech synthesis of the synthesized speech according to the result of the learning and outputting the predicted discontinuity degree; and (d) determining a filter coefficient of the smoothing filter according to the predicted discontinuity degree and the real discontinuity degree. - View Dependent Claims (17, 18)
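Steps (a) through (d) of the control method can be sketched as follows. This is an illustration under stated assumptions, not the learning process or discontinuity measure of the specification: the learned result is modeled as a table of mean discontinuities per phoneme pair, the discontinuity degree is taken as the Euclidean distance between the feature vectors on either side of the boundary, and the coefficient is the ratio Dp/Dr clipped to 1. All function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def measure_discontinuity(left_frame, right_frame):
    """Step (b): real discontinuity degree Dr at a boundary.

    Assumed measure: Euclidean distance between the feature vectors of the
    last frame of the left unit and the first frame of the right unit.
    """
    return sqrt(sum((a - b) ** 2 for a, b in zip(left_frame, right_frame)))


def learn_table(natural_samples):
    """Step (a): store the result of learning from actually spoken sounds.

    natural_samples: iterable of ((left_phoneme, right_phoneme),
    left_frame, right_frame) taken from real speech. The stored result is
    the mean discontinuity per phoneme pair (an assumed, simple model).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pair, left_frame, right_frame in natural_samples:
        sums[pair] += measure_discontinuity(left_frame, right_frame)
        counts[pair] += 1
    return {pair: sums[pair] / counts[pair] for pair in sums}


def predict_discontinuity(table, pair, default=0.0):
    """Step (c): predicted discontinuity degree Dp for a phoneme pair."""
    return table.get(pair, default)


def determine_coefficient(dp, dr):
    """Step (d): filter coefficient from Dp and Dr.

    Assumed rule: alpha = Dp / Dr, clipped to 1 so the filter only smooths.
    A boundary with no real discontinuity needs no smoothing (alpha = 1).
    """
    if dr <= 0.0:
        return 1.0
    return min(1.0, dp / dr)


if __name__ == "__main__":
    natural = [
        (("a", "t"), [0.0, 0.0], [3.0, 4.0]),  # distance 5.0
        (("a", "t"), [0.0, 0.0], [0.0, 1.0]),  # distance 1.0
    ]
    table = learn_table(natural)                      # step (a)
    dr = measure_discontinuity([0.0, 0.0], [6.0, 8.0])  # step (b)
    dp = predict_discontinuity(table, ("a", "t"))     # step (c)
    print(dp, dr, determine_coefficient(dp, dr))      # step (d)
```

A table lookup is used here only because it is easy to follow; any predictor trained on natural speech could take the place of `learn_table` and `predict_discontinuity` without changing steps (b) and (d).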
Specification