System and method for speech synthesis using a smoothing filter
Abstract
Disclosed is a speech synthesis system and method using a smoothing filter. The speech synthesis system uses a smoothing technique to control the discontinuous distortion that occurs at the transition portion between concatenated phonemes, which are the speech units of synthesized speech. The system comprises a discontinuous distortion processing means adapted to predict, through a predetermined learning process, the discontinuity occurring at the transition portion between concatenated samples of phonemes used for speech synthesis, and to control the discontinuity occurring at the transition portion between the concatenated phonemes of the synthesized speech so that it is smoothed adaptively to correspond to the degree of the predicted discontinuity. The smoothing filter smooths the synthesized speech so that its discontinuity degree follows the predicted discontinuity degree, according to a filter coefficient (α) that changes adaptively with the ratio of the predicted discontinuity degree to the real discontinuity degree. Because the discontinuity at a transition portion between concatenated phonemes of the synthesized speech (IN) is adaptively smoothed to follow the discontinuity that occurs in actually spoken sound, the synthesized speech (IN) can more closely approximate a real human voice.
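One possible reading of the ratio-driven coefficient can be written out as follows. The symbols D_p (predicted discontinuity degree), D_r (real discontinuity degree), and R (their ratio) follow the names used in the claims. The clamping rule for α, the symbol D_s for the discontinuity after smoothing, and the assumption that the filter scales the boundary discontinuity linearly by α are illustrative assumptions, not details stated in this text.

```latex
R = \frac{D_p}{D_r} \quad (D_r > 0), \qquad
\alpha = \min(1, R) \quad \text{(assumed rule)}, \qquad
D_s = \alpha \, D_r = \min(D_r, D_p) \quad \text{(assumed linear scaling)}
```

Under these assumptions, a boundary whose real discontinuity exceeds the predicted one is reduced to the predicted degree, while a boundary that is already at or below the predicted degree is left unchanged.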
18 Claims
1. A speech synthesis system for controlling a discontinuous distortion occurred at the transition portion between concatenated phonemes which are speech units of a synthesized speech using a smoothing technique, comprising:
a discontinuous distortion processing means for predicting a discontinuity occurred at the transition portion between concatenated samples of phonemes used for a speech synthesis through a predetermined learning process, and controlling so that a discontinuity occurred at the transition portion between the concatenated samples of phonemes of the synthesized speech is smoothed adaptively to correspond to a degree of the predicted discontinuity.
3. A speech synthesis system comprising:
a smoothing filter for smoothing the discontinuity occurred at the transition portion between concatenated phonemes of the synthesized speech to correspond to a filter coefficient α;
a filter characteristics controller for comparing a degree of a real discontinuity occurred at the transition portion between the concatenated phonemes of the synthesized speech with a degree of a discontinuity predicted according to the result obtained from a predetermined learning process using the phoneme samples employed for speech synthesis, and outputting the compared result as a coefficient selecting signal R; and
filter coefficient determining means for determining the filter coefficient in response to the coefficient selecting signal so as to allow the smoothing filter to smooth the discontinuous distortion occurred at the transition portion between the concatenated phonemes of the synthesized speech according to the degree of the predicted discontinuity.
8. A speech synthesis method for controlling a discontinuous distortion occurred at the transition portion between concatenated phonemes of a synthesized speech using a smoothing technique, comprising the steps of:
(a) comparing a degree of a real discontinuity occurred at the transition portion between the concatenated phonemes of the synthesized speech with a degree of a discontinuity predicted according to the result obtained from a predetermined learning process using concatenated samples of phonemes employed for speech synthesis;
(b) determining a filter coefficient corresponding to the compared result from the step (a) so as to smooth the discontinuity occurred at the transition portion between the concatenated phonemes of the synthesized speech according to the degree of the predicted discontinuity; and
(c) smoothing a discontinuity occurred at the transition portion between the concatenated phonemes of the synthesized speech to correspond to the determined filter coefficient.
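Steps (a) through (c) of the method claim can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the claim does not say how the discontinuity degree is measured or how the filter is built. Here the real discontinuity is taken as the amplitude jump between the last sample of the left unit and the first sample of the right unit, the coefficient is the predicted-to-real ratio clamped to 1, and the smoothing step spreads the excess jump over a few samples on each side. The function names `real_discontinuity`, `select_coefficient`, and `smooth_boundary`, and the `width` parameter, are hypothetical.

```python
def real_discontinuity(left, right):
    """Assumed measure: absolute amplitude jump across the unit boundary."""
    return abs(right[0] - left[-1])


def select_coefficient(predicted, real):
    """Steps (a)/(b): compare predicted and real degrees, return alpha.

    Assumed rule: alpha = min(1, predicted / real), so a boundary that is
    already as smooth as predicted is left alone (alpha = 1).
    """
    if real <= 0.0:
        return 1.0
    return min(1.0, predicted / real)


def smooth_boundary(left, right, alpha, width=4):
    """Step (c): shrink the boundary jump to alpha times its original size.

    The removed part of the jump is split evenly between the two units and
    tapered linearly over up to `width` samples away from the boundary.
    """
    jump = right[0] - left[-1]
    excess = (1.0 - alpha) * jump  # portion of the jump to remove
    left_out, right_out = list(left), list(right)
    n_left = min(width, len(left))
    n_right = min(width, len(right))
    for k in range(n_left):
        weight = (n_left - k) / n_left  # 1 at the boundary, tapering off
        left_out[-1 - k] += 0.5 * excess * weight
    for k in range(n_right):
        weight = (n_right - k) / n_right
        right_out[k] -= 0.5 * excess * weight
    return left_out, right_out


# Usage with made-up sample values and a made-up predicted degree.
left_unit = [0.0, 0.1, 0.2, 0.3]
right_unit = [0.9, 0.8, 0.7, 0.6]
measured = real_discontinuity(left_unit, right_unit)
alpha = select_coefficient(predicted=0.3, real=measured)
smoothed_left, smoothed_right = smooth_boundary(left_unit, right_unit, alpha)
```

With this construction the jump remaining after smoothing equals alpha times the original jump, so a boundary that is rougher than predicted ends up at the predicted degree rather than being flattened completely.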
10. A smoothing filter characteristics control device for adaptively changing, according to the characteristics of a transition portion between concatenated phonemes which are speech units of a synthesized speech, the characteristics of a smoothing filter used in a speech synthesis system for controlling a discontinuous distortion occurred at the transition portion, the device comprising:
discontinuity measuring means which obtains a degree of a discontinuity occurred at the transition portion between the concatenated phonemes of the synthesized speech as a real discontinuity degree and outputs the obtained real discontinuity degree;
discontinuity predicting means which stores a result of learning of discontinuity prediction occurred at a transition portion between concatenated phonemes in an actually spoken sound therein and predicts a degree of a discontinuity occurred at the transition portion between the input concatenated samples of phonemes in response to the result of the learning when the concatenated samples of phonemes employed for speech synthesis of the synthesized speech are input, and outputs the degree of the predicted discontinuity; and
a comparator which compares the predicted discontinuity degree Dp applied thereto from the discontinuity predicting means with the real discontinuity degree Dr applied thereto from the discontinuity measuring means, and generates the compared result as a coefficient selecting signal for determining a filter coefficient of the smoothing filter.
16. A smoothing filter characteristics control method for adaptively changing, according to the characteristics of a transition portion between concatenated phonemes which are speech units of a synthesized speech, the characteristics of a smoothing filter used in a speech synthesis system for controlling a discontinuous distortion occurred at the transition portion, the method comprising the steps of:
(a) learning prediction of a discontinuity occurred at a transition portion between concatenated phonemes in an actually spoken sound using samples of phonemes;
(b) obtaining, as a real discontinuity degree, a degree of the discontinuity occurred at the transition portion between the concatenated phonemes of the synthesized speech to output the obtained real discontinuity degree;
(c) obtaining the degree of the predicted discontinuity by predicting a degree of a discontinuity occurred at the transition portion between the concatenated samples of phonemes employed for speech synthesis of the synthesized speech according to the result of the learning; and
(d) determining a filter coefficient of the smoothing filter according to the predicted discontinuity degree and the real discontinuity degree.
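The learning and prediction steps (a), (c), and (d) of the control method can be sketched as below. The claims only call for a "predetermined learning process", so the learner here is a deliberately simple stand-in: it averages the discontinuity degrees observed in actually spoken sound for each phoneme pair. The function names, the tuple layout of the training data, the `fallback` parameter, and the clamped-ratio coefficient rule are all hypothetical choices for illustration. The real discontinuity degree of step (b) is assumed to be measured elsewhere and passed in as a number.

```python
from collections import defaultdict
from statistics import mean


def learn_discontinuity_model(spoken_examples):
    """Step (a): learn a per-phoneme-pair average from real speech.

    `spoken_examples` is an iterable of (left_phoneme, right_phoneme, degree)
    tuples; this layout is an assumption made for the sketch.
    """
    grouped = defaultdict(list)
    for left_phoneme, right_phoneme, degree in spoken_examples:
        grouped[(left_phoneme, right_phoneme)].append(degree)
    return {pair: mean(values) for pair, values in grouped.items()}


def predict_discontinuity(model, left_phoneme, right_phoneme, fallback):
    """Step (c): look up the learned degree, or use `fallback` if unseen."""
    return model.get((left_phoneme, right_phoneme), fallback)


def determine_coefficient(predicted, real):
    """Step (d): assumed rule, the predicted-to-real ratio clamped to 1."""
    if real <= 0.0:
        return 1.0
    return min(1.0, predicted / real)


# Usage with made-up training observations and a made-up measured degree.
spoken = [("a", "n", 0.2), ("a", "n", 0.4), ("s", "i", 0.8)]
model = learn_discontinuity_model(spoken)
predicted = predict_discontinuity(model, "a", "n", fallback=0.5)
alpha = determine_coefficient(predicted, real=0.6)
```

A lookup table keeps the sketch short, but it cannot generalize to unseen phoneme pairs except through the fallback value; a learner that uses phonetic context features would replace `learn_discontinuity_model` and `predict_discontinuity` without changing step (d).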
Specification