System and method for speech synthesis using a smoothing filter

US 20030083878A1
Filed: 10/31/2002
Published: 05/01/2003
Est. Priority Date: 10/31/2001
Status: Active Grant



Abstract

Disclosed is a speech synthesis system and method using a smoothing filter. The speech synthesis system uses a smoothing technique to control the discontinuous distortion that occurs at the transition portion between concatenated phonemes, which are the speech units of a synthesized speech. It comprises a discontinuous distortion processing means adapted to predict, through a predetermined learning process, the discontinuity occurring at the transition portion between concatenated samples of phonemes used for speech synthesis, and to control the discontinuity at the transition portion between the concatenated phonemes of the synthesized speech so that it is smoothed adaptively to correspond to the degree of the predicted discontinuity. The smoothing filter smooths the synthesized speech so that its discontinuity degree follows the predicted discontinuity degree, according to a filter coefficient (α) that changes adaptively with the ratio of the predicted discontinuity degree to the real discontinuity degree. Because the discontinuity at a transition portion between concatenated phonemes of the synthesized speech (IN) is adaptively smoothed to follow the discontinuity that occurs in actually spoken sound, the synthesized speech (IN) more closely approximates a real human voice.

18 Claims

1. A speech synthesis system for controlling a discontinuous distortion occurred at the transition portion between concatenated phonemes which are speech units of a synthesized speech using a smoothing technique, comprising:
- a discontinuous distortion processing means for predicting a discontinuity occurred at the transition portion between concatenated samples of phonemes used for a speech synthesis through a predetermined learning process, and controlling so that a discontinuity occurred at the transition portion between the concatenated samples of phonemes of the synthesized speech is smoothed adaptively to correspond to a degree of the predicted discontinuity.
- Dependent claims (2)
- - 2. The speech synthesis system as claimed in claim 1, wherein the predetermined learning process is performed by CART (Classification and Regression Tree) scheme.

3. A speech synthesis system comprising:
- a smoothing filter for smoothing the discontinuity occurred at the transition portion between concatenated phonemes of the synthesized speech to correspond to a filter coefficient α;
  
  a filter characteristics controller for comparing a degree of a real discontinuity occurred at the transition portion between the concatenated phonemes of the synthesized speech with a degree of a discontinuity predicted according to the result obtained from a predetermined learning process using the phoneme samples employed for speech synthesis, and outputting the compared result as a coefficient selecting signal R; and
  
  filter coefficient determining means for determining the filter coefficient in response to the coefficient selecting signal so as to allow the smoothing filter to smooth the discontinuous distortion occurred at the transition portion between the concatenated phonemes of the synthesized speech according to the degree of the predicted discontinuity.
- Dependent claims (4, 5, 6, 7)
- - 4. The speech synthesis system as claimed in claim 3, wherein the predetermined learning process is performed by CART (Classification and Regression Tree) scheme.
  - 5. The speech synthesis system as claimed in claim 4, wherein the phoneme samples used for the prediction of the discontinuity comprises quadraphones (four phonemes) consisting of two phonemes before a transition portion between concatenated phonemes in which to predict a discontinuity and two phonemes after the transition portion.
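  - 6. The speech synthesis system as claimed in claim 3, wherein the coefficient selecting signal R is obtained by the following formula:
    - $R = D_p / D_r$, where $D_p$ is the predicted discontinuity degree and $D_r$ is the real discontinuity degree.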
  - 7. The speech synthesis system as claimed in claim 3, wherein the filter coefficient determining means determines the filter coefficient α by the following formula in response to the coefficient selecting signal R:
    - $α = \frac{1}{2} (\sqrt{R} + 1)$
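
Claim 5 describes the prediction input as a quadraphone: the two phonemes before a transition portion and the two after it. The sketch below shows one way such a context could be extracted from a phoneme sequence. The function name `quadraphone_at`, the list-of-strings phoneme representation, and the `"#"` padding symbol are illustrative assumptions and do not come from the claims; a trained CART model would take a tuple like this as its input features.

```python
from typing import List, Tuple

PAD = "#"  # hypothetical padding symbol for positions outside the utterance


def quadraphone_at(phonemes: List[str], boundary: int) -> Tuple[str, str, str, str]:
    """Return the two phonemes before and the two after a transition.

    `boundary` is the index of the first phoneme to the right of the
    transition, so the transition sits between boundary-1 and boundary.
    """
    if not 1 <= boundary <= len(phonemes) - 1:
        raise ValueError("boundary must lie between two phonemes")

    def pick(i: int) -> str:
        # Pad when the context window runs past either end of the sequence.
        return phonemes[i] if 0 <= i < len(phonemes) else PAD

    return (pick(boundary - 2), pick(boundary - 1), pick(boundary), pick(boundary + 1))


if __name__ == "__main__":
    seq = ["h", "e", "l", "o"]
    print(quadraphone_at(seq, 2))  # context around the e|l transition
    print(quadraphone_at(seq, 1))  # left context needs one padding symbol
```

Padding is only one possible choice for transitions near the start or end of an utterance; a system could equally skip those transitions.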

8. A speech synthesis method for controlling a discontinuous distortion occurred at the transition portion between concatenated phonemes of a synthesized speech using a smoothing technique, comprising the steps of:
- (a) comparing a degree of a real discontinuity occurred at the transition portion between the concatenated phonemes of the synthesized speech with a degree of a discontinuity predicted according to the result obtained from a predetermined learning process using concatenated samples of phonemes employed for speech synthesis;
  
  (b) determining a filter coefficient corresponding to the compared result from the step (a) so as to smooth the discontinuity occurred at the transition portion between the concatenated phonemes of the synthesized speech according to the degree of the predicted discontinuity; and
  
  (c) smoothing a discontinuity occurred at the transition portion between the concatenated phonemes of the synthesized speech to correspond to the determined filter coefficient.
- Dependent claims (9)
- - 9. A recording medium for recording the speech synthesis method as claimed in claim 8 by using a program code executable in a computer.
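
Steps (a) to (c) of claim 8 can be sketched as a single function. This is a minimal sketch under stated assumptions, not the patented implementation: the discontinuity degree is taken as the squared distance between the two boundary pitch cycles, the compared result is taken as the ratio of predicted to real degree (as the abstract describes), the coefficient follows the formula given in claim 17, and the smoothing itself is an assumed linear cross-blend of the two pitch cycles. The claims do not state the filter's form, and the names `discontinuity` and `smooth_boundary` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def discontinuity(left: Sequence[float], right: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length pitch cycles."""
    if len(left) != len(right):
        raise ValueError("pitch cycles must have equal length")
    return sum((a - b) ** 2 for a, b in zip(left, right))


def smooth_boundary(
    w_p: Sequence[float], w_n: Sequence[float], predicted: float
) -> Tuple[List[float], List[float], float]:
    """Return the smoothed left cycle, smoothed right cycle, and alpha."""
    # Step (a): compare the real discontinuity with the predicted one.
    real = discontinuity(w_p, w_n)
    if real == 0.0:
        # Nothing to smooth; alpha = 1 leaves both cycles unchanged.
        return list(w_p), list(w_n), 1.0
    ratio = predicted / real

    # Step (b): determine the filter coefficient from the compared result.
    alpha = 0.5 * (math.sqrt(ratio) + 1.0)

    # Step (c): ASSUMED filter form, a symmetric cross-blend of the cycles.
    new_p = [alpha * a + (1.0 - alpha) * b for a, b in zip(w_p, w_n)]
    new_n = [alpha * b + (1.0 - alpha) * a for a, b in zip(w_p, w_n)]
    return new_p, new_n, alpha


if __name__ == "__main__":
    left, right, alpha = smooth_boundary([1.0, 2.0, 3.0], [0.0, 0.0, 1.0], 2.25)
    print(alpha, discontinuity(left, right))
```

With this assumed blend, the discontinuity measured after smoothing equals the predicted degree, which matches the behaviour the abstract describes.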

10. A smoothing filter characteristics control device for adaptively changing, according to the characteristics of a transition portion between concatenated phonemes which are speech units of a synthesized speech, the characteristics of a smoothing filter used in a speech synthesis system for controlling a discontinuous distortion occurred at the transition portion, the device comprising:
- discontinuity measuring means which obtains a degree of a discontinuity occurred at the transition portion between the concatenated phonemes of the synthesized speech as a real discontinuity degree and outputs the obtained real discontinuity degree;
  
  discontinuity predicting means which stores a result of learning of discontinuity prediction occurred at a transition portion between concatenated phonemes in an actually spoken sound therein and predicts a degree of a discontinuity occurred at the transition portion between the input concatenated samples of phonemes in response to the result of the learning when the concatenated samples of phonemes employed for speech synthesis of the synthesized speech are input, and outputs the degree of the predicted discontinuity; and
  
  a comparator which compares the predicted discontinuity degree Dp applied thereto from the discontinuity predicting means with the real discontinuity degree Dr applied thereto from the discontinuity measuring means, and generates the compared result as a coefficient selecting signal for determining a filter coefficient of the smoothing filter.
- Dependent claims (11, 12, 13, 14, 15)
- - 11. The smoothing filter characteristics control device as claimed in claim 10, wherein the learning in the discontinuity predicting means is performed by CART (Classification and Regression Tree) scheme.
  - 12. The smoothing filter characteristics control device as claimed in claim 11, wherein the phoneme samples used for the prediction of the discontinuity comprises quadraphones (four phonemes) consisting of two phonemes before a transition portion between concatenated phonemes in which to predict a discontinuity and two phonemes after the transition portion.
  - 13. The smoothing filter characteristics control device as claimed in claim 12, wherein the predicted discontinuity degree $D_p$ and the real discontinuity degree $D_r$ are obtained by the following formulas:
    - $D_r = \| W_p - W_n \|^2$
    - $D_p = \| W'_p - W'_n \|^2$
    - where $W_p$ is a speech waveform of the last pitch cycle of speech units arranged on the left side with respect to a transition portion between concatenated speech units in which to measure a degree of a discontinuity in the synthesized speech, $W_n$ is a speech waveform of the first pitch cycle of speech units arranged on the right side with respect to the transition portion in which to measure the discontinuity degree, $W'_p$ is a speech waveform of the last pitch cycle of speech units arranged on the left side with respect to a transition portion between concatenated speech units in which to predict a degree of a discontinuity in the actually spoken sound, and $W'_n$ is a speech waveform of the first pitch cycle of speech units arranged on the right side with respect to the transition portion in which to predict the discontinuity degree.
  - 14. The smoothing filter characteristics control device as claimed in claim 10, wherein the comparator generates a coefficient selecting signal R obtained by the following formula:
    - $R = D_p / D_r$
  - 15. The smoothing filter characteristics control device as claimed in claim 10, wherein the filter coefficient α is determined by the following formula in response to the coefficient selecting signal R:
    - $α = \frac{1}{2} (\sqrt{R} + 1)$
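
The three parts of the device in claim 10 (measuring means, predicting means, comparator) can be sketched as follows. This is an illustration only. `TablePredictor` is a hypothetical stand-in for the stored result of learning: it looks up a degree by quadraphone context instead of walking a trained CART tree. The comparator output is taken as the ratio of predicted to real degree, following the abstract, and the two pitch cycles are assumed to have already been brought to equal length.

```python
from typing import Dict, Sequence, Tuple

Quadraphone = Tuple[str, str, str, str]


def measure_discontinuity(w_p: Sequence[float], w_n: Sequence[float]) -> float:
    """Real discontinuity degree: squared norm of W_p - W_n."""
    if len(w_p) != len(w_n):
        raise ValueError("pitch cycles must have equal length")
    return sum((a - b) ** 2 for a, b in zip(w_p, w_n))


class TablePredictor:
    """Hypothetical predictor: learned degrees stored per quadraphone."""

    def __init__(self, learned: Dict[Quadraphone, float], default: float) -> None:
        self._learned = dict(learned)
        self._default = default  # used for contexts not seen during learning

    def predict(self, context: Quadraphone) -> float:
        return self._learned.get(context, self._default)


def compare(d_p: float, d_r: float) -> float:
    """Comparator: coefficient selecting signal as the ratio D_p / D_r."""
    if d_r <= 0.0:
        raise ValueError("real discontinuity degree must be positive")
    return d_p / d_r


if __name__ == "__main__":
    predictor = TablePredictor({("a", "b", "c", "d"): 0.5}, default=1.0)
    d_r = measure_discontinuity([0.5, -0.5, 1.0], [0.0, 0.5, 1.0])
    d_p = predictor.predict(("a", "b", "c", "d"))
    print(d_r, d_p, compare(d_p, d_r))
```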

16. A smoothing filter characteristics control method for adaptively changing, according to the characteristics of a transition portion between concatenated phonemes which are speech units of a synthesized speech, the characteristics of a smoothing filter used in a speech synthesis system for controlling a discontinuous distortion occurred at the transition portion, the method comprising the steps of:
- (a) learning prediction of a discontinuity occurred at a transition portion between concatenated phonemes in an actually spoken sound using samples of phonemes;
  
  (b) obtaining, as a real discontinuity degree, a degree of the discontinuity occurred at the transition portion between the concatenated phonemes of the synthesized speech to output the obtained real discontinuity degree;
  
  (c) obtaining the degree of the predicted discontinuity by predicting a degree of a discontinuity occurred at the transition portion between the concatenated samples of phonemes employed for speech synthesis of the synthesized speech according to the result of the learning; and
  
  (d) determining a filter coefficient of the smoothing filter according to the predicted discontinuity degree and the real discontinuity degree.
- Dependent claims (17, 18)
- - 17. A smoothing filter characteristics control method as claimed in claim 16 wherein the step (d) further comprises the steps of:
    - (d1) obtaining a ratio R of the predicted discontinuity degree to the real discontinuity degree; and
      
      (d2) determining the filter coefficient α by the following formula:
      
      $α = \frac{1}{2} (\sqrt{R} + 1)$.
  - 18. A recording medium for recording the smoothing filter characteristics control method as claimed in claim 16 by using a program code executable in a computer.
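
A short derivation suggests why the coefficient in claim 17 takes this form. It rests on an assumption that the claims do not state: that the smoothing filter cross-blends the two boundary pitch cycles linearly. Here $\tilde{W}_p$ and $\tilde{W}_n$ (notation introduced only for this sketch) denote the smoothed cycles, $D_r = \| W_p - W_n \|^2$ is the real discontinuity degree, $D_p$ is the predicted degree, and $R = D_p / D_r$.

```latex
\begin{align*}
\tilde{W}_p &= \alpha W_p + (1 - \alpha) W_n, \\
\tilde{W}_n &= \alpha W_n + (1 - \alpha) W_p, \\
\tilde{W}_p - \tilde{W}_n &= (2\alpha - 1)\,(W_p - W_n), \\
\| \tilde{W}_p - \tilde{W}_n \|^2 &= (2\alpha - 1)^2 D_r .
\end{align*}
% Substituting \alpha = \tfrac{1}{2}(\sqrt{R} + 1) gives 2\alpha - 1 = \sqrt{R}, so
\begin{equation*}
\| \tilde{W}_p - \tilde{W}_n \|^2 = R \, D_r = D_p .
\end{equation*}
```

Under that assumption, $R = 1$ gives $α = 1$ (no smoothing), and $R = 0$ gives $α = \frac{1}{2}$ (both cycles replaced by their average).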

Current Assignee: Samsung Electronics Co. Ltd.
Original Assignee: Samsung Electronics Co. Ltd.
Inventors: Lee, Ki-Seung; Lee, Jae-Won; Kim, Jeong-Su
Granted Patent: US 7,277,856 B2
US Class Current: 704/266
CPC Class Codes: G10L 13/07 (Concatenation rules)
