Voice quality conversion device and voice quality conversion method for converting voice quality of an input speech using target vocal tract information and received vocal tract information corresponding to the input speech

  • US 8,898,055 B2
  • Filed: 05/08/2008
  • Issued: 11/25/2014
  • Est. Priority Date: 05/14/2007
  • Status: Expired due to Fees
First Claim
1. A voice quality conversion device that converts voice quality of an input speech using information corresponding to the input speech, said voice quality conversion device comprising:

  • a target vowel vocal tract information hold unit configured to hold target vowel vocal tract information of each vowel, the target vowel vocal tract information indicating target voice quality;

  • a vowel conversion unit configured to
      (i) receive vocal tract information with phoneme boundary information which is vocal tract information that corresponds to the input speech and that is added with information of (1) a phoneme in the input speech and (2) a duration of the phoneme,
      (ii) approximate, as a first polynomial expression, a temporal change of received vocal tract information of a vowel included in the received vocal tract information with phoneme boundary information,
      (iii) approximate, as a second polynomial expression, a temporal change of target vocal tract information of the vowel, the target vocal tract information being included in the target vowel vocal tract information held in said target vowel vocal tract information hold unit,
      (iv) approximate, as a third polynomial expression, interpolated vocal tract information of the vowel by combining (1) the first polynomial expression approximating the temporal change of the received vocal tract information of the vowel with (2) the second polynomial expression approximating the temporal change of the target vocal tract information of the vowel, and
      (v) convert the received vocal tract information of the vowel using the third polynomial expression approximating the interpolated vocal tract information of the vowel; and

  • a synthesis unit configured to synthesize a speech using the converted vocal tract information of the vowel converted by said vowel conversion unit,

    wherein (i) the first polynomial expression approximates a change in the received vocal tract information of the vowel over time, (ii) the second polynomial expression approximates a change in the target vocal tract information of the vowel over time, and (iii) the third polynomial expression approximates a change in the interpolated vocal tract information of the vowel over time,

    wherein the first polynomial expression approximating the temporal change of the received vocal tract information of the vowel and the second polynomial expression approximating the temporal change of the target vocal tract information of the vowel have a same time period that overlaps over the entire time period of the vowel, and

    wherein said vowel conversion unit is configured to generate the third polynomial expression by adding the first polynomial expression with the second polynomial expression based on a predetermined conversion ratio.
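
The vowel conversion steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`polyfit`, `polyval`, `blend`, `convert_vowel`), the polynomial degree, and the use of normalized time are hypothetical choices made for illustration. In particular, reading "adding the first polynomial expression with the second polynomial expression based on a predetermined conversion ratio" as a coefficient-wise weighted sum `(1 - ratio) * first + ratio * second` is an assumption. Each input is taken to be one vocal tract parameter sampled per frame over a single vowel.

```python
def polyfit(ts, ys, degree):
    """Least-squares fit; returns coefficients c[0] + c[1]*t + c[2]*t**2 + ..."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(t**(i+j)), b[i] = sum(y * t**i).
    a = [[sum(t ** (i + j) for t in ts) for j in range(n)] for i in range(n)]
    b = [sum(y * t ** i for t, y in zip(ts, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def polyval(coeffs, t):
    """Evaluate the polynomial with ascending-order coefficients at time t."""
    return sum(c * t ** i for i, c in enumerate(coeffs))


def blend(first, second, ratio):
    """Assumed form of the claimed addition: coefficient-wise weighted sum."""
    return [(1.0 - ratio) * f + ratio * s for f, s in zip(first, second)]


def normalized_times(count):
    """Map frame indices onto [0, 1] so both fits span the same time period."""
    return [i / (count - 1) for i in range(count)]


def convert_vowel(received, target, ratio, degree=2):
    """Convert one vowel's parameter track toward the target voice quality.

    received, target: per-frame values (each needs at least degree + 1 frames).
    ratio: 0.0 keeps the received trajectory, 1.0 yields the target trajectory.
    """
    first = polyfit(normalized_times(len(received)), received, degree)
    second = polyfit(normalized_times(len(target)), target, degree)
    third = blend(first, second, ratio)
    # Resample the interpolated polynomial at the received frame positions.
    return [polyval(third, t) for t in normalized_times(len(received))]
```

Calling `convert_vowel(received, target, 0.5)` would return a track of the same length as `received`, lying halfway between the two fitted trajectories. Fitting both polynomials over normalized time is one way to give them the same time period across the whole vowel even when the received and target vowels differ in duration.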