
Statistical enhancement of speech output from a statistical text-to-speech synthesis system

  • US 8,682,670 B2
  • Filed: 07/07/2011
  • Issued: 03/25/2014
  • Est. Priority Date: 07/07/2011
  • Status: Active Grant
First Claim

1. A method for enhancement of speech synthesized by a statistical text-to-speech (TTS) system employing a parametric representation of short-time spectral envelope of speech in a space of acoustic feature vectors, comprising:

  • defining a parametric family of corrective transformations operating in the space of the acoustic feature vectors and dependent on a set of enhancing parameters, wherein number of the enhancing parameters in the set of enhancing parameters is less than a dimension of the space of the acoustic feature vectors;

    defining a distortion indicator of a feature vector or a plurality of feature vectors, wherein the distortion indicator is not modelled directly by the statistical TTS system;

    receiving a feature vector output by the system;

    generating an instance of the corrective transformation by:

    calculating a reference value of the distortion indicator attributed to a statistical model of the phonetic unit emitting the feature vector;

    calculating an actual value of the distortion indicator attributed to feature vectors emitted by the statistical model of the phonetic unit emitting the feature vector;

    calculating the enhancing parameter values depending on the reference value of the distortion indicator, the actual value of the distortion indicator and the parametric corrective transformation;

    deriving an instance of the corrective transformation corresponding to the enhancing parameter values from the parametric family of the corrective transformations; and

    applying the instance of the corrective transformation to the feature vector to provide an enhanced feature vector.
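
The steps of the claim can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: the feature vectors are treated as cepstral coefficients, the distortion indicator is the energy of coefficients 1..N-1, the phonetic unit's statistical model is a diagonal Gaussian, and the parametric family is a single uniform gain `alpha` applied to coefficients 1..N-1 (one enhancing parameter, fewer than the feature dimension). All function names are hypothetical.

```python
import numpy as np

def actual_indicator(vectors):
    """Assumed distortion indicator: mean energy of coefficients 1..N-1
    over the feature vectors emitted by one phonetic-unit model."""
    v = np.atleast_2d(np.asarray(vectors, dtype=float))
    return float(np.mean(np.sum(v[:, 1:] ** 2, axis=1)))

def reference_indicator(mean, var):
    """Expected value of the same indicator under an assumed diagonal
    Gaussian model: E[c_n^2] = mu_n^2 + sigma_n^2, summed over n >= 1."""
    mean = np.asarray(mean, dtype=float)
    var = np.asarray(var, dtype=float)
    return float(np.sum(mean[1:] ** 2 + var[1:]))

def enhancing_parameter(reference, actual, eps=1e-12):
    """Solve for the single gain: the indicator scales as alpha^2,
    so alpha = sqrt(reference / actual)."""
    return float(np.sqrt(reference / max(actual, eps)))

def corrective_transform(alpha):
    """Derive one instance from the assumed one-parameter family:
    scale coefficients 1..N-1 by alpha and leave coefficient 0 unchanged."""
    def apply(features):
        out = np.array(features, dtype=float)
        out[..., 1:] *= alpha
        return out
    return apply

# Toy model statistics and emitted vectors (made-up numbers).
model_mean = [1.0, 0.6, -0.4, 0.2]
model_var = [0.1, 0.09, 0.04, 0.01]
emitted = np.array([[1.0, 0.5, -0.3, 0.1],
                    [1.0, 0.4, -0.3, 0.2]])

reference = reference_indicator(model_mean, model_var)
actual = actual_indicator(emitted)
alpha = enhancing_parameter(reference, actual)
enhanced = corrective_transform(alpha)(emitted)
```

In this toy setup the emitted vectors are smoother than the model statistics suggest, so the gain comes out greater than one and the enhanced vectors reproduce the reference value of the indicator. A real system would choose the indicator and transformation family to suit its own feature representation.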
