METHOD FOR FORMING THE EXCITATION SIGNAL FOR A GLOTTAL PULSE MODEL BASED PARAMETRIC SPEECH SYNTHESIS SYSTEM

  • US 20160027430A1
  • Filed: 10/06/2015
  • Published: 01/28/2016
  • Est. Priority Date: 05/28/2014
  • Status: Active Grant
First Claim

1. A method for creating parametric models for use in training a speech synthesis system, wherein the system comprises at least a training text corpus, a speech database, and a model training module, the method comprising:

    a. obtaining, by the model training module, speech data for the training text corpus, wherein the speech data comprises recorded speech signals and corresponding transcriptions;

    b. converting, by the model training module, the training text corpus into context dependent phone labels;

    c. extracting, by the model training module, for each frame of speech in the speech signal from the speech training database, at least one of: spectral features, a plurality of band excitation energy coefficients, and fundamental frequency values;

    d. forming, by the model training module, a feature vector stream for each frame of speech using the at least one of: spectral features, a plurality of band excitation energy coefficients, and fundamental frequency values;

    e. labeling speech with context dependent phones;

    f. extracting durations of each context dependent phone from the labelled speech;

    g. performing parameter estimation of the speech signal, wherein the parameter estimation is performed comprising the features, HMM, and decision trees; and

    h. identifying a plurality of sub-band Eigen glottal pulses, wherein the sub-band Eigen glottal pulses comprise separate models used to form excitation during synthesis.
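
Steps d and f of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the function names `form_feature_stream` and `phone_durations`, the list-based data layout, the measurement of durations in frames, and the example label strings are all assumptions made for illustration. The claim does not specify how features are combined, so plain concatenation is assumed here.

```python
from collections import defaultdict


def form_feature_stream(spectral, band_energies, f0):
    """Step d (assumed form): concatenate per-frame features into one vector per frame.

    spectral      -- list of per-frame spectral feature lists
    band_energies -- list of per-frame band excitation energy coefficient lists
    f0            -- list of per-frame fundamental frequency values
    """
    if not (len(spectral) == len(band_energies) == len(f0)):
        raise ValueError("all feature tracks must cover the same number of frames")
    return [list(s) + list(b) + [f] for s, b, f in zip(spectral, band_energies, f0)]


def phone_durations(labels):
    """Step f (assumed form): collect durations for each context dependent phone.

    labels -- iterable of (start_frame, end_frame, phone_label) tuples
    Returns a dict mapping each phone label to a list of its durations in frames.
    """
    durations = defaultdict(list)
    for start, end, label in labels:
        if end < start:
            raise ValueError("segment ends before it starts: %r" % (label,))
        durations[label].append(end - start)
    return dict(durations)


# Two frames, each with 2 spectral features, 1 band energy coefficient, and 1 F0 value.
stream = form_feature_stream([[1.0, 2.0], [3.0, 4.0]], [[0.5], [0.6]], [120.0, 0.0])

# Three labelled segments; the label strings are hypothetical.
durs = phone_durations([(0, 5, "sil-h+e"), (5, 8, "h-e+l"), (8, 10, "sil-h+e")])
```

The per-phone durations and per-frame feature vectors produced this way correspond to the inputs that the parameter estimation in step g would consume.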
