Apparatus and method for creating pitch wave signals, apparatus and method for compressing, expanding, and synthesizing speech signals using these pitch wave signals and text-to-speech conversion using unit pitch wave signals

  • US 7,647,226 B2
  • Filed: 03/09/2007
  • Issued: 01/12/2010
  • Est. Priority Date: 08/31/2001
  • Status: Active Grant
First Claim

1. A speech synthesizing apparatus, the apparatus comprising:

    division means for dividing an input speech signal into a plurality of unit speech samples;

    signal creating means for creating a pitch wave signal from each of the unit speech samples, the pitch wave signal comprising a plurality of normalized pitch wave elements which have a substantially identical time length and uniform phase, wherein the pitch wave signal is created in such a way that a pitch signal representing pitch periods in the unit speech sample is generated and the phase of a speech wave in each pitch period is shifted so as to maximize the correlation between the speech wave in the pitch period and the pitch signal and that the phase shifted speech wave in each pitch period is resampled with the same number of samples to make uniform the time length of the speech wave in each pitch period to the same time length;

    storage means for storing rhythm information representing the rhythm of each unit speech sample, pitch information representing the pitch of the sample, the spectrum information showing variation with time in the fundamental frequency component and harmonic wave component of the pitch wave signal in such a manner that each of the rhythm information, the pitch information and the spectrum information corresponds to the sample;

    prediction means for inputting text information representing a text, and creating prediction information representing the result of predicting the pitch and spectrum of a unit speech constituting the text based on the text information;

    retrieval means for identifying a sample having a pitch and spectrum having the highest correlation with the pitch and spectrum of the unit speech constituting the text based on the pitch information, spectrum information and prediction information; and

    signal synthesizing means for creating a synthesized speech signal representing a speech in which the speech has a rhythm represented by the rhythm information brought into correspondence with the sample identified by the retrieval means, the variation with time in the fundamental frequency component and harmonic wave component is represented by the spectrum information brought into correspondence with the sample identified by the retrieval means, and the time length of one pitch period is a time length represented by the pitch information brought into correspondence with the sample identified by the retrieval means.
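
The signal creating means recited above can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the pitch period boundaries are already known, uses a single sine cycle as a stand-in for the "pitch signal", finds the phase shift by brute-force circular correlation, and resamples by linear interpolation. The function names `best_circular_shift`, `resample_linear`, and `normalize_pitch_periods` are hypothetical.

```python
import math

def best_circular_shift(period, reference):
    """Return the circular shift of `period` that maximizes its
    correlation (dot product) with `reference` of the same length."""
    n = len(period)
    best_shift, best_score = 0, float("-inf")
    for shift in range(n):
        score = sum(period[(i + shift) % n] * reference[i] for i in range(n))
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift

def resample_linear(period, num_samples):
    """Resample one pitch period to `num_samples` points by linear
    interpolation, treating the period as cyclic."""
    n = len(period)
    out = []
    for k in range(num_samples):
        pos = k * n / num_samples
        i = int(pos)
        frac = pos - i
        a = period[i % n]
        b = period[(i + 1) % n]
        out.append(a + frac * (b - a))
    return out

def normalize_pitch_periods(signal, boundaries, num_samples):
    """Turn each pitch period of `signal` into a phase-aligned element
    of exactly `num_samples` samples.

    `boundaries` lists the start index of each period plus the final end
    index (assumed to be supplied by a separate pitch detector).
    """
    elements = []
    for start, end in zip(boundaries, boundaries[1:]):
        period = signal[start:end]
        n = len(period)
        # Assumption: one sine cycle per period stands in for the pitch signal.
        reference = [math.sin(2 * math.pi * i / n) for i in range(n)]
        shift = best_circular_shift(period, reference)
        aligned = [period[(i + shift) % n] for i in range(n)]
        elements.append(resample_linear(aligned, num_samples))
    return elements
```

Under these assumptions, two periods of different lengths and different starting phases come out as elements of equal length that begin at the same point of the cycle, which corresponds to the "substantially identical time length and uniform phase" wording of the claim.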

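The retrieval means can be sketched in the same hedged way. The claim only requires identifying the stored sample whose pitch and spectrum have the highest correlation with the prediction. The sketch below assumes that pitch and spectrum are stored as equal-length numeric sequences, measures correlation with the Pearson coefficient, and combines the two scores by simple addition. The dictionary layout and the names `pearson` and `retrieve_best_sample` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.
    Returns 0.0 when either sequence is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def retrieve_best_sample(database, predicted_pitch, predicted_spectrum):
    """Return the database entry whose stored pitch and spectrum correlate
    best with the predicted ones.

    Assumption: the two correlations are combined by summing them; the
    claim does not say how pitch and spectrum scores are weighed.
    """
    def score(entry):
        return (pearson(entry["pitch"], predicted_pitch)
                + pearson(entry["spectrum"], predicted_spectrum))
    return max(database, key=score)

# Usage with made-up values: entry "a" rises like the prediction, "b" falls.
database = [
    {"id": "a", "pitch": [100, 110, 120], "spectrum": [1, 2, 3, 4]},
    {"id": "b", "pitch": [120, 110, 100], "spectrum": [4, 3, 2, 1]},
]
best = retrieve_best_sample(database, [101, 111, 119], [1.1, 2.0, 2.9, 4.2])
```

The rhythm information stored with the selected entry would then be passed on to the signal synthesizing means; that step is not shown here.
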
  • 3 Assignments