Interpolating between representative frame waveforms of a prediction error signal for speech synthesis
Abstract
A speech synthesis apparatus includes: a memory for storing a plurality of typical waveforms corresponding to a plurality of frames, each typical waveform previously obtained by extracting, in units of at least one frame, from a prediction error signal formed in predetermined units; a voiced speech source generator including an interpolation circuit for performing interpolation between the typical waveforms read out from the memory to obtain a plurality of interpolation signals, each having at least one of an interpolation pitch period and a signal level that changes smoothly between the corresponding frames, and a superposition circuit for superposing the interpolation signals obtained by the interpolation circuit to form a voiced speech source signal; an unvoiced speech source generator for generating an unvoiced speech source signal; and a vocal tract filter selectively driven by the voiced speech source signal output from the voiced speech source generator and the unvoiced speech source signal from the unvoiced speech source generator to generate synthetic speech. Further, interpolation positions can be determined based on the pitch period.
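The interpolation and superposition described in the abstract can be illustrated with a minimal sketch. It is not the patented circuit: it assumes linear cross-fading between two equal-length typical waveforms, a constant pitch period within the frame, and hypothetical function names (`interpolate_waveforms`, `voiced_source`) chosen only for this example.

```python
def interpolate_waveforms(w_a, w_b, alpha):
    """Blend two equal-length typical waveforms linearly.

    alpha = 0 returns w_a, alpha = 1 returns w_b (assumed linear interpolation).
    """
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(w_a, w_b)]


def voiced_source(w_a, w_b, frame_len, pitch_period):
    """Superpose interpolated waveforms at pitch-period intervals over one frame.

    The returned buffer is len(w_a) samples longer than the frame so that the
    tail of the last waveform is kept; a caller could add it to the next frame.
    """
    out = [0.0] * (frame_len + len(w_a))
    pos = 0
    while pos < frame_len:
        alpha = pos / frame_len  # relative position inside the frame
        pulse = interpolate_waveforms(w_a, w_b, alpha)
        for i, v in enumerate(pulse):
            out[pos + i] += v  # overlap-add: overlapping pulses are summed
        pos += pitch_period
    return out


if __name__ == "__main__":
    print(voiced_source([1.0, 0.0], [0.0, 1.0], frame_len=4, pitch_period=2))
```

When the pitch period is shorter than the waveform length, successive waveforms overlap and are summed, which is the role the abstract assigns to the superposition circuit.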
19 Claims
1. A speech synthesis apparatus comprising:
a memory for storing a plurality of typical waveforms corresponding to a plurality of frames, the typical waveforms each previously obtained by extracting in units of at least one frame from a prediction error signal formed in predetermined units;
a voiced speech source generator including an interpolation circuit for performing interpolation between the typical waveforms read out from said memory to obtain a plurality of interpolation signals each having at least one of an interpolation pitch period and a signal level which changes smoothly between the corresponding frames, and a superposing circuit for superposing the interpolation signals obtained by said interpolation circuit to form a voiced speech source signal;
an unvoiced speech source generator for generating an unvoiced speech source signal; and
a vocal tract filter selectively driven by the voiced speech source signal outputted from said voiced speech source generator and the unvoiced speech source signal from said unvoiced speech source generator to generate synthetic speech.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
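As an illustration of the final element of claim 1, the sketch below drives a filter with either a voiced or an unvoiced source, selected per frame. Several points are assumptions not stated in the claim: the vocal tract filter is modeled as an all-pole (LPC-style) synthesis filter, the unvoiced source is Gaussian white noise, filter state is not carried across frames, and the names `all_pole_filter`, `synthesize_frame`, and `noise_gain` are hypothetical.

```python
import random


def all_pole_filter(excitation, coeffs):
    """Assumed vocal tract model: y[n] = x[n] - sum_k coeffs[k-1] * y[n-k]."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def synthesize_frame(voiced, voiced_excitation, coeffs, frame_len,
                     noise_gain=0.1, rng=None):
    """Select the voiced or unvoiced source for one frame and filter it."""
    rng = rng or random.Random(0)
    if voiced:
        source = list(voiced_excitation[:frame_len])
    else:
        # Assumed unvoiced source: scaled Gaussian white noise.
        source = [noise_gain * rng.gauss(0.0, 1.0) for _ in range(frame_len)]
    return all_pole_filter(source, coeffs)


if __name__ == "__main__":
    print(synthesize_frame(True, [1.0, 0.0, 0.0], [-0.5], frame_len=3))
```

A fuller implementation would keep the filter's past outputs between frames so that frame boundaries do not introduce discontinuities; that is omitted here for brevity.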
10. A speech synthesis apparatus comprising:
a typical waveform storage for storing a plurality of typical waveforms each representative of individual frames of voiced speech source signals obtained by dividing a time-sequence signal into specific frame units, and for outputting a typical waveform selected according to waveform selection information given for each frame in accordance with a speech signal to be synthesized;
an interpolation position determining circuit for determining the interpolation positions extending over two consecutive frames on the basis of the pitch period given in accordance with the speech signal to be synthesized;
a waveform interpolation circuit for forming a plurality of voiced speech waveforms corresponding to the interpolation positions determined by said interpolation position determining circuit by performing interpolation on the typical waveforms corresponding to the two consecutive frames outputted from said typical waveform storage;
a waveform superposing circuit for superposing the voiced speech source signal waveforms obtained by said waveform interpolation circuit and corresponding to the interpolation positions determined by said interpolation position determining circuit, to obtain a voiced speech source signal; and
a vocal tract filter driven by said voiced speech source signal for generating synthetic speech.
11. A speech synthesis apparatus comprising:
a typical waveform storage for storing a plurality of typical waveforms each representative of individual frames of voiced speech source signals obtained by dividing a time-sequence signal into specific frame units, and for outputting a plurality of typical waveforms selected according to waveform selecting information given for each frame in accordance with a speech signal to be synthesized;
a pitch interpolation circuit for interpolating a pitch period given to the typical waveforms so that the pitch periods corresponding to two consecutive frames change smoothly, on the basis of the pitch period given to the typical waveforms for each frame in accordance with the speech signal to be synthesized;
an interpolation position determining circuit for determining the interpolation positions extending over two consecutive frames according to a plurality of interpolated pitch periods obtained by said pitch interpolation circuit;
waveform processing means for arranging the typical waveforms read out from said typical waveform storage at the interpolation positions determined by said interpolation position determining circuit, to obtain a voiced speech source signal; and
a vocal tract filter section driven by said voiced speech source signal for generating synthetic speech.
Dependent claims: 12.
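The pitch interpolation and interpolation position determination recited in claim 11 can be sketched as follows. This is only one possible reading: it assumes the pitch period is interpolated linearly between the values given for two consecutive frames and rounded to whole samples, and the function name `interpolation_positions` and its `start` carry-over parameter are hypothetical.

```python
def interpolation_positions(pitch_prev, pitch_next, frame_len, start=0):
    """Place waveform positions across a frame with a smoothly changing pitch.

    pitch_prev and pitch_next are the pitch periods (in samples) given for two
    consecutive frames. Returns (positions, carry), where carry is the offset
    of the first position inside the following frame.
    """
    positions = []
    pos = start
    while pos < frame_len:
        positions.append(pos)
        alpha = pos / frame_len
        # Assumed linear interpolation of the pitch period at this position.
        period = (1.0 - alpha) * pitch_prev + alpha * pitch_next
        pos += max(1, round(period))  # advance by at least one sample
    return positions, pos - frame_len


if __name__ == "__main__":
    print(interpolation_positions(4, 8, frame_len=16))
```

Passing the returned carry as `start` for the next frame keeps the spacing continuous across the frame boundary, so the positions extend over consecutive frames rather than restarting at each one.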
13. A speech synthesis method comprising the steps of:
preparing a plurality of prediction error signals corresponding to phonemes of plural frames;
extracting a plurality of typical waveforms from the prediction error signals in predetermined units and storing the extracted typical waveforms in a storage;
interpolating the typical waveforms corresponding to consecutive frames so that the pitch period and signal waveform change smoothly between the consecutive frames to obtain interpolation signals;
forming a voiced speech source signal by superposing the interpolation signals;
forming an unvoiced speech source signal; and
forming synthetic speech in accordance with the voiced speech source signal and the unvoiced speech source signal.
Dependent claims: 14, 15, 16, 17, 18.
19. A speech synthesis system, comprising:
means for preparing a plurality of prediction error signals corresponding to phonemes of plural frames;
means for extracting a plurality of typical waveforms from the prediction error signals in predetermined units and storing the extracted typical waveforms in a memory;
means for interpolating the typical waveforms corresponding to consecutive frames so that the pitch period and signal waveforms change smoothly between the consecutive frames to obtain interpolation signals;
means for forming a voiced speech source signal by superposing the interpolation signals;
means for forming an unvoiced speech source signal; and
means for forming synthetic speech in accordance with the voiced speech source signal and the unvoiced speech source signal.
Specification