Formant-based speech synthesizer employing demi-syllable concatenation with independent cross fade in the filter parameter and source domains
Abstract
The concatenative speech synthesizer employs demi-syllable subword units to generate speech. The synthesizer is based on a source-filter model whose source signals correspond closely to the human glottal source and whose filter parameters correspond closely to the human vocal tract. Concatenation of the demi-syllable units is facilitated by two separate cross fade techniques: one applied in the time domain to the demi-syllable source signal waveforms, and one applied in the frequency domain by interpolating the corresponding filter parameters of the concatenated demi-syllables. The dual cross fade technique results in natural-sounding synthesis that avoids time-domain glitches without degrading or smearing characteristic resonances in the filter domain.
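The two cross fades described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the use of plain Python lists, the linear fade weights, and the linear interpolation of parameter frames are all assumptions made for illustration. The abstract only states that one cross fade operates on the source waveforms in the time domain and the other interpolates the filter parameters.

```python
def crossfade_waveforms(left, right, overlap):
    """Join two source waveforms, cross fading over `overlap` samples.

    Hypothetical helper: the last `overlap` samples of `left` are blended
    with the first `overlap` samples of `right` using linear weights.
    """
    if overlap <= 0 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must be between 1 and the shorter unit length")
    head = left[:-overlap]           # part of the left unit kept unchanged
    tail = right[overlap:]           # part of the right unit kept unchanged
    start = len(left) - overlap
    faded = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of the right unit, strictly between 0 and 1
        faded.append((1.0 - w) * left[start + i] + w * right[i])
    return head + faded + tail


def interpolate_filter_params(left_frame, right_frame, steps):
    """Return `steps` intermediate parameter frames between two frames.

    Hypothetical helper: each frame is a list of filter parameters
    (for example formant frequencies); linear interpolation is assumed.
    """
    frames = []
    for k in range(1, steps + 1):
        t = k / (steps + 1)
        frames.append([(1.0 - t) * a + t * b
                       for a, b in zip(left_frame, right_frame)])
    return frames
```

Keeping the two operations in separate functions mirrors the point of the abstract: the waveform join smooths the source signal at the boundary, while the parameter interpolation moves the resonances from one unit's values to the next without mixing the filtered signals themselves.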
17 Citations
15 Claims
1. A concatenative speech synthesizer, comprising:
a database containing (a) demi-syllable waveform data associated with a plurality of demi-syllables and (b) filter parameter data associated with said plurality of demi-syllables;
a unit selection system for extracting selected demi-syllable waveform data and filter parameters from said database that correspond to an input string to be synthesized;
a waveform cross fade mechanism for joining pairs of extracted demi-syllable waveform data into syllable waveform signals;
a filter parameter cross fade mechanism for defining a set of syllable-level filter data by interpolating said extracted filter parameters; and
a filter module receptive of said set of syllable-level filter data and operative to process said syllable waveform signals to generate synthesized speech.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A concatenative speech synthesizer, comprising:
a database containing concatenation waveform data and filter parameter data that correlates to a plurality of concatenation units which represent human speech;
a unit selection system for selecting and extracting concatenation waveform data and filter parameter data from said database that correlates to an input string to be synthesized;
a waveform cross fade mechanism for joining the extracted concatenation waveform data into a concatenated waveform signal;
a filter parameter cross fade mechanism for defining a sequence of concatenated filter data from the extracted filter parameter data; and
a filter module receptive of said sequence of concatenated filter data and operative to process said concatenated waveform signal to generate synthesized speech.
Dependent claims: 9, 10, 11, 12, 13, 14, 15
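The database, unit selection system, and filter module recited in the claims can be pictured with a short sketch. Everything named here is hypothetical: the `Unit` layout, the dictionary keyed by unit label, and `select_units` are illustrative choices, and the one-pole recursion in `apply_filter` is only a stand-in for a formant filter, which the claims do not spell out.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """One concatenation unit as stored in the (hypothetical) database."""
    waveform: list       # source-domain samples
    filter_frames: list  # one list of filter parameters per frame


def select_units(database, labels):
    """Look up the unit for each label of the input, in order."""
    missing = [label for label in labels if label not in database]
    if missing:
        raise KeyError(f"no unit stored for: {missing}")
    return [database[label] for label in labels]


def apply_filter(waveform, frames, frame_len):
    """Filter a source waveform with frame-by-frame parameters.

    Stand-in filter (an assumption): y[n] = x[n] + a * y[n-1], where `a`
    is the first parameter of the frame that covers sample n.
    """
    if not frames or frame_len <= 0:
        raise ValueError("need at least one frame and a positive frame length")
    out = []
    prev = 0.0
    for n, x in enumerate(waveform):
        frame = frames[min(n // frame_len, len(frames) - 1)]
        prev = x + frame[0] * prev
        out.append(prev)
    return out
```

In this sketch the filter reads a new parameter frame every `frame_len` samples, so a sequence of concatenated filter data drives the filter while it processes the already joined source waveform, which is the division of labor the claims describe.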
Specification