Formant-based speech synthesizer employing demi-syllable concatenation with independent cross fade in the filter parameter and source domains
First Claim
1. A concatenative speech synthesizer, comprising:
a database containing (a) demi-syllable waveform data associated with a plurality of demi-syllables and (b) filter parameter data associated with said plurality of demi-syllables;
a unit selection system for extracting selected demi-syllable waveform data and filter parameters from said database that correspond to an input string to be synthesized;
a waveform cross fade mechanism for joining pairs of extracted demi-syllable waveform data into syllable waveform signals;
a filter parameter cross fade mechanism for defining a set of syllable-level filter data by interpolating said extracted filter parameters; and
a filter module receptive of said set of syllable-level filter data and operative to process said syllable waveform signals to generate synthesized speech.
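The first two elements of the claim (the database and the unit selection system) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `DemiSyllable` record, the toy `DATABASE` entries, the labels "ba-" and "-at", and the assumption that the input string has already been split into (initial, final) demi-syllable labels.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class DemiSyllable:
    """One database entry: a source waveform plus per-frame filter parameters."""
    waveform: List[float]              # source signal samples (toy values)
    filter_params: List[List[float]]   # one parameter vector per frame (toy formant values, Hz)


# Hypothetical toy database keyed by demi-syllable label (labels are illustrative only).
DATABASE: Dict[str, DemiSyllable] = {
    "ba-": DemiSyllable([0.0, 0.5, 1.0, 0.5], [[700.0, 1200.0], [720.0, 1250.0]]),
    "-at": DemiSyllable([0.5, 1.0, 0.5, 0.0], [[740.0, 1300.0], [600.0, 1700.0]]),
}


def select_units(
    syllables: List[Tuple[str, str]]
) -> List[Tuple[DemiSyllable, DemiSyllable]]:
    """Look up the (initial, final) demi-syllable pair for each syllable.

    Assumes the text-to-label conversion has been done elsewhere; a missing
    label raises KeyError.
    """
    return [(DATABASE[initial], DATABASE[final]) for initial, final in syllables]
```

Each returned pair carries both the waveform data and the filter parameters for one syllable, which is what the two cross fade mechanisms in the claim would then consume.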
Abstract
The concatenative speech synthesizer employs demi-syllable subword units to generate speech. The synthesizer is based on a source-filter model that uses source signals corresponding closely to the human glottal source and filter parameters corresponding closely to the human vocal tract. Concatenation of the demi-syllable units is facilitated by two separate cross fade techniques: one applied in the time domain to the demi-syllable source signal waveforms, and one applied in the frequency domain by interpolating the corresponding filter parameters of the concatenated demi-syllables. The dual cross fade technique results in natural-sounding synthesis that avoids time-domain glitches without degrading or smearing characteristic resonances in the filter domain.
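The two independent cross fades described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the linear fade weights, the overlap length, and the function names `crossfade_waveforms` and `interpolate_params` are all hypothetical, and the final filtering stage is omitted.

```python
from typing import List


def crossfade_waveforms(a: List[float], b: List[float], overlap: int) -> List[float]:
    """Join two source waveforms with a linear cross fade over `overlap` samples.

    Assumption: a linear ramp is used; the source does not specify the fade shape.
    """
    if not 0 < overlap <= min(len(a), len(b)):
        raise ValueError("overlap must be between 1 and the shorter waveform length")
    head = a[: len(a) - overlap]          # part of `a` before the overlap region
    tail = b[overlap:]                    # part of `b` after the overlap region
    mixed = []
    for n in range(overlap):
        w = (n + 1) / (overlap + 1)       # weight of `b`, strictly between 0 and 1
        mixed.append((1.0 - w) * a[len(a) - overlap + n] + w * b[n])
    return head + mixed + tail


def interpolate_params(
    p_end: List[float], p_start: List[float], frames: int
) -> List[List[float]]:
    """Linearly interpolate filter parameter vectors across the joint.

    `p_end` is the last parameter vector of the first demi-syllable and
    `p_start` the first vector of the second; `frames` transition vectors
    are generated between them (endpoints excluded).
    """
    out = []
    for k in range(frames):
        t = (k + 1) / (frames + 1)
        out.append([(1.0 - t) * x + t * y for x, y in zip(p_end, p_start)])
    return out
```

Keeping the two functions separate mirrors the point of the abstract: the source waveform is smoothed sample by sample in the time domain, while the resonances are moved gradually by interpolating parameters rather than by mixing filtered audio.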
198 Citations
7 Claims
Claim 1 (independent) is given in full above. Dependent claims: 2, 3, 4, 5, 6, 7.
Specification