Formant-based speech synthesizer employing demi-syllable concatenation with independent cross fade in the filter parameter and source domains

US RE39,336 E1
Filed: 11/05/2002
Issued: 10/10/2006
Est. Priority Date: 11/25/1998
Status: Expired due to Term

Abstract

The concatenative speech synthesizer employs demi-syllable subword units to generate speech. The synthesizer is based on a source-filter model that uses source signals that correspond closely to the human glottal source and that uses filter parameters that correspond closely to the human vocal tract. Concatenation of the demi-syllable units is facilitated by two separate cross fade techniques, one applied in the time domain in the demi-syllable source signal waveforms, and one applied in the frequency domain by interpolating the corresponding filter parameters of the concatenated demi-syllables. The dual cross fade technique results in natural-sounding synthesis that avoids time-domain glitches without degrading or smearing characteristic resonances in the filter domain.
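The time-domain half of the technique described in the abstract can be illustrated with a minimal sketch. It assumes the two demi-syllable source waveforms are plain lists of samples and that they are blended over a fixed number of overlapping samples; the function name `linear_cross_fade` and the `overlap` parameter are hypothetical and do not come from the patent.

```python
def linear_cross_fade(first, second, overlap):
    """Join two sample sequences, blending them linearly over `overlap` samples.

    Assumption: the last `overlap` samples of `first` coincide in time with
    the first `overlap` samples of `second`.
    """
    if overlap < 0 or overlap > min(len(first), len(second)):
        raise ValueError("overlap must fit inside both waveforms")

    head = list(first[:len(first) - overlap])   # untouched part of the first unit
    tail = list(second[overlap:])               # untouched part of the second unit

    blended = []
    for i in range(overlap):
        # The weight of the second unit ramps up linearly across the overlap.
        w = (i + 1) / (overlap + 1)
        a = first[len(first) - overlap + i]
        b = second[i]
        blended.append((1.0 - w) * a + w * b)

    return head + blended + tail


# Fading a constant 1.0 signal into a constant 0.0 signal over three samples.
print(linear_cross_fade([1.0] * 4, [0.0] * 4, 3))  # [1.0, 0.75, 0.5, 0.25, 0.0]
```

Because the weights of the two units always sum to one, the joined signal has no abrupt step at the boundary, which is the kind of time-domain glitch the abstract says the cross fade avoids.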


15 Claims

1. A concatenative speech synthesizer, comprising:

a database containing (a) demi-syllable waveform data associated with a plurality of demi-syllables and (b) filter parameter data associated with said plurality of demi-syllables;

a unit selection system for extracting selected demi-syllable waveform data and filter parameters from said database that correspond to an input string to be synthesized;

a waveform cross fade mechanism for joining pairs of extracted demi-syllable waveform data into syllable waveform signals;

a filter parameter cross fade mechanism for defining a set of syllable-level filter data by interpolating said extracted filter parameters; and

a filter module receptive of said set of syllable-level filter data and operative to process said syllable waveform signals to generate synthesized speech.

2. The synthesizer of claim 1 wherein said waveform cross fade mechanism operates in the time domain.

3. The synthesizer of claim 1 wherein said filter parameter cross fade mechanism operates in the frequency domain.

4. The synthesizer of claim 1 wherein said waveform cross fade mechanism performs a linear cross fade upon two demi-syllables over a predefined duration corresponding to a syllable.

5. The synthesizer of claim 1 wherein said filter parameter cross fade mechanism interpolates between the respective extracted filter parameters of two demi-syllables.

6. The synthesizer of claim 1 wherein said filter parameter cross fade mechanism performs linear interpolation between the respective extracted filter parameters of two demi-syllables.

7. The synthesizer of claim 1 wherein said filter parameter cross fade mechanism performs sigmoidal interpolation between the respective extracted filter parameters of two demi-syllables.
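Claims 5 through 7 distinguish linear from sigmoidal interpolation of the filter parameters. The sketch below shows one way the two weighting curves could differ. It assumes each demi-syllable contributes a single vector of filter parameters (for illustration, formant frequencies in Hz) and that the sigmoid is a rescaled logistic curve; the claims do not specify the sigmoid's form, and every name and the `steepness` value here are hypothetical.

```python
import math


def linear_weight(t):
    """Weight that moves from 0 to 1 at a constant rate."""
    return t


def sigmoid_weight(t, steepness=10.0):
    """Logistic S-curve rescaled so that weight(0) == 0 and weight(1) == 1.

    Assumption: the logistic form and the steepness value are illustrative.
    """
    def raw(x):
        return 1.0 / (1.0 + math.exp(-steepness * (x - 0.5)))

    lo, hi = raw(0.0), raw(1.0)
    return (raw(t) - lo) / (hi - lo)


def interpolate_parameters(start, end, frames, weight=linear_weight):
    """Return `frames` parameter vectors moving from `start` to `end`."""
    if frames < 2:
        raise ValueError("need at least two frames to interpolate")
    track = []
    for k in range(frames):
        w = weight(k / (frames - 1))
        track.append([(1.0 - w) * s + w * e for s, e in zip(start, end)])
    return track


# Hypothetical first and second formant values for two demi-syllables.
print(interpolate_parameters([500.0, 1500.0], [300.0, 2300.0], 3))
# [[500.0, 1500.0], [400.0, 1900.0], [300.0, 2300.0]]
```

Both curves start and end on the extracted parameter values; the sigmoidal one changes slowly near each end and quickly in the middle, so the parameters stay close to each unit's own values for longer before the transition.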

8. A concatenative speech synthesizer, comprising:

a database containing concatenation waveform data and filter parameter data that correlates to a plurality of concatenation units which represent human speech;

a unit selection system for selecting and extracting concatenation waveform data and filter parameter data from said database that correlates to an input string to be synthesized;

a waveform cross fade mechanism for joining the extracted concatenation waveform data into a concatenated waveform signal;

a filter parameter cross fade mechanism for defining a sequence of concatenated filter data from the extracted filter parameter data; and

a filter module receptive of said sequence of concatenated filter data and operative to process said concatenated waveform signal to generate synthesized speech.

9. The concatenative speech synthesizer of claim 8 wherein the plurality of concatenation units are selected from the group consisting of phonemes, diphones, demi-syllables, and syllables.

10. The synthesizer of claim 8 wherein said waveform cross fade mechanism operates in the time domain.

11. The synthesizer of claim 8 wherein said filter parameter cross fade mechanism operates in the frequency domain.

12. The synthesizer of claim 8 wherein said waveform cross fade mechanism performs a linear cross fade upon two concatenation units over a predefined duration.

13. The synthesizer of claim 8 wherein said filter parameter cross fade mechanism interpolates between the respective extracted filter parameter data of two concatenation units.

14. The synthesizer of claim 8 wherein said filter parameter cross fade mechanism performs linear interpolation between the respective extracted filter parameter data of two concatenation units.

15. The synthesizer of claim 8 wherein said filter parameter cross fade mechanism performs sigmoidal interpolation between the respective extracted filter parameter data of two concatenation units.
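The final element of claim 8, a filter module that processes the concatenated waveform using the sequence of concatenated filter data, can be sketched as a time-varying filter. The example assumes the filter data is a per-sample formant frequency track (in line with the title's "formant-based") and uses a single standard two-pole digital resonator with a fixed bandwidth. The claims do not specify the filter structure, so this is only an illustration, and all function and parameter names are hypothetical.

```python
import math


def resonator_coefficients(freq_hz, bandwidth_hz, sample_rate):
    """Coefficients of a two-pole resonator y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)   # pole radius
    b = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    c = -r * r
    a = 1.0 - b - c                                       # unity gain at 0 Hz
    return a, b, c


def filter_with_track(source, freq_track, bandwidth_hz, sample_rate):
    """Filter `source` with a resonator whose centre frequency follows `freq_track`.

    Assumption: one frequency value per input sample; a fuller formant
    synthesizer would combine several such resonators.
    """
    if len(source) != len(freq_track):
        raise ValueError("need one frequency value per source sample")
    y1 = y2 = 0.0          # previous two output samples
    out = []
    for x, f in zip(source, freq_track):
        a, b, c = resonator_coefficients(f, bandwidth_hz, sample_rate)
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


# An impulse filtered while the resonance glides from 500 Hz to 300 Hz.
n = 200
glide = [500.0 + (300.0 - 500.0) * k / (n - 1) for k in range(n)]
response = filter_with_track([1.0] + [0.0] * (n - 1), glide, 100.0, 8000.0)
```

Keeping the filter state (`y1`, `y2`) across coefficient updates lets the resonance move smoothly along the track, while the source signal is joined separately in the time domain.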

Current Assignee: Panasonic Intellectual Property Corporation of America (Panasonic Holdings Corporation)
Original Assignee: Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Inventors: Pearson, Steve; Kibre, Nicholas; Niedzielski, Nancy
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Nolan, Daniel A.
Application Number: US10/288,029
Time in Patent Office: 1,435 Days
Field of Search: 704/266, 704/270, 704/264, 704/258, 704/218, 704/219, 704/249, 704/267, 704/503, 704/200, 704/265, 704/262, 704/268
US Class Current: 704/258
CPC Class Codes: G10L 13/07 (Concatenation rules)
