Text to speech synthesis
Abstract
An input linguistic description is converted into a speech waveform by deriving at least one target unit sequence corresponding to the linguistic description, selecting from a waveform unit database a plurality of alternative unit sequences approximating each target unit sequence, concatenating the alternative unit sequences into alternative speech waveforms, presenting the alternative speech waveforms to an operator, and enabling the operator to choose one of them. Because there are no iterative cycles of manual modification and automatic selection, the workflow is fast. The operator needs no knowledge of units, targets, or costs and simply chooses from a set of given alternatives. Fine-tuning of TTS prompts therefore becomes accessible to non-experts.
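The four steps named in the abstract (derive targets, select alternative unit sequences, concatenate, let an operator choose) can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Unit` fields, the pitch-based toy costs, the exhaustive n-best search, and the `pick` callback standing in for the operator are all assumptions made for illustration; a practical system would more likely use a lattice search than full enumeration.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Unit:
    """A stored waveform unit (all fields are hypothetical, for illustration)."""
    label: str                  # e.g. a phone or diphone name
    pitch: float                # single toy feature used by the costs below
    samples: Tuple[float, ...]  # stand-in for recorded audio samples


def derive_targets(description: Sequence[str]) -> List[str]:
    # Assumption: the linguistic description is already a list of unit labels.
    return list(description)


def sequence_cost(seq: Sequence[Unit], target_pitch: float) -> float:
    # Toy target cost: distance of each unit's pitch from the desired pitch.
    target = sum(abs(u.pitch - target_pitch) for u in seq)
    # Toy join cost: pitch mismatch between neighbouring units.
    join = sum(abs(a.pitch - b.pitch) for a, b in zip(seq, seq[1:]))
    return target + join


def select_alternatives(targets: Sequence[str],
                        database: Dict[str, List[Unit]],
                        target_pitch: float,
                        n_best: int) -> List[List[Unit]]:
    # One candidate list per target position.
    candidates = [database[label] for label in targets]
    # Exhaustive enumeration keeps the sketch short; it is exponential in
    # the sequence length and only suitable for tiny examples.
    ranked = sorted(product(*candidates),
                    key=lambda seq: sequence_cost(seq, target_pitch))
    return [list(seq) for seq in ranked[:n_best]]


def concatenate(seq: Sequence[Unit]) -> List[float]:
    # Plain sample concatenation, with no smoothing at the joins.
    return [s for unit in seq for s in unit.samples]


def choose(waveforms: List[List[float]],
           pick: Callable[[List[List[float]]], int]) -> List[float]:
    # `pick` stands in for the operator listening and selecting an index.
    return waveforms[pick(waveforms)]


# Tiny hypothetical database: two candidate units per label.
database = {
    "h": [Unit("h", 100.0, (0.1,)), Unit("h", 120.0, (0.2,))],
    "i": [Unit("i", 105.0, (0.3,)), Unit("i", 90.0, (0.4,))],
}

targets = derive_targets(["h", "i"])
alternatives = select_alternatives(targets, database, target_pitch=100.0, n_best=3)
waveforms = [concatenate(seq) for seq in alternatives]
chosen = choose(waveforms, pick=lambda ws: 1)  # operator picks the second option
```

The operator only ever sees the list `waveforms` and returns an index, which mirrors the abstract's point that no knowledge of units, targets, or costs is required to make the choice.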
18 Claims
1. A method for converting an input linguistic description into a speech waveform comprising:
deriving at least one target unit sequence corresponding to the linguistic description;
selecting from a waveform unit database a plurality of alternative unit sequences approximating the at least one target unit sequence;
concatenating the alternative unit sequences to alternative speech waveforms; and
presenting the alternative speech waveforms to an operating person and enabling the choice of one of the presented alternative speech waveforms.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. A text to speech processor for converting an input linguistic description into a speech waveform, said processor comprising:
deriving means for deriving at least one target unit sequence corresponding to the linguistic description;
selection means for selecting from a waveform unit database a plurality of alternative unit sequences approximating the at least one target unit sequence;
concatenating means for concatenating the alternative unit sequences to alternative speech waveforms; and
means for presenting the alternative speech waveforms to an operating person and enabling the choice of one of the presented alternative speech waveforms.
Dependent claims: 18
Specification