Systems and methods for text-to-speech synthesis using spoken example
Abstract
Systems and methods for speech synthesis are provided and, in particular, text-to-speech (TTS) systems and methods that convert a text input to a synthetic waveform by processing the prosodic and phonetic content of a spoken example of the text input, so that the output accurately mimics the style and pronunciation of the input speech. The systems and methods provide an interface to a TTS system that allows a user to input a text string and a spoken utterance of that text string. Prosodic parameters are extracted from the spoken input and processed to derive corresponding markup for the text input, enabling more natural-sounding synthesized speech.
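The middle step of the flow described in the abstract, deriving markup for the text from prosodic parameters, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the patent: the `WordProsody` record, the `to_marked_up_text` function, the choice of per-word pitch, duration, and pause as the parameters, and the use of SSML-style `<prosody>` and `<break>` tags as the markup format. Extracting the parameters from audio and generating the waveform are outside the sketch.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class WordProsody:
    """Hypothetical per-word measurements taken from the spoken example."""
    word: str
    mean_pitch_hz: float   # average fundamental frequency over the word
    duration_s: float      # how long the speaker took to say the word
    pause_after_s: float   # silence following the word


def to_marked_up_text(words, baseline_pitch_hz, baseline_duration_s, min_pause_s=0.15):
    """Map measured prosody to SSML-style markup (illustrative mapping only)."""
    parts = []
    for w in words:
        # Pitch is expressed as a signed percentage change from the baseline.
        pitch_pct = round(100 * (w.mean_pitch_hz / baseline_pitch_hz - 1))
        # Rate is relative speed: half the baseline duration means 200%.
        rate_pct = round(100 * baseline_duration_s / w.duration_s)
        parts.append(
            f'<prosody pitch="{pitch_pct:+d}%" rate="{rate_pct}%">'
            f"{escape(w.word)}</prosody>"
        )
        # Only pauses at or above the threshold become explicit breaks.
        if w.pause_after_s >= min_pause_s:
            parts.append(f'<break time="{round(w.pause_after_s * 1000)}ms"/>')
    return "<speak>" + " ".join(parts) + "</speak>"


example = [
    WordProsody("hello", mean_pitch_hz=220.0, duration_s=0.5, pause_after_s=0.3),
    WordProsody("world", mean_pitch_hz=180.0, duration_s=0.25, pause_after_s=0.0),
]
markup = to_marked_up_text(example, baseline_pitch_hz=200.0, baseline_duration_s=0.5)
print(markup)
```

In this sketch the marked-up string would then be handed to whatever TTS engine is in use; the baseline values stand in for the engine's default voice and would need to be measured or configured in practice.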
24 Claims
1. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for speech synthesis, the method steps comprising:
determining prosodic parameters of a spoken utterance;
automatically generating a marked-up text corresponding to the spoken utterance using the prosodic parameters; and
generating a synthetic waveform using the marked-up text.
Dependent claims: 2-10.
11. A method for speech synthesis, comprising the steps of:
determining prosodic parameters of a spoken utterance;
automatically generating a marked-up text corresponding to the spoken utterance using the prosodic parameters; and
generating a synthetic waveform using the marked-up text.
Dependent claims: 12-20.
21. A text-to-speech (TTS) system, comprising:
a prosody analyzer for determining prosodic parameters of a spoken utterance and automatically generating a marked-up text corresponding to the spoken utterance using the prosodic parameters; and
a TTS system for generating a synthetic waveform using the marked-up text.
Dependent claims: 22-24.
Specification