Utilization of multiple voice sources in a speech synthesizer
Abstract
Utilization of one or more voice sources in a speech synthesizer to provide improved synthetic speech. Having a speech synthesizer with the capability to select among and between a multiplicity of voice sources provides a higher quality and greater variety of possible synthetic speech sounds. This is particularly true when the multiplicity of voice sources are predetermined to have particular speech qualities and spectral content such as may be desired to convey emotional vocal content in synthetic speech.
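As a rough illustration of combining voice sources with different spectral content, the Python sketch below mixes two synthetic sources sample by sample. The waveforms (a sawtooth standing in for a harmonically rich source, a sine for a spectrally softer one), the function names, the gains, and the sample rate are all assumptions made for illustration; none of them come from the patent.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed for illustration


def sawtooth_source(f0, n_samples, sample_rate=SAMPLE_RATE):
    """Harmonically rich source: sawtooth in [-1, 1) at pitch f0."""
    return [2.0 * ((f0 * i / sample_rate) % 1.0) - 1.0 for i in range(n_samples)]


def sine_source(f0, n_samples, sample_rate=SAMPLE_RATE):
    """Spectrally soft source: a pure tone at pitch f0."""
    return [math.sin(2.0 * math.pi * f0 * i / sample_rate) for i in range(n_samples)]


def combine_sources(sources, gains):
    """Weighted sample-by-sample sum of at least two equal-length sources."""
    if len(sources) < 2 or len(sources) != len(gains):
        raise ValueError("need at least two sources and one gain per source")
    n = len(sources[0])
    if any(len(s) != n for s in sources):
        raise ValueError("sources must have equal length")
    return [sum(g * s[i] for s, g in zip(sources, gains)) for i in range(n)]


# Two hypothetical voice sources at a 100 Hz pitch, mixed 70/30.
saw = sawtooth_source(100.0, 160)
sine = sine_source(100.0, 160)
combined = combine_sources([saw, sine], [0.7, 0.3])
```

Changing the gains shifts the spectral balance of the combined source, which is one plausible way to vary the perceived voice quality.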
21 Claims
1. A synthetic text-to-speech generating method comprising:
generating a set of speech synthesizer control parameters representative of text to be spoken; and
converting the speech synthesizer control parameters into output wave forms representative of the synthetic speech to be spoken by selecting and combining at least two voice sources from a multiplicity of voice sources in a speech synthesizer to generate a combined voice source and by passing the combined voice source through an acoustic model of a human vocal tract.
Dependent claims: 2, 3, 4.

5. An apparatus for generating synthetic text-to-speech, the apparatus comprising:
means for generating a set of speech synthesizer control parameters representative of text to be spoken; and
means for converting the speech synthesizer control parameters into output wave forms representative of the synthetic speech to be spoken by means for selecting and combining at least two voice sources from a multiplicity of voice sources in a speech synthesizer to generate a combined voice source and means for passing the combined voice source through an acoustic model of a human vocal tract.
Dependent claims: 6, 7, 8.

9. A method of generating synthetic speech in a synthetic speech system comprising a speech synthesizer, said synthetic speech generating method comprising the steps of:
a) providing a multiplicity of synthetic voice sources to said speech synthesizer;
b) providing a set of speech synthesizer control parameters to said speech synthesizer;
c) said speech synthesizer selecting at least two of said multiplicity of voice sources based upon said set of speech synthesizer control parameters;
d) said speech synthesizer combining the selected voice sources to generate a combined voice source; and
e) generating said synthetic speech based upon said set of speech synthesizer control parameters and using said combined voice source.
Dependent claims: 10, 11, 12, 13.

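Steps a) through e) might be sketched in Python as follows. The `breathiness` and `amplitude` control parameters, the three named sources, and the crossfade rule are hypothetical; the claim does not say which parameters drive the selection or how the selected sources are combined. The final generation step is reduced here to an amplitude scale, where a real synthesizer would apply a vocal tract model.

```python
def select_sources(bank, params):
    """Step c: pick two sources and their gains from a control parameter.

    params["breathiness"] in [0, 1] is a hypothetical control parameter.
    """
    b = params["breathiness"]
    if not 0.0 <= b <= 1.0:
        raise ValueError("breathiness must be in [0, 1]")
    if b < 0.5:
        names, mix = ("pressed", "modal"), b / 0.5
    else:
        names, mix = ("modal", "breathy"), (b - 0.5) / 0.5
    return [(bank[names[0]], 1.0 - mix), (bank[names[1]], mix)]


def combine(selected):
    """Step d: gain-weighted sum of the selected sources."""
    n = min(len(signal) for signal, _ in selected)
    return [sum(gain * signal[i] for signal, gain in selected) for i in range(n)]


def generate(bank, params):
    """Steps c to e, with the output stage simplified to an amplitude scale."""
    combined = combine(select_sources(bank, params))
    return [params["amplitude"] * x for x in combined]


# Step a: a toy bank of three four-sample sources (placeholder data).
bank = {
    "pressed": [1.0, -1.0, 1.0, -1.0],
    "modal": [1.0, 0.0, -1.0, 0.0],
    "breathy": [0.5, 0.5, -0.5, -0.5],
}
# Step b: one frame of control parameters.
frame = {"breathiness": 0.75, "amplitude": 0.5}
speech = generate(bank, frame)
print(speech)  # [0.375, 0.125, -0.375, -0.125]
```

With `breathiness` at 0.75 the sketch selects the "modal" and "breathy" sources with equal gains, so the selection always yields exactly two sources from the bank.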
14. A text-to-speech synthesizer system for generating a synthetic speech signal, the synthesizer system comprising:
a phonetic translation of text to be spoken by the text-to-speech synthesizer system;
a multiplicity of audio signals to be used as voice sources by the text-to-speech synthesizer system; and
an acoustic model of a human vocal tract, the acoustic model selectively receiving as input at least two of the multiplicity of audio signals and the phonetic translation, the acoustic model acoustically modifying the received audio signals based upon the phonetic translation to generate a modified voice source, and the acoustic model outputting the modified voice source as the synthetic speech signal.

15. A parametric synthetic text-to-speech system comprising:
a memory containing a multiplicity of digitally sampled voice sources and a set of text-to-speech parameters indicative of text to be spoken by the synthetic text-to-speech system;
a filter network for modulating two or more of the multiplicity of voice sources in accordance with the set of text-to-speech parameters to generate a modulated voice source, the filter network modeling the acoustic aspects of the human vocal tract;
a loudspeaker for generating a waveform of the synthetic speech utilizing the modulated voice source.

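One common way to model vocal tract acoustics in a parametric synthesizer is a cascade of second-order digital resonators, one per formant. The Python sketch below uses that approach as an assumption; the claim does not specify the filter topology. The formant frequencies, bandwidths, source signals, mixing gains, and sample rate are illustrative values only.

```python
import math


def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate):
    """Coefficients of y[n] = a*x[n] + b*y[n-1] + c*y[n-2], unity gain at DC."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    return a, b, c


def resonate(signal, freq_hz, bandwidth_hz, sample_rate):
    """Run a signal through one second-order resonator (one formant)."""
    a, b, c = resonator_coeffs(freq_hz, bandwidth_hz, sample_rate)
    y1 = y2 = 0.0  # previous two output samples
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def vocal_tract(signal, formants, sample_rate):
    """Cascade one resonator per (frequency, bandwidth) pair."""
    for freq_hz, bandwidth_hz in formants:
        signal = resonate(signal, freq_hz, bandwidth_hz, sample_rate)
    return signal


SAMPLE_RATE = 10000  # Hz; assumed
FORMANTS = [(700.0, 130.0), (1220.0, 70.0), (2600.0, 160.0)]  # illustrative vowel-like values

# Two hypothetical sampled voice sources at a 100 Hz pitch: an impulse train and a sine.
source_a = [1.0 if i % 100 == 0 else 0.0 for i in range(400)]
source_b = [math.sin(2.0 * math.pi * 100.0 * i / SAMPLE_RATE) for i in range(400)]
mixed = [0.8 * p + 0.2 * q for p, q in zip(source_a, source_b)]
modulated = vocal_tract(mixed, FORMANTS, SAMPLE_RATE)
```

In this sketch the text-to-speech parameters would supply the formant list and mixing gains frame by frame; here they are fixed constants for brevity.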
16. A text-to-speech synthesizer system for generating a synthetic speech signal, the synthesizer system comprising:
a phonetic translation of text to be spoken by the text-to-speech synthesizer system;
two audio signals to be used as voice sources by the text-to-speech synthesizer system;
an acoustic model of a human vocal tract for receiving the two audio signals and the phonetic translation, combining and modifying the two audio signals based upon the phonetic translation, and outputting the combined and modified two audio signals as the synthetic speech signal.
Dependent claims: 17, 18, 19, 20, 21.

Specification