Customizing the speaking style of a speech synthesizer based on semantic analysis
First Claim
1. A method for generating synthesized speech, comprising:
receiving a block of input text into a text-to-speech synthesizing system;
partitioning the block of input text into a plurality of context spaces each containing multiple phrases;
performing semantic analysis on each context space in order to identify a topic for each context space;
selecting a speaking style for each context space from a plurality of predefined speaking styles based on the topics identified respective of the context spaces, where each speaking style correlates to prosodic parameters and is associated with one or more anticipated topics;
converting the sentences to corresponding phoneme data;
applying prosodic parameters which correlate to the selected speaking style to the phoneme data, thereby generating a prosodic representation of the phoneme data; and
generating audible speech using the prosodic representation of the phoneme data.
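The first three steps of the claim (partitioning, topic identification, and style selection) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the patent: context spaces are approximated by paragraphs, semantic analysis is replaced by simple keyword overlap, and the style names, topics, keyword sets, and prosodic scale factors are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakingStyle:
    """A predefined style: anticipated topics plus prosodic parameters."""
    name: str
    topics: frozenset
    pitch_scale: float  # hypothetical multiplier on baseline pitch
    rate_scale: float   # hypothetical multiplier on speaking rate


# Hypothetical style table; the patent does not list these values.
STYLES = [
    SpeakingStyle("sportscast", frozenset({"sports"}), 1.2, 1.3),
    SpeakingStyle("newscast", frozenset({"news"}), 1.0, 1.1),
]
DEFAULT_STYLE = SpeakingStyle("neutral", frozenset(), 1.0, 1.0)

# Hypothetical keyword lists standing in for real semantic analysis.
TOPIC_KEYWORDS = {
    "sports": {"goal", "match", "team", "score"},
    "news": {"election", "government", "minister", "economy"},
}


def partition(text):
    """Treat each blank-line-separated paragraph as one context space."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def identify_topic(context_space):
    """Return the topic whose keywords overlap most, or None if none match."""
    words = set(re.findall(r"[a-z']+", context_space.lower()))
    scores = {topic: len(words & kws) for topic, kws in TOPIC_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def select_style(topic):
    """Pick the first predefined style that anticipates the topic."""
    for style in STYLES:
        if topic in style.topics:
            return style
    return DEFAULT_STYLE


def plan_styles(text):
    """Pair every context space with its selected speaking style."""
    return [(cs, select_style(identify_topic(cs))) for cs in partition(text)]
```

Calling `plan_styles` on a text with one paragraph about a match and one about an election would assign a different style to each paragraph, which is the per-context-space behavior the claim describes. A production system would replace the keyword overlap with an actual semantic analysis component.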
Abstract
A method is provided for customizing the speaking style of a speech synthesizer. The method includes: receiving input text; determining semantic information for the input text; determining a speaking style for rendering the input text based on the semantic information; and customizing the audible speech output of the speech synthesizer based on the identified speaking style.
9 Claims
1. A method for generating synthesized speech, comprising:
receiving a block of input text into a text-to-speech synthesizing system;
partitioning the block of input text into a plurality of context spaces each containing multiple phrases;
performing semantic analysis on each context space in order to identify a topic for each context space;
selecting a speaking style for each context space from a plurality of predefined speaking styles based on the topics identified respective of the context spaces, where each speaking style correlates to prosodic parameters and is associated with one or more anticipated topics;
converting the sentences to corresponding phoneme data;
applying prosodic parameters which correlate to the selected speaking style to the phoneme data, thereby generating a prosodic representation of the phoneme data; and
generating audible speech using the prosodic representation of the phoneme data.
Dependent claims: 2, 6, 7, 8
3. A method for customizing the speaking style of a text-to-speech synthesizer system, comprising:
receiving a block of input text which;
partitioning the block of input text into a plurality of context spaces each containing multiple phrases;
determining semantic information for each context space;
selecting a speaking style for each context space from a plurality of predefined speaking styles based on the semantic information, where each speaking style correlates to prosodic parameters and is associated with one or more anticipated topics; and
customizing an output parameter of a multimedia user interface of the text-to-speech synthesizer system based on the speaking style, where the text-to-speech synthesizer system is operable to render audible speech which correlates to the input text.
Dependent claims: 4, 5
9. A text-to-speech synthesizer system, comprising:
a text analyzer receptive of a block of input text and operable to partition the block of input text into a plurality of context spaces each containing multiple phrases and determine semantic information for each context space;
a style selector adapted to receive semantic information from the text analyzer and operable to determine, for each context space, a speaking style for rendering the input text contained in that context space based on the semantic information, where the selected speaking style correlates to one or more prosodic attributes;
a phonetic analyzer adapted to receive input text from the text analyzer and operable to convert the input text into corresponding phoneme data;
a prosodic analyzer adapted to receive phoneme data from the phonetic analyzer and the prosodic attributes from the style selector, the prosodic analyzer further operable to apply the prosodic attributes to the phoneme data to form a prosodic representation of the phoneme data; and
a speech synthesizer adapted to receive the prosodic representation of the phoneme data from the prosodic analyzer and operable to generate audible speech.
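The back half of the system in claim 9 (phonetic analyzer, prosodic analyzer, speech synthesizer) can be sketched as three small Python classes that pass data along in the claimed order. This is only an illustration under stated assumptions: the class and field names, the baseline pitch and duration constants, and the scaling rule are hypothetical; the phonetic analyzer uses one pseudo-phoneme per letter instead of real grapheme-to-phoneme conversion; and the synthesizer stub returns a planned duration in place of generating audio.

```python
from dataclasses import dataclass


@dataclass
class ProsodicAttributes:
    """Attributes a style selector would hand to the prosodic analyzer."""
    pitch_scale: float
    rate_scale: float


@dataclass
class ProsodicPhoneme:
    """One phoneme annotated with prosody (the 'prosodic representation')."""
    symbol: str
    pitch_hz: float
    duration_ms: float


# Hypothetical neutral baseline values, chosen only for the example.
BASE_PITCH_HZ = 120.0
BASE_DURATION_MS = 80.0


class PhoneticAnalyzer:
    def to_phonemes(self, text):
        # Placeholder: one pseudo-phoneme per letter, not real G2P.
        return [ch for ch in text.lower() if ch.isalpha()]


class ProsodicAnalyzer:
    def apply(self, phonemes, attrs):
        # Scale pitch up and duration down according to the style attributes.
        return [
            ProsodicPhoneme(
                symbol=p,
                pitch_hz=BASE_PITCH_HZ * attrs.pitch_scale,
                duration_ms=BASE_DURATION_MS / attrs.rate_scale,
            )
            for p in phonemes
        ]


class SpeechSynthesizer:
    def render(self, prosodic):
        # Stand-in for audio generation: report total planned duration in ms.
        return sum(p.duration_ms for p in prosodic)
```

Keeping the prosodic attributes separate from the phoneme data until the prosodic analyzer combines them mirrors the claim, where the style selector and the phonetic analyzer feed that component independently.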
Specification