Method and apparatus for automatic generation of vocal emotion in a synthetic text-to-speech system
First Claim
1. A method for automatic application of vocal emotion to previously entered text to be outputted by a synthetic text-to-speech system, said method comprising:
selecting a portion of said previously entered text;
manipulating a visual appearance of the selected text to selectively choose a vocal emotion to be applied to said selected text;
obtaining vocal emotion parameters associated with said selected vocal emotion; and
applying said obtained vocal emotion parameters to said selected text to be outputted by said synthetic text-to-speech system.
Abstract
A method and apparatus for the automatic application of vocal emotion parameters to text in a text-to-speech system. Predefining vocal parameters for various vocal emotions allows simple selection and application of vocal emotions to text to be output from a text-to-speech system. Further, the present invention is capable of generating vocal emotion with the limited prosodic controls available in a concatenative synthesizer.
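As an illustration only, the idea of predefining vocal parameters per emotion and applying them to a selected portion of text can be sketched as a table lookup. The four fields follow the parameters named in claim 6 (pitch mean, pitch range, volume, speaking rate). Everything else is an assumption made for this sketch: the names `VocalEmotionParams`, `EMOTION_TABLE` and `apply_emotion`, the emotion labels, the units, and all numeric values are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VocalEmotionParams:
    """The four parameters named in claim 6; units are assumed for this sketch."""
    pitch_mean: float     # assumed unit: Hz
    pitch_range: float    # assumed unit: Hz
    volume: float         # assumed scale: 0.0 (silent) to 1.0 (loudest)
    speaking_rate: float  # assumed unit: words per minute


# Hypothetical predefined table. The numbers are placeholders chosen for
# illustration; they are not values disclosed in the patent.
EMOTION_TABLE = {
    "neutral": VocalEmotionParams(120.0, 30.0, 0.5, 180.0),
    "angry": VocalEmotionParams(140.0, 60.0, 0.9, 200.0),
    "sad": VocalEmotionParams(100.0, 15.0, 0.3, 140.0),
}


def apply_emotion(text, start, end, emotion):
    """Split text into segments and attach the chosen emotion's parameters
    to the selected span text[start:end]; the rest stays neutral."""
    neutral = EMOTION_TABLE["neutral"]
    chosen = EMOTION_TABLE[emotion]  # raises KeyError for an unknown emotion
    segments = [
        (text[:start], neutral),
        (text[start:end], chosen),
        (text[end:], neutral),
    ]
    # Drop empty segments so a synthesizer never receives empty input.
    return [(chunk, params) for chunk, params in segments if chunk]
```

A downstream synthesizer would then receive each segment together with its parameters. How a concatenative synthesizer consumes them is outside the scope of this sketch.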
28 Claims
1. A method for automatic application of vocal emotion to previously entered text to be outputted by a synthetic text-to-speech system, said method comprising:
selecting a portion of said previously entered text;
manipulating a visual appearance of the selected text to selectively choose a vocal emotion to be applied to said selected text;
obtaining vocal emotion parameters associated with said selected vocal emotion; and
applying said obtained vocal emotion parameters to said selected text to be outputted by said synthetic text-to-speech system.
Dependent claims: 2, 3, 4, 5

6. A method for providing vocal emotion to previously entered text in a concatenative synthetic text-to-speech system, said method comprising:
selecting said previously entered text;
manipulating a visual appearance of the selected text to select a vocal emotion from a set of vocal emotions;
obtaining vocal emotion parameters predetermined to be associated with said selected vocal emotion, said vocal emotion parameters specifying pitch mean, pitch range, volume and speaking rate;
applying said obtained vocal emotion parameters to said selected text; and
synthesizing speech from the selected text.
Dependent claims: 7

8. An apparatus for automatic application of vocal emotion parameters to previously entered text to be outputted by a synthetic text-to-speech system, said apparatus comprising:
a display device for displaying said previously entered text;
an input device for permitting a user to selectively manipulate a visual appearance of the entered text and thereby select a vocal emotion;
memory for holding said vocal emotion parameters associated with said selected vocal emotion; and
logic circuitry for obtaining said vocal emotion parameters associated with said selected vocal emotion from said memory and for applying said obtained vocal emotion parameters to the manipulated text to be outputted by said synthetic text-to-speech system.
Dependent claims: 9, 10, 11, 12

13. A method for converting text to speech that enables a user to interactively apply vocal parameters to user-selectable text, comprising the steps of:
selecting a portion of visually displayed text;
selectively manipulating the selected portion of text to modify a visual appearance of the selected portion of text and to modify certain vocal parameters associated with the selected portion of text; and
applying the modified vocal parameters associated with the selected portion of text to synthesize speech from the modified text.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24

25. A computer-readable storage medium storing program code for causing a computer to perform the steps of:
permitting a user to select a portion of text;
permitting a user to manipulate the selected text with a plurality of user-manipulatable control means;
responding to each user-manipulation of one of said control means by modifying a plurality of corresponding vocal parameters of the selected text and modifying a displayed appearance of said portion of text; and
synthesizing speech from the modified text.

26. A system for converting text to speech that enables a user to interactively apply vocal parameters to user-selectable text, comprising:
means for a user to select a portion of text;
a plurality of interactive user manipulatable means for controlling vocal parameters associated with the selected portion of text;
means, responsive to said control means, for modifying a plurality of vocal parameters associated with the portion of text and for modifying a displayed appearance of said portion of text; and
means for synthesizing speech from the modified text.

27. A method of converting text to speech, comprising:
entering text;
displaying a portion of the entered text;
selecting a portion of the displayed text;
manipulating an appearance of the selected text to selectively change a set of vocal emotion parameters associated with the selected text; and
synthesizing speech having a vocal emotion from the manipulated portion of text;
whereby the vocal emotion of the synthesized speech depends on the manner in which the appearance of the text is manipulated.
Dependent claims: 28

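Claims 13 and 25 through 27 describe a coupling in which each manipulation of the selected text changes both its displayed appearance and its vocal parameters. A minimal sketch of that coupling is given below for illustration. The claims listed above do not say which visual attributes drive which vocal parameters, so the choices here (color selects an emotion, font size sets volume) are assumptions, as are the names `StyledRun`, `split_run`, `set_color`, `set_font_size` and every mapping value.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StyledRun:
    """A run of text carrying both display attributes and vocal parameters."""
    text: str
    color: str = "black"      # displayed appearance (hypothetical attribute)
    font_size: int = 12       # displayed appearance (hypothetical attribute)
    emotion: str = "neutral"  # vocal parameter tied to color in this sketch
    volume: float = 0.5       # vocal parameter tied to font size in this sketch


# Hypothetical mapping; not taken from the patent text above.
COLOR_TO_EMOTION = {"black": "neutral", "red": "angry", "blue": "sad"}


def split_run(run, start, end):
    """Select run.text[start:end] by splitting the run into up to three runs
    that keep the original attributes."""
    pieces = [run.text[:start], run.text[start:end], run.text[end:]]
    return [replace(run, text=piece) for piece in pieces if piece]


def set_color(run, color):
    """One manipulation updates the appearance and the emotion together."""
    return replace(run, color=color, emotion=COLOR_TO_EMOTION[color])


def set_font_size(run, size):
    """Assumed linear mapping: 12 pt gives volume 0.5, clamped to [0.0, 1.0]."""
    volume = min(1.0, max(0.0, size / 24))
    return replace(run, font_size=size, volume=volume)
```

Because each setter returns a new run with both fields updated, the displayed text and the parameters handed to the synthesizer cannot drift apart, which is the property the "whereby" clause of claim 27 relies on.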
Specification