Generating paralinguistic phenomena via markup in text-to-speech synthesis
Abstract
Converting marked-up text into a synthesized stream includes providing marked-up text to a processor-based system, converting the marked-up text into a text stream including vocabulary items, retrieving audio segments corresponding to the vocabulary items, concatenating the audio segments to form a synthesized stream, and audibly outputting the synthesized stream. The marked-up text includes a normal text and a paralinguistic text, and the normal text is differentiated from the paralinguistic text by using a grammar constraint. The paralinguistic text is associated with more than one audio segment, and retrieving the audio segments includes selecting one audio segment associated with the paralinguistic text.
25 Claims
1. A method of converting marked-up text into a synthesized stream, comprising:
providing marked-up text to a processor-based system;
converting the marked-up text into a text stream comprising a plurality of vocabulary items;
retrieving a plurality of audio segments corresponding to the plurality of vocabulary items;
concatenating the plurality of audio segments to form a synthesized stream; and
audibly outputting the synthesized stream;
wherein the marked-up text comprises a normal text and a paralinguistic text;
wherein the normal text is differentiated from the paralinguistic text by using a grammar constraint; and
wherein the paralinguistic text is associated with more than one audio segment, wherein the retrieving of the plurality of audio segments comprises selecting one audio segment associated with the paralinguistic text.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
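
The steps recited in claim 1 can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the angle-bracket syntax standing in for the grammar constraint, the `WORD_AUDIO` and `PARA_AUDIO` inventories (byte strings standing in for recorded audio), the function names, and the random choice among variants. The final step, audible output, is omitted because it depends on the playback hardware.

```python
import random
import re

# Hypothetical grammar constraint: paralinguistic items are written in angle
# brackets, e.g. "<laugh>". The claims do not prescribe any particular syntax.
TOKEN_RE = re.compile(r"<(\w+)>|(\w+)")

# Hypothetical inventories; byte strings stand in for recorded audio segments.
WORD_AUDIO = {"hello": b"HELLO", "there": b"THERE"}
PARA_AUDIO = {"laugh": [b"HAHA1", b"HAHA2", b"HAHA3"]}  # more than one segment


def tokenize(marked_up):
    """Convert marked-up text into a stream of (kind, item) vocabulary items."""
    items = []
    for match in TOKEN_RE.finditer(marked_up):
        para, word = match.groups()
        if para is not None:
            items.append(("para", para.lower()))
        else:
            items.append(("word", word.lower()))
    return items


def retrieve(items, rng):
    """Look up one audio segment per vocabulary item."""
    segments = []
    for kind, item in items:
        if kind == "para":
            # Select one of the several segments tied to this item.
            segments.append(rng.choice(PARA_AUDIO[item]))
        else:
            segments.append(WORD_AUDIO[item])
    return segments


def synthesize(marked_up, rng=None):
    """Concatenate the retrieved segments into one synthesized stream."""
    rng = rng or random.Random()
    return b"".join(retrieve(tokenize(marked_up), rng))
```

Calling `synthesize("Hello <laugh> there")` returns the two word segments with one of the three laugh variants between them. Random selection is only one possible policy; the claim requires only that one of the associated segments be selected.
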

12. A method of converting paralinguistic text into a synthesized stream, comprising:
providing paralinguistic text to a processor-based system;
converting the paralinguistic text into a text stream comprising a plurality of vocabulary items;
retrieving a plurality of audio examples corresponding to the plurality of vocabulary items;
concatenating the plurality of audio examples to form a synthesized stream; and
audibly outputting the synthesized stream;
wherein the paralinguistic text comprises non-speech sounds indicating an emotional state underlying the paralinguistic text; and
wherein the paralinguistic text is associated with more than one audio segment, wherein the retrieving of the plurality of audio segments comprises selecting one audio segment associated with the paralinguistic text.
Dependent claims: 13
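
The selection step in claim 12 leaves open how one segment is chosen when a paralinguistic item has several. The sketch below shows one possible policy, cycling through the variants so that a repeated item does not sound identical each time. This policy, the `VariantSelector` class, the whitespace-separated input format, and the inventory contents are all hypothetical and are not described in the claims.

```python
from itertools import cycle

# Hypothetical inventory of non-speech sounds; byte strings stand in for audio.
PARA_AUDIO = {
    "sigh": [b"sigh-a", b"sigh-b"],
    "laugh": [b"laugh-a", b"laugh-b", b"laugh-c"],
}


class VariantSelector:
    """Pick one segment per item, rotating through that item's variants."""

    def __init__(self, inventory):
        # One independent, endlessly repeating iterator per vocabulary item.
        self._cycles = {name: cycle(variants) for name, variants in inventory.items()}

    def select(self, item):
        return next(self._cycles[item])


def synthesize_paralinguistic(text, selector):
    """Split the text into items, select a segment for each, and concatenate."""
    items = text.split()
    return b"".join(selector.select(item) for item in items)
```

With a fresh selector, the input `"sigh laugh sigh"` uses the first sigh variant, the first laugh variant, and then the second sigh variant.
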

14. A system of converting marked-up text into a synthesized stream, comprising:
means for providing marked-up text to a processor-based system;
means for converting the marked-up text into a text stream comprising a plurality of vocabulary items;
means for retrieving a plurality of audio examples corresponding to the plurality of vocabulary items;
means for concatenating the plurality of audio examples to form a synthesized stream; and
means for audibly outputting the synthesized stream;
wherein the marked-up text comprises a normal text and a paralinguistic text; and
wherein the normal text is differentiated from the paralinguistic text by using a grammar constraint; and
wherein the paralinguistic text is associated with more than one audio segment, wherein the retrieving of the plurality of audio segments comprises selecting one audio segment associated with the paralinguistic text.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23

24. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for converting marked-up text into a synthesized stream, the method steps comprising:
providing marked-up text to a processor-based system;
converting the marked-up text into a text stream comprising a plurality of vocabulary items;
retrieving a plurality of audio segments corresponding to the plurality of vocabulary items;
concatenating the plurality of audio segments to form a synthesized stream; and
audibly outputting the synthesized stream;
wherein the marked-up text comprises a normal text and a paralinguistic text;
wherein the normal text is differentiated from the paralinguistic text by using a grammar constraint; and
wherein the paralinguistic text is associated with more than one audio segment, wherein the retrieving of the plurality of audio segments comprises selecting one audio segment associated with the paralinguistic text.

25. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for converting paralinguistic text into a synthesized stream, the method steps comprising:
providing paralinguistic text to a processor-based system;
converting the paralinguistic text into a text stream comprising a plurality of vocabulary items;
retrieving a plurality of audio examples corresponding to the plurality of vocabulary items;
concatenating the plurality of audio examples to form a synthesized stream; and
audibly outputting the synthesized stream;
wherein the paralinguistic text comprises non-speech sounds indicating an emotional state underlying the paralinguistic text; and
wherein the paralinguistic text is associated with more than one audio segment, wherein the retrieving of the plurality of audio segments comprises selecting one audio segment associated with the paralinguistic text.
Specification