System and method for converting text-to-voice
First Claim
1. A method for converting text to concatenated voice by utilizing a digital voice library and a set of playback rules, the digital voice library including a plurality of speech items and a corresponding plurality of voice recordings wherein each speech item corresponds to at least one available voice recording, multiple voice recordings corresponding to a single speech item representing various inflections of that single speech item, the method comprising:
establishing multiple voice recordings in the digital voice library that correspond to a single inflection of a single speech item, for a plurality of inflections of a plurality of speech items, that represent various ligatures for the single inflection of the single speech item with adjacent speech items wherein the recordings for a single inflection of a single speech item are a limited set of recordings that represent a limited set of ligatures with adjacent speech items including only recordings having a vowel at either end and recordings having no surrounding ligature distortions;
receiving text data;
expanding the text data to form a sequence of text and pseudo words;
converting the sequence of text and pseudo words into a sequence of speech items in accordance with the digital voice library, wherein at least one speech item in the sequence of speech items corresponds to multiple voice recordings;
converting the sequence of speech items into a sequence of voice recordings in accordance with the set of playback rules, wherein selecting a voice recording where multiple voice recordings are available for a speech item is based on context around the speech item in the text data;
generating voice data based on the sequence of voice recordings by concatenating adjacent recordings in the sequence of voice recordings;
wherein the plurality of speech items includes a plurality of phrases, and wherein converting the sequences of text and pseudo words further includes parsing the sequence of text and pseudo words to determine any phrases.
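The phrase-parsing and recording-selection steps of the claim can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the specification: the library layout keyed by (speech item, inflection, ligature), the inflection names, the `<period>`-style pseudo words, the file names, and the crude spelling-based vowel test used to pick a ligature variant.

```python
VOWELS = set("aeiou")  # crude spelling-based stand-in for a phonetic test


def is_pseudo(item):
    """Hypothetical convention: pseudo words look like '<period>'."""
    return item.startswith("<")


def parse_phrases(tokens, phrases):
    """Greedy longest match: merge token runs that form a known phrase."""
    items, i = [], 0
    max_len = max((len(p) for p in phrases), default=1)
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if n == 1 or candidate in phrases:
                items.append(" ".join(candidate))
                i += n
                break
    return items


def pick_inflection(next_item):
    """Assumed playback rule: the following pseudo word sets the inflection."""
    rules = {"<period>": "falling", "<question>": "rising"}
    return rules.get(next_item, "neutral")


def pick_ligature(prev_item, next_item):
    """Assumed rule: prefer a variant recorded next to a vowel, else plain."""
    if next_item and next_item[0] in VOWELS:
        return "vowel_after"
    if prev_item and prev_item[-1] in VOWELS:
        return "vowel_before"
    return "none"


def select_recordings(items, library):
    """Map each speech item to one recording using its surrounding context."""
    chosen = []
    for i, item in enumerate(items):
        if is_pseudo(item):
            continue  # pseudo words steer selection but are not spoken
        prev_item = items[i - 1] if i > 0 else None
        next_item = items[i + 1] if i + 1 < len(items) else None
        inflection = pick_inflection(next_item)
        ligature = pick_ligature(
            None if prev_item is None or is_pseudo(prev_item) else prev_item,
            None if next_item is None or is_pseudo(next_item) else next_item,
        )
        # Fall back to less specific recordings when a variant is missing.
        for key in ((item, inflection, ligature),
                    (item, inflection, "none"),
                    (item, "neutral", "none")):
            if key in library:
                chosen.append(library[key])
                break
        else:
            raise KeyError(f"no recording for speech item {item!r}")
    return chosen
```

The fallback order reflects the claim's idea of a limited set of ligature recordings: not every combination needs to exist, so a missing variant degrades to the recording with no ligature distortion.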
Abstract
A method for converting text to concatenated voice by utilizing a digital voice library and a set of playback rules is provided. The method includes receiving and expanding text data to form a sequence of text and pseudo words. The sequence of text and pseudo words is converted into a sequence of speech items, and the sequence of speech items is converted into a sequence of voice recordings. The method includes generating voice data based on the sequence of voice recordings by concatenating adjacent recordings in the sequence of voice recordings.
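The first and last stages described in the abstract, text expansion and concatenation, might be sketched as follows. This is a hedged illustration only: the abbreviation table, the treatment of punctuation as pseudo words such as `<period>`, and the 8 kHz 16-bit mono format are assumptions and do not come from the patent text.

```python
import io
import re
import wave

# Hypothetical expansion tables, for illustration only.
ABBREVIATIONS = {"dr.": ["doctor"], "st.": ["street"]}
PSEUDO_WORDS = {".": "<period>", "?": "<question>", ",": "<comma>"}


def expand_text(text):
    """Expand raw text into a sequence of words and pseudo words."""
    out = []
    for token in text.lower().split():
        if token in ABBREVIATIONS:
            out.extend(ABBREVIATIONS[token])
            continue
        # Split one trailing punctuation mark off the word, if present.
        match = re.fullmatch(r"(.*?)([.,?]?)", token)
        word, punct = match.group(1), match.group(2)
        if word:
            out.append(word)
        if punct:
            out.append(PSEUDO_WORDS[punct])
    return out


def concatenate(clips, sample_rate=8000):
    """Join raw 16-bit mono PCM clips end to end and wrap them as WAV bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"".join(clips))
    return buf.getvalue()
```

In this sketch the clips are joined back to back with no cross-fade or smoothing, which matches the abstract's plain "concatenating adjacent recordings"; any smoothing at the joins would be a separate design decision.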
37 Citations
9 Claims
Specification