System and method for converting text-to-voice
Abstract
A method for converting text to concatenated voice by utilizing a digital voice library and a set of playback rules is provided. The method comprises generating voice data based on a sequence of voice recordings by concatenating adjacent recordings in the sequence of voice recordings. Concatenating a first recording and a second recording adjacent to the first recording includes manipulating the ending sonic feature of the first recording to determine a first recording switch point, manipulating the starting sonic feature of the second recording to determine a second recording switch point, and synchronizing the first recording switch point and the second recording switch point.
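The flow summarized in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the `Recording` class, the word-by-word lookup standing in for the playback rules, and the two switch-point callbacks are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Recording:
    """Hypothetical library entry: samples plus its boundary feature labels."""
    name: str
    samples: List[float]
    start_feature: str  # "noise", "impulse", or "tone"
    end_feature: str    # "noise", "impulse", or "tone"


def text_to_sequence(text: str, library: Dict[str, Recording]) -> List[Recording]:
    # Assumption: a plain word lookup stands in for the set of playback rules.
    return [library[tok] for tok in text.lower().split() if tok in library]


def concatenate(
    seq: List[Recording],
    end_switch: Callable[[Recording], int],
    start_switch: Callable[[Recording], int],
) -> List[float]:
    """Join adjacent recordings at their switch points.

    end_switch returns the index where a recording is left;
    start_switch returns the index where the next recording is entered.
    """
    out: List[float] = []
    for i, rec in enumerate(seq):
        start = start_switch(rec) if i > 0 else 0
        end = end_switch(rec) if i < len(seq) - 1 else len(rec.samples)
        out.extend(rec.samples[start:end])
    return out


library = {
    "hello": Recording("hello", [1.0, 2.0, 3.0], "tone", "tone"),
    "world": Recording("world", [4.0, 5.0, 6.0], "tone", "noise"),
}
sequence = text_to_sequence("Hello world", library)
voice = concatenate(
    sequence,
    end_switch=lambda r: len(r.samples) - 1,  # placeholder rule
    start_switch=lambda r: 1,                 # placeholder rule
)
```

The placeholder callbacks simply trim one sample from each side of the joint; in the claims, the switch points instead depend on whether each boundary is a noise, an impulse, or a tone.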
8 Claims
-
1. A method for converting text to concatenated voice by utilizing a digital voice library and a set of playback rules, the digital voice library including a plurality of voice recordings with each recording having a starting sonic feature and an ending sonic feature, the method including receiving text data, converting the text data into a sequence of voice recordings in accordance with the digital voice library and the set of playback rules, the method further comprising:
-
generating voice data based on the sequence of voice recordings by concatenating adjacent recordings in the sequence of voice recordings, wherein concatenating a first recording and a second recording adjacent to the first recording includes manipulating the ending sonic feature of the first recording to determine a first recording switch point, manipulating the starting sonic feature of the second recording to determine a second recording switch point, and synchronizing the first recording switch point and the second recording switch point;
wherein the starting and ending sonic features of the voice recordings are classified into a number of different categories including a noise, an impulse, and a tone;
wherein the ending sonic feature of the first recording is a tone and the starting sonic feature of the second recording is a tone, and wherein synchronizing the first recording switch point and the second recording switch point further includes synchronizing the tones, and switching on peaks of the tones; and
wherein the recordings overlap, and wherein synchronizing during the overlap includes multiplexing.
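As a rough illustration of the tone-to-tone case in claim 1, the sketch below finds the last peak of the first recording and the first peak of the second, aligns them, and mixes the two signals over a short overlap. The claim does not define "multiplexing" in detail; reading it as a weighted mix of the overlapping samples is an assumption made here, and the function names are hypothetical.

```python
import math
from typing import List


def local_peaks(x: List[float]) -> List[int]:
    """Indices of local maxima (strict rise followed by a non-rise)."""
    return [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]


def join_tone_to_tone(a: List[float], b: List[float], overlap: int = 4) -> List[float]:
    """Switch from a to b on aligned peaks, mixing over the overlap."""
    peaks_a, peaks_b = local_peaks(a), local_peaks(b)
    if not peaks_a or not peaks_b or overlap < 1:
        raise ValueError("both recordings need a peak and overlap must be >= 1")
    pa, pb = peaks_a[-1], peaks_b[0]          # the two switch points
    n = min(overlap, len(a) - pa, len(b) - pb)
    # Assumed form of multiplexing: fade from a to b across n samples.
    mixed = [(1 - k / n) * a[pa + k] + (k / n) * b[pb + k] for k in range(n)]
    return a[:pa] + mixed + b[pb + n:]


# Two tones with the same period but different phase.
tone_a = [math.sin(2 * math.pi * k / 8) for k in range(16)]
tone_b = [math.sin(2 * math.pi * (k + 1) / 8) for k in range(16)]
joined = join_tone_to_tone(tone_a, tone_b)
```

Because the switch happens peak to peak, the two example tones line up in phase and the joined signal continues as a single uninterrupted sine wave.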
-
-
2. A method for converting text to concatenated voice by utilizing a digital voice library and a set of playback rules, the digital voice library including a plurality of voice recordings with each recording having a starting sonic feature and an ending sonic feature, the method including receiving text data, converting the text data into a sequence of voice recordings in accordance with the digital voice library and the set of playback rules, the method further comprising:
-
generating voice data based on the sequence of voice recordings by concatenating adjacent recordings in the sequence of voice recordings, wherein concatenating a first recording and a second recording adjacent to the first recording includes manipulating the ending sonic feature of the first recording to determine a first recording switch point, manipulating the starting sonic feature of the second recording to determine a second recording switch point, and synchronizing the first recording switch point and the second recording switch point;
wherein the starting and ending sonic features of the voice recordings are classified into a number of different categories including a noise, an impulse, and a tone; and
wherein the ending sonic feature of the first recording is a noise and the starting sonic feature of the second recording is a noise, and wherein synchronizing the first recording switch point and the second recording switch point includes switching anywhere within the noise such that not more than fifty percent of the duration of either noise is cut.
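The fifty-percent limit in claim 2 can be expressed as a simple bound on how far into each noise segment the switch point may move. The sketch below assumes the length of each noise segment (in samples) is already known; the function name and parameters are hypothetical.

```python
from typing import Tuple


def noise_switch_points(
    len_a: int, noise_a: int, noise_b: int, cut_fraction: float = 0.5
) -> Tuple[int, int]:
    """Return (switch index in first recording, switch index in second).

    len_a   : total samples in the first recording
    noise_a : samples of trailing noise in the first recording
    noise_b : samples of leading noise in the second recording
    """
    if not 0.0 <= cut_fraction <= 0.5:
        raise ValueError("no more than fifty percent of either noise may be cut")
    if noise_a > len_a:
        raise ValueError("noise segment cannot be longer than the recording")
    switch_a = len_a - int(noise_a * cut_fraction)  # trim the tail of the first
    switch_b = int(noise_b * cut_fraction)          # trim the head of the second
    return switch_a, switch_b


points = noise_switch_points(len_a=100, noise_a=20, noise_b=10)
```

Any `cut_fraction` from 0 to 0.5 is acceptable under this reading, which matches the claim's wording that the switch may occur anywhere within the noise subject to the limit.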
-
-
3. A method for converting text to concatenated voice by utilizing a digital voice library and a set of playback rules, the digital voice library including a plurality of voice recordings with each recording having a starting sonic feature and an ending sonic feature, the method including receiving text data, converting the text data into a sequence of voice recordings in accordance with the digital voice library and the set of playback rules, the method further comprising:
-
generating voice data based on the sequence of voice recordings by concatenating adjacent recordings in the sequence of voice recordings, wherein concatenating a first recording and a second recording adjacent to the first recording includes manipulating the ending sonic feature of the first recording to determine a first recording switch point, manipulating the starting sonic feature of the second recording to determine a second recording switch point, and synchronizing the first recording switch point and the second recording switch point;
wherein the starting and ending sonic features of the voice recordings are classified into a number of different categories including a noise, an impulse, and a tone;
wherein the ending sonic feature of the first recording is a tone and the starting sonic feature of the second recording is an impulse, and wherein synchronizing the first recording switch point and the second recording switch point further includes switching on a peak of the tone and on an impulse of the impulse; and wherein the tone and the impulse overlap, and wherein synchronizing during the overlap includes multiplexing.
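For the tone-to-impulse case in claim 3, one possible sketch is shown below. It assumes the impulse can be located as the sample of largest absolute amplitude and, as a further assumption, treats multiplexing during the overlap as an equal-weight mix; neither detail is specified in the claim, and the function names are hypothetical.

```python
from typing import List


def last_peak(x: List[float]) -> int:
    """Index of the last local maximum in x."""
    peaks = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]
    if not peaks:
        raise ValueError("no peak found in tone")
    return peaks[-1]


def impulse_index(x: List[float]) -> int:
    """Assumed impulse detector: the sample with the largest magnitude."""
    return max(range(len(x)), key=lambda i: abs(x[i]))


def join_tone_to_impulse(tone: List[float], imp: List[float], overlap: int = 2) -> List[float]:
    pt = last_peak(tone)       # switch point in the first recording
    pi = impulse_index(imp)    # switch point in the second recording
    n = min(overlap, len(tone) - pt, len(imp) - pi)
    # Assumed form of multiplexing: average the overlapping samples.
    mixed = [0.5 * (tone[pt + k] + imp[pi + k]) for k in range(n)]
    return tone[:pt] + mixed + imp[pi + n:]


tone = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
impulse = [0.0, 0.5, 3.0, 1.0, 0.0]
joined_ti = join_tone_to_impulse(tone, impulse)
```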
-
-
4. A method for converting text to concatenated voice by utilizing a digital voice library and a set of playback rules, the digital voice library including a plurality of voice recordings with each recording having a starting sonic feature and an ending sonic feature, the method including receiving text data, converting the text data into a sequence of voice recordings in accordance with the digital voice library and the set of playback rules, the method further comprising:
-
generating voice data based on the sequence of voice recordings by concatenating adjacent recordings in the sequence of voice recordings, wherein concatenating a first recording and a second recording adjacent to the first recording includes manipulating the ending sonic feature of the first recording to determine a first recording switch point, manipulating the starting sonic feature of the second recording to determine a second recording switch point, and synchronizing the first recording switch point and the second recording switch point;
wherein the starting and ending sonic features of the voice recordings are classified into a number of different categories including a noise, an impulse, and a tone; and
wherein the ending sonic feature of the first recording is a noise and the starting sonic feature of the second recording is an impulse, and wherein synchronizing the first recording switch point and the second recording switch point further includes switching anywhere within the noise such that not more than fifty percent of the noise is cut, and switching on an impulse of the impulse.
-
-
5. A method for converting text to concatenated voice by utilizing a digital voice library and a set of playback rules, the digital voice library including a plurality of voice recordings with each recording having a starting sonic feature and an ending sonic feature, the method including receiving text data, converting the text data into a sequence of voice recordings in accordance with the digital voice library and the set of playback rules, the method further comprising:
-
generating voice data based on the sequence of voice recordings by concatenating adjacent recordings in the sequence of voice recordings, wherein concatenating a first recording and a second recording adjacent to the first recording includes manipulating the ending sonic feature of the first recording to determine a first recording switch point, manipulating the starting sonic feature of the second recording to determine a second recording switch point, and synchronizing the first recording switch point and the second recording switch point;
wherein the starting and ending sonic features of the voice recordings are classified into a number of different categories including a noise, an impulse, and a tone; and
wherein the ending sonic feature of the first recording is a noise and the starting sonic feature of the second recording is a tone, and wherein synchronizing the first recording switch point and the second recording switch point further includes switching anywhere within the noise such that not more than fifty percent of the noise is cut, and switching on a peak of the tone.
-
-
6. A method for converting text to concatenated voice by utilizing a digital voice library and a set of playback rules, the digital voice library including a plurality of voice recordings with each recording having a starting sonic feature and an ending sonic feature, the method including receiving text data, converting the text data into a sequence of voice recordings in accordance with the digital voice library and the set of playback rules, the method further comprising:
-
generating voice data based on the sequence of voice recordings by concatenating adjacent recordings in the sequence of voice recordings, wherein concatenating a first recording and a second recording adjacent to the first recording includes manipulating the ending sonic feature of the first recording to determine a first recording switch point, manipulating the starting sonic feature of the second recording to determine a second recording switch point, and synchronizing the first recording switch point and the second recording switch point;
wherein the starting and ending sonic features of the voice recordings are classified into a number of different categories including a noise, an impulse, and a tone;
wherein the ending sonic feature of the first recording is an impulse and the starting sonic feature of the second recording is a tone, and wherein synchronizing the first recording switch point and the second recording switch point further includes switching at a peak of the tone and an end of the impulse; and wherein the impulse and the tone overlap, and wherein synchronizing during the overlap includes multiplexing.
-
-
7. A method for converting text to concatenated voice by utilizing a digital voice library and a set of playback rules, the digital voice library including a plurality of voice recordings with each recording having a starting sonic feature and an ending sonic feature, the method including receiving text data, converting the text data into a sequence of voice recordings in accordance with the digital voice library and the set of playback rules, the method further comprising:
-
generating voice data based on the sequence of voice recordings by concatenating adjacent recordings in the sequence of voice recordings, wherein concatenating a first recording and a second recording adjacent to the first recording includes manipulating the ending sonic feature of the first recording to determine a first recording switch point, manipulating the starting sonic feature of the second recording to determine a second recording switch point, and synchronizing the first recording switch point and the second recording switch point;
wherein the starting and ending sonic features of the voice recordings are classified into a number of different categories including a noise, an impulse, and a tone; and
wherein the ending sonic feature of the first recording is an impulse and the starting sonic feature of the second recording is a noise, and wherein synchronizing the first recording switch point and the second recording switch point further includes switching anywhere within the noise such that not more than fifty percent of the duration of the noise is cut, and switching at an end of the impulse.
-
-
8. A method for converting text to concatenated voice by utilizing a digital voice library and a set of playback rules, the digital voice library including a plurality of voice recordings with each recording having a starting sonic feature and an ending sonic feature, the method including receiving text data, converting the text data into a sequence of voice recordings in accordance with the digital voice library and the set of playback rules, the method further comprising:
-
generating voice data based on the sequence of voice recordings by concatenating adjacent recordings in the sequence of voice recordings, wherein concatenating a first recording and a second recording adjacent to the first recording includes manipulating the ending sonic feature of the first recording to determine a first recording switch point, manipulating the starting sonic feature of the second recording to determine a second recording switch point, and synchronizing the first recording switch point and the second recording switch point;
wherein the starting and ending sonic features of the voice recordings are classified into a number of different categories including a noise, an impulse, and a tone; and
wherein the ending sonic feature of the first recording is a tone and the starting sonic feature of the second recording is a noise, and wherein synchronizing the first recording switch point and the second recording switch point further includes switching anywhere within the noise such that not more than fifty percent of the duration of the noise is cut, and switching at a peak of the tone.
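The eight claims differ only in the pairing of ending and starting sonic features, so the switching rules can be collected in one lookup table. The sketch below is a reading aid built from the claim language; the `SwitchRule` type and the rule strings are hypothetical labels, not terms defined by the claims.

```python
from typing import Dict, NamedTuple, Tuple


class SwitchRule(NamedTuple):
    first: str      # where to switch out of the first recording
    second: str     # where to switch into the second recording
    overlap: bool   # True when the claim recites overlap with multiplexing
    claim: int


PEAK = "peak of tone"
NOISE = "anywhere in noise, cutting at most 50%"
IMPULSE = "on the impulse"
IMPULSE_END = "end of impulse"

# Keys are (ending feature of first recording, starting feature of second).
RULES: Dict[Tuple[str, str], SwitchRule] = {
    ("tone", "tone"): SwitchRule(PEAK, PEAK, True, 1),
    ("noise", "noise"): SwitchRule(NOISE, NOISE, False, 2),
    ("tone", "impulse"): SwitchRule(PEAK, IMPULSE, True, 3),
    ("noise", "impulse"): SwitchRule(NOISE, IMPULSE, False, 4),
    ("noise", "tone"): SwitchRule(NOISE, PEAK, False, 5),
    ("impulse", "tone"): SwitchRule(IMPULSE_END, PEAK, True, 6),
    ("impulse", "noise"): SwitchRule(IMPULSE_END, NOISE, False, 7),
    ("tone", "noise"): SwitchRule(PEAK, NOISE, False, 8),
}


def rule_for(end_feature: str, start_feature: str) -> SwitchRule:
    """Look up the rule for a pairing; raises KeyError if none is claimed."""
    return RULES[(end_feature, start_feature)]
```

Of the nine possible pairings, only impulse-to-impulse has no corresponding claim, so `rule_for("impulse", "impulse")` raises `KeyError` in this sketch.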
-
Specification