Method and apparatus for combining text to speech and recorded prompts
Abstract
An arrangement provides for improved synthesis of speech arising from a message text. The arrangement stores prerecorded prompts and speech-related characteristics for those prompts. A message is parsed to determine whether any message portions have been recorded previously. If so, speech-related characteristics for those portions are retrieved. The arrangement generates speech-related characteristics for those portions not previously stored. The retrieved and generated characteristics are combined. The combination of characteristics is then used as the input to a speech synthesizer.
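The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `PROMPT_STORE` dictionary, the `generate_characteristics` stub, and the pitch and duration fields are hypothetical stand-ins for whatever speech-related characteristics a real arrangement would store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Characteristics:
    """Hypothetical speech-related characteristics for one message portion."""
    text: str
    pitch_hz: float
    duration_ms: int
    source: str  # "retrieved" or "generated"


# Hypothetical store of characteristics saved for previously recorded prompts.
PROMPT_STORE = {
    "your balance is": Characteristics("your balance is", 180.0, 900, "retrieved"),
    "thank you": Characteristics("thank you", 175.0, 500, "retrieved"),
}


def generate_characteristics(text):
    """Placeholder rule (an assumption): flat pitch, 80 ms per character."""
    return Characteristics(text, 170.0, 80 * len(text), "generated")


def characteristics_for(portions):
    """Retrieve stored characteristics where available, generate them otherwise."""
    combined = []
    for portion in portions:
        key = portion.strip().lower()
        stored = PROMPT_STORE.get(key)
        combined.append(stored if stored is not None else generate_characteristics(key))
    # The combined list is what would be handed to a speech synthesizer.
    return combined


result = characteristics_for(["Your balance is", "forty dollars", "Thank you"])
print([c.source for c in result])
```

The point of the sketch is only the branching: portions with stored characteristics are looked up, the rest are generated, and both kinds end up in one ordered sequence.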
9 Claims
1. A method comprising:
receiving a text message for conversion to speech, the text message having a tagged portion and a non-tagged portion;
identifying a topic domain associated with the text message;
selecting, via a text-to-speech device, first phonemes from a phoneme database for the non-tagged portion based on first speech-related characteristics, wherein the phoneme database is specific to the topic domain and comprises phonemes labeled by database tags;
generating first speech synthesis rules for the non-tagged portion based on the first speech-related characteristics;
selecting second phonemes from the phoneme database based on second speech-related characteristics as indicated by message tags in the tagged portion of the text message, wherein the selecting is based on a matching of the message tags and the database tags, wherein the first phonemes and the second phonemes do not represent pre-recorded speech;
retrieving second speech synthesis rules for the tagged portion based on the second speech-related characteristics; and
synthesizing, via the text-to-speech device, speech by combining the first phonemes and the second phonemes using the first speech synthesis rules and the second speech synthesis rules.
Dependent claims: 2, 3
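The tagged/non-tagged split and the tag-matching selection in claim 1 might look something like the sketch below. The claim does not define a markup syntax or a database schema, so the `<tag name>...</tag>` notation, the `PHONEME_DB` entries, and the `neutral` default tag for non-tagged text are all assumptions made for illustration.

```python
import re

# Hypothetical markup: <tag name>text</tag> marks a tagged portion.
TAG_PATTERN = re.compile(r"<tag (\w+)>(.*?)</tag>")


def split_portions(message):
    """Return (tag, text) pairs in message order; tag is None for non-tagged text."""
    portions, pos = [], 0
    for match in TAG_PATTERN.finditer(message):
        plain = message[pos:match.start()].strip()
        if plain:
            portions.append((None, plain))
        portions.append((match.group(1), match.group(2).strip()))
        pos = match.end()
    tail = message[pos:].strip()
    if tail:
        portions.append((None, tail))
    return portions


# Hypothetical domain-specific phoneme database; each entry carries a database tag.
PHONEME_DB = [
    {"symbol": "AH", "db_tag": "neutral"},
    {"symbol": "AH", "db_tag": "emphasis"},
    {"symbol": "T", "db_tag": "neutral"},
    {"symbol": "T", "db_tag": "emphasis"},
]


def select_phonemes(symbols, message_tag, default_tag="neutral"):
    """Pick, per symbol, the entry whose database tag matches the message tag."""
    wanted = message_tag if message_tag is not None else default_tag
    selected = []
    for symbol in symbols:
        matches = [e for e in PHONEME_DB
                   if e["symbol"] == symbol and e["db_tag"] == wanted]
        if not matches:
            raise LookupError(f"no phoneme {symbol!r} with database tag {wanted!r}")
        selected.append(matches[0])
    return selected


print(split_portions("Your flight is <tag emphasis>delayed</tag> today"))
```

Converting each portion's text to phoneme symbols is left out here; `select_phonemes` assumes the symbols are already known and only illustrates the matching of message tags against database tags.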
4. A text-to-speech device having instructions stored which, when executed, cause the text-to-speech device to perform operations comprising:
receiving a text message for conversion to speech, the text message having a tagged portion comprising message tags and a non-tagged portion;
identifying a topic domain associated with the text message;
generating first speech synthesis rules for the non-tagged portion;
retrieving second speech synthesis rules for the tagged portion;
retrieving first phonemes from a phoneme database for the non-tagged portion of the text message;
retrieving second phonemes from the phoneme database for the tagged portion of the text message, wherein the phoneme database is specific to the topic domain and comprises phonemes labeled by database tags, wherein the retrieving of the first phonemes and the second phonemes is based on a matching of the message tags and the database tags, and wherein the first phonemes and the second phonemes do not represent pre-recorded speech; and
combining the first phonemes and the second phonemes to output an audible version of the text message using the first speech synthesis rules and the second speech synthesis rules.
Dependent claims: 5, 6
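The claims do not say how the topic domain is identified. One simple possibility, shown purely as an assumption, is keyword overlap: score each candidate domain by how many of its keywords occur in the text, then choose the phoneme database associated with the best-scoring domain. The keyword sets, the database file names, and the `general` fallback below are all hypothetical.

```python
import re

# Hypothetical keyword sets per topic domain; a real system might use a classifier.
DOMAIN_KEYWORDS = {
    "travel": {"flight", "gate", "boarding", "departure"},
    "banking": {"balance", "account", "deposit", "transfer"},
}

# Hypothetical mapping from topic domain to a domain-specific phoneme database.
DOMAIN_DATABASES = {
    "travel": "phonemes_travel.db",
    "banking": "phonemes_banking.db",
    "general": "phonemes_general.db",
}


def identify_topic_domain(text):
    """Return the domain with the most keyword hits, or "general" if none match."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {domain: len(words & keywords)
              for domain, keywords in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"


def database_for(text):
    """Select the phoneme database specific to the identified topic domain."""
    return DOMAIN_DATABASES[identify_topic_domain(text)]


print(database_for("Your flight leaves from gate 12"))
```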
7. A method comprising:
receiving text to be converted to speech, the text having a tagged portion and a non-tagged portion;
identifying, via a text-to-speech device, a topic domain associated with the text;
for the non-tagged portion of the text, retrieving first phonemes from a phoneme database having first speech-related characteristics, wherein the phoneme database is specific to the topic domain and comprises phonemes labeled by database tags;
generating first speech synthesis rules for the non-tagged portion based on the first speech-related characteristics;
for the tagged portion of the text, retrieving second phonemes from the database, the second phonemes having second speech-related characteristics as indicated by message tags associated with the tagged portion, and wherein the retrieving is based on a matching of the message tags and the database tags, wherein the first and the second phonemes do not represent pre-recorded speech;
retrieving second speech synthesis rules for the tagged portion based on the second speech-related characteristics; and
synthesizing, via the text-to-speech device, speech based on the text by combining the first phonemes and the second phonemes using the first speech synthesis rules and the second speech synthesis rules.
Dependent claims: 8, 9
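The final synthesizing step, which combines the first and second phonemes under their respective rule sets, could be modeled as below. The claims do not specify what a speech synthesis rule contains, so treating a rule set as a pair of pitch and duration scale factors, and the baseline values used here, are assumptions for illustration only. The output is a synthesis plan, not audio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthesisRules:
    """Hypothetical rule set: scale factors applied to every phoneme in a portion."""
    pitch_scale: float
    duration_scale: float


@dataclass(frozen=True)
class PlannedPhoneme:
    symbol: str
    pitch_hz: float
    duration_ms: float


BASE_PITCH_HZ = 170.0     # assumed baseline pitch
BASE_DURATION_MS = 100.0  # assumed baseline phoneme duration


def plan_portion(symbols, rules):
    """Apply one rule set to the phonemes of one portion."""
    return [
        PlannedPhoneme(symbol,
                       BASE_PITCH_HZ * rules.pitch_scale,
                       BASE_DURATION_MS * rules.duration_scale)
        for symbol in symbols
    ]


def combine(portions):
    """Concatenate per-portion plans; portions is a list of (symbols, rules) pairs."""
    plan = []
    for symbols, rules in portions:
        plan.extend(plan_portion(symbols, rules))
    return plan


first_rules = SynthesisRules(1.0, 1.0)   # stands in for rules generated for non-tagged text
second_rules = SynthesisRules(1.2, 1.5)  # stands in for rules retrieved for tagged text
plan = combine([(["DH", "AH"], first_rules), (["S", "T", "AA", "P"], second_rules)])
print(len(plan), plan[0].symbol, plan[2].duration_ms)
```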
Specification