Text to speech system and method having interactive spelling capabilities
Abstract
A method for audibly spelling a word in an audio file includes playing an audio file to a user, receiving a command to spell a word in the audio file from the user, identifying a textual word in a text file corresponding to the word, and audibly spelling the textual word. A text-to-speech system includes a memory and a text-to-speech module. The text-to-speech module generates an audio file from a text file, and stores in the memory locations for words in the audio file corresponding to locations of words in the text file.
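The word-location bookkeeping described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `WordEntry`, `build_file_map`, `word_at`, and `spell` are hypothetical, audio positions are counted in samples, and the "synthesis" simply assumes each word lasts a fixed number of samples per character instead of calling a real TTS engine.

```python
import re
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class WordEntry:
    audio_offset: int  # start of the audible word in the audio file (assumed unit: samples)
    text_offset: int   # start of the textual word in the text file (character index)
    word: str


def build_file_map(text, samples_per_char=800):
    """Record both locations for every word.

    Stand-in for synthesis: each word is assumed to last
    len(word) * samples_per_char samples.
    """
    entries = []
    audio_pos = 0
    for match in re.finditer(r"\S+", text):
        entries.append(WordEntry(audio_pos, match.start(), match.group()))
        audio_pos += len(match.group()) * samples_per_char
    return entries


def word_at(file_map, playback_pos):
    """Return the entry whose audio span contains playback_pos."""
    starts = [entry.audio_offset for entry in file_map]
    index = bisect_right(starts, playback_pos) - 1
    return file_map[max(index, 0)]


def spell(word):
    """Letter-by-letter form of a word, ignoring punctuation."""
    return [ch.upper() for ch in word if ch.isalnum()]


text = "Call Dr Nguyen today"
file_map = build_file_map(text)
entry = word_at(file_map, 5000)  # playback position when the spell command arrives
letters = spell(entry.word)
```

Keeping the audio offsets sorted lets the lookup use a binary search, so the cost of answering a spell command does not grow linearly with the length of the message.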
245 Citations
32 Claims
1. A text-to-speech (TTS) system, comprising:
  a memory operable to store a text file and an audio file; and
  a TTS module operable to:
    convert a plurality of textual words in the text file to a plurality of audible words;
    store the audible words in an audio file, the audio file including a plurality of electronic markers embedded in the audio file; and
    store for each audible word:
      a first location locating the audible word in the audio file; and
      a second location locating the corresponding textual word in the text file; and
    transmit the audible words to a telecommunication device operable to play the audio file to a user;
  an output device operable to play the audio file to a user;
  an interface operable to receive a voice command to spell one of the audible words during the playing of the audio file; and
  a processor operable to:
    remove the electronic markers from the audio file during playback;
    track the number of words played by counting the number of electronic markers removed;
    determine the textual word corresponding to the audible word to be spelled; and
    audibly spell the textual word.
2. A method for relating words in an audio file to words in a text file, comprising:
  retrieving a text file comprising a textual word;
  converting the textual word to an audible word;
  storing the audible word in an audio file, the audio file including a plurality of electronic markers embedded in the audio file;
  storing a file map, the file map comprising:
    a first location locating the audible word within the audio file; and
    a second location locating the textual word within the text file; and
  transmitting the audio file to a telecommunication device operable to play the audio file to a user;
  removing the electronic markers from the audio file during playback;
  tracking the number of words played by counting the number of electronic markers removed;
  receiving a voice command from a user to spell the audible word;
  determining that the textual word corresponds to the audible word; and
  audibly spelling the textual word.
Dependent claims: 3
4. A method for relating words in an audio file to words in a text file, comprising:
  retrieving a text file comprising a plurality of textual words;
  converting the plurality of textual words to a plurality of audible words, each audible word comprising media stream packets;
  storing information relating each audible word to a corresponding textual word, wherein the information comprises a plurality of electronic markers embedded in the audio file;
  transmitting the audible words to a telecommunication device associated with a user in real time as the audible words are generated;
  removing the electronic markers from the audio file during playback;
  tracking the number of words played by counting the number of electronic markers removed;
  during the playing of the audible words, determining a current textual word corresponding to the audible word currently being played.
Dependent claims: 5, 6, 7
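As an illustration of the real-time variant in claim 4, the sketch below streams packets word by word and tracks the current word on the receiving side. It is only a sketch under stated assumptions: the `Packet` tuple layout, `synthesize_stream`, and `PlaybackTracker` are hypothetical names, one marker packet is assumed to precede each audible word, and the "audio" of a word is just its UTF-8 bytes in place of real media stream payloads.

```python
from typing import Iterator, List, Optional, Tuple

# Hypothetical packet layout: ("marker", b"") or ("media", payload).
Packet = Tuple[str, bytes]


def synthesize_stream(words: List[str], packet_size: int = 4) -> Iterator[Packet]:
    """Yield packets as each word is generated (stand-in for a TTS engine)."""
    for word in words:
        yield ("marker", b"")  # assumed: one marker ahead of each audible word
        audio = word.encode("utf-8")  # fake audio payload
        for start in range(0, len(audio), packet_size):
            yield ("media", audio[start:start + packet_size])


class PlaybackTracker:
    """Removes markers from the stream and counts them to follow playback."""

    def __init__(self, words: List[str]) -> None:
        self.words = words
        self.markers_removed = 0
        self.played = bytearray()  # media that actually reaches the listener

    def receive(self, packet: Packet) -> None:
        kind, payload = packet
        if kind == "marker":
            self.markers_removed += 1  # marker is dropped, never played
        else:
            self.played.extend(payload)

    def current_word(self) -> Optional[str]:
        if self.markers_removed == 0:
            return None
        return self.words[self.markers_removed - 1]


words = ["please", "call", "Nguyen"]
tracker = PlaybackTracker(words)
for packet in list(synthesize_stream(words))[:5]:  # stop partway through playback
    tracker.receive(packet)
```

Because the generator yields packets lazily, transmission can begin before the whole text has been converted, which is the property the "in real time as the audible words are generated" limitation describes.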
8. A method for relating words in an audio file to words in a text file, comprising:
  retrieving a text file comprising a textual word;
  converting the textual word to an audible word, the audible word comprising media stream packets;
  storing an identifier for the textual word;
  repeating the steps of the method for a plurality of textual words in the text file to generate an audio file of a plurality of audible words;
  storing information relating each audible word to a corresponding textual word, wherein the information comprises a plurality of electronic markers embedded in the audio file;
  transmitting the audio file to a telecommunication device operable to play the audio word to a user;
  removing the electronic markers from the audio file during playback; and
  tracking the number of words played by counting the number of electronic markers removed.
Dependent claims: 9, 10
11. A method for audibly spelling a word in an audio file, comprising:
  retrieving a text file comprising a textual word;
  converting the textual word to an audible word, the audible word comprising media stream packets;
  storing the audible word in an audio file, the audio file comprising a plurality of audible words converted from a plurality of textual words and a plurality of electronic markers embedded in the audio file;
  playing the audio file to a user;
  removing the electronic markers from the audio file during playback;
  tracking the number of words played by counting the number of electronic markers removed;
  receiving from the user a voice command to spell an audible word in the audio file;
  in response to the voice command, using the number of electronic markers removed to identify in a text file a textual word corresponding to the audible word; and
  audibly spelling the textual word.
Dependent claims: 12, 13, 14
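The step in claim 11 that uses the marker count to pick the word to spell can be sketched as follows. This is an illustrative toy, not the patented implementation: `MARKER`, `build_audio`, and `SpellingPlayer` are hypothetical names, each word is represented by one marker plus one placeholder frame, the voice command is assumed to arrive already recognized as the string `"spell"`, and words in the text file are assumed to be whitespace-separated.

```python
MARKER = object()  # hypothetical in-band marker; a real format would define its own


def build_audio(text):
    """Stand-in TTS: one marker plus one fake audio frame per word."""
    frames = []
    for word in text.split():
        frames.append(MARKER)
        frames.append(f"<audio:{word}>")
    return frames


class SpellingPlayer:
    def __init__(self, text, frames):
        self.text = text
        self.frames = frames
        self.cursor = 0
        self.markers_removed = 0
        self.output = []  # everything the listener hears, in order

    def play_next_word(self):
        """Play through the next audible word; return False at end of file."""
        while self.cursor < len(self.frames):
            frame = self.frames[self.cursor]
            self.cursor += 1
            if frame is MARKER:
                self.markers_removed += 1  # strip the marker and count it
                continue
            self.output.append(frame)
            return True
        return False

    def on_voice_command(self, command):
        """On "spell", use the marker count to find and spell the current word."""
        if command != "spell" or self.markers_removed == 0:
            return
        word = self.text.split()[self.markers_removed - 1]
        letters = [ch.upper() for ch in word if ch.isalnum()]
        self.output.append(" ".join(letters))


text = "Meet Ms. Okonkwo at noon"
player = SpellingPlayer(text, build_audio(text))
for _ in range(3):
    player.play_next_word()
player.on_voice_command("spell")
```

The marker count alone is enough to index into the text here because the sketch emits exactly one marker per word; a scheme with a different marker density would need a correspondingly different lookup.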
15. An interactive voice response server (IVR), comprising:
  an interface operable to:
    play an audio file to a user, the audio file comprising a plurality of audible words converted from a plurality of textual words and a plurality of electronic markers embedded in the audio file; and
    receive a voice command to spell an audible word in the audio file from the user; and
  a processor operable to:
    remove the electronic markers from the audio file during playback;
    track the number of words played by counting the number of electronic markers removed;
    identify an audible word to be spelled in response to the voice command to spell;
    in response to the voice command, identify a textual word in a text file corresponding to the audible word to be spelled; and
    audibly spell the textual word.
Dependent claims: 16, 17, 18
19. A computer readable medium encoded with logic capable of being executed by a processor to perform the steps of:
  retrieving a text file comprising a textual word;
  converting the textual word to an audible word, the audible word comprising media stream packets;
  playing an audio file to a user, the audio file comprising a plurality of audible words converted from a plurality of textual words and a plurality of electronic markers embedded in the audio file;
  removing the electronic markers from the audio file during playback;
  tracking the number of words played by counting the number of electronic markers removed;
  receiving from the user a voice command to spell an audible word in the audio file;
  in response to the voice command, identifying in a text file a textual word corresponding to the audible word; and
  audibly spelling the textual word.
Dependent claims: 20, 21, 22
23. A computer readable medium encoded with logic capable of being executed by a processor to perform the steps of:
  selecting a textual word in a text file;
  converting the textual word to an audible word;
  storing the audible word in an audio file;
  storing a file map, the file map comprising:
    a first location locating the audible word within the audio file; and
    a second location locating the textual word within the text file; and
  transmitting the audio file to a telecommunication device operable to play the audio file to a user;
  removing the electronic markers from the audio file during playback;
  tracking the number of words played by counting the number of electronic markers removed;
  receiving a voice command from a user to spell the audible word;
  determining that the textual word corresponds to the audible word; and
  audibly spelling the textual word.
Dependent claims: 24
25. A method for synchronizing audible words with textual words in a text file, comprising:
  retrieving a text file comprising a plurality of textual words;
  generating a plurality of audio files by converting the plurality of textual words to a plurality of audible words, each audio file comprising an audible word corresponding to one of the textual words;
  for each audio file, storing information relating the audio file to the corresponding textual word, the information comprising an electronic marker within the audio file that indicates the position of the audible word within the text file user;
  removing the electronic markers from the audio file during playback; and
  tracking the number of words played by counting the number of electronic markers removed.
Dependent claims: 26
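Claim 25 describes one audio file per word, each carrying a marker that indicates the word's position in the text file. A rough sketch of that layout is shown below. The encoding is purely an assumption for illustration: the marker is a 4-byte big-endian character offset prepended to the audio, the audio itself is faked with the word's UTF-8 bytes, and `make_word_files` and `play` are hypothetical names.

```python
import re
import struct

MARKER_FMT = ">I"  # assumed marker encoding: 4-byte big-endian text offset
MARKER_SIZE = struct.calcsize(MARKER_FMT)


def make_word_files(text):
    """Build one in-memory 'audio file' per word: marker followed by fake audio."""
    files = []
    for match in re.finditer(r"\S+", text):
        marker = struct.pack(MARKER_FMT, match.start())
        fake_audio = match.group().encode("utf-8")  # stand-in for synthesized audio
        files.append(marker + fake_audio)
    return files


def play(files):
    """Strip the marker from each file, count the markers, and keep the audio."""
    markers_removed = 0
    positions = []  # text-file offsets recovered from the markers
    heard = []      # audio that reaches the listener, marker-free
    for blob in files:
        (offset,) = struct.unpack(MARKER_FMT, blob[:MARKER_SIZE])
        markers_removed += 1
        positions.append(offset)
        heard.append(blob[MARKER_SIZE:])
    return markers_removed, positions, heard


text = "spell the word Nguyen"
count, positions, heard = play(make_word_files(text))
```

With the offset stored in the marker itself, the player can locate the textual word directly from the last marker it removed, and the running count still gives the number of words played.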
27. A system for spelling words in an audio file, comprising:
  means for playing an audio file to a user, the audio file comprising a plurality of audible words converted from a plurality of textual words;
  means for removing the electronic markers from the audio file during playback;
  means for tracking the number of words played by counting the number of electronic markers removed;
  means for receiving from the user a voice command to spell an audible word in the audio file;
  means for identifying in a text file a textual word corresponding to the audible word in response to the voice command; and
  means for audibly spelling the textual word.
28. A method for relating words in an audio file to words in a text file, comprising:
  retrieving a text file comprising a plurality of textual words;
  generating an audio file by converting the plurality of textual words to a plurality of audible words;
  storing information relating each audible word to a corresponding textual word, wherein the information comprises a plurality of electronic markers embedded in the audio file;
  transmitting the audio file to a telecommunication device operable to play the audio file to a user;
  removing the electronic markers from the audio file during playback; and
  tracking the number of words played by counting the number of electronic markers removed.
Dependent claims: 29, 30, 31, 32
Specification