Method and system for providing audio playback of a multi-source document
Abstract
A multi-source input and playback utility that accepts inputs from various sources, transcribes the inputs as text, and plays aloud user-selected portions of the text is disclosed. The user may select a portion of the text and request audio playback thereof. The utility examines each transcribed word in the selected text. If stored audio data is associated with a given word, that audio data is retrieved and played. If no audio data is associated, then a text-to-speech entry or series of entries is retrieved and played instead.
139 Citations
30 Claims
1. A method for linking audio to text in a multi-source input and playback system, said method comprising the steps of:
dictating one or more words;
transcribing the one or more words to form a first text set within a document;
storing the first text set on a storage medium;
comparing each audio element of a stored audio version of the one or more words with each corresponding text element of the first text set;
inserting second text into the document, wherein the second text is non-audio text;
associating a text-to-speech entry with the second text; and
forming a continuous stream of audio from (1) stored audio data corresponding to the first text set and (2) the text-to-speech entry corresponding to the second text.
9. A computer-implemented method for creating and vocalizing a document, comprising the steps of:
speaking one or more words into an input device;
transcribing the one or more words as a first text entry within a document;
storing the one or more words on a storage medium;
comparing each word of the one or more words with each word of said first text entry;
inputting a second text entry within the document, wherein the step of inputting the second text entry does not comprise speaking;
assigning a text-to-speech entry to said second text entry; and
playing back the one or more words and the text-to-speech entry in an order corresponding to a placement of the first and second entries within said document.
displaying the document on a display screen; and
shading the text-to-speech entry.
14. The method of claim 9, wherein a cessation of the first text entry and a beginning of the second text entry is signaled by a non-alphanumeric character.
15. The method of claim 9, wherein the first and second text entries comprise pictographic characters.
16. The method of claim 15, wherein the pictographic characters are Kanji characters.
17. A computer configured for performing the method of claim 9.
18. The method of claim 9, wherein a shape of a letter within a word of the first text entry, the second text entry, or both, varies depending on a location of the letter within the word.
19. The method of claim 9, wherein the first and second text entries are read from right to left.
20. The method of claim 9, wherein the second text entry is inputted by one or more of (a) typing the second text entry into the document using a keyboard, (b) copying the second text entry into the document using a mouse, and (c) handwriting text which is converted to the second text entry using a handwriting recognition program module.
21. A computer-implemented method for providing audio playback of a text document, comprising the steps of:
selecting a text set comprising at least one word, wherein each word comprises at least one phoneme;
determining whether a user-dictated audio input corresponds to a first word of the text set;
in the event that a user-dictated audio input corresponds to the first word, playing the user-dictated audio input through an audio output device;
otherwise, determining whether one of a plurality of text-to-speech entries corresponds to the first word;
in the event that a text-to-speech entry corresponds to the first word, playing the text-to-speech entry through an audio output device;
otherwise, determining which of the plurality of text-to-speech entries corresponds to a phoneme of the first word; and
in response to determining which of the plurality of text-to-speech entries corresponds to the phoneme of the first word, playing the corresponding text-to-speech entry through an audio output device.
22. The method of claim 21, wherein:
the text set comprises a plurality of words;
the first word corresponds to a user-dictated audio input; and
a second word within the plurality of words corresponds to a text-to-speech entry.
23. The method of claim 22, further comprising:
playing back the user-dictated audio input and the text-to-speech entry in an order corresponding to a placement of the first and second words in the text set.
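The fallback order recited in claim 21, together with the ordered playback of claim 23, can be illustrated with a minimal Python sketch. It is not taken from the patent: the `Token` type, the `clips_for` and `playback_order` functions, and the use of plain strings to stand in for audio clips are all assumptions made for illustration. The sketch also assumes every phoneme has a text-to-speech entry; a missing phoneme would raise a `KeyError`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Token:
    """One word of the selected text set (hypothetical structure)."""
    text: str
    phonemes: List[str]
    dictated: Optional[str] = None  # user-dictated clip, if one was stored


def clips_for(token: Token,
              tts_words: Dict[str, str],
              tts_phonemes: Dict[str, str]) -> List[str]:
    """Pick the clip(s) for one word using the three-level fallback."""
    # 1. Prefer the user-dictated audio input when it exists.
    if token.dictated is not None:
        return [token.dictated]
    # 2. Otherwise use a whole-word text-to-speech entry, if any.
    if token.text in tts_words:
        return [tts_words[token.text]]
    # 3. Otherwise assemble the word from per-phoneme entries.
    return [tts_phonemes[p] for p in token.phonemes]


def playback_order(tokens: List[Token],
                   tts_words: Dict[str, str],
                   tts_phonemes: Dict[str, str]) -> List[str]:
    """Return clips in the order the words appear in the text set."""
    queue: List[str] = []
    for token in tokens:
        queue.extend(clips_for(token, tts_words, tts_phonemes))
    return queue


sample = [
    Token("hello", ["HH", "AH", "L", "OW"], dictated="rec:hello"),
    Token("world", ["W", "ER", "L", "D"]),
    Token("zix", ["Z", "IH", "K", "S"]),
]
word_entries = {"world": "tts:world"}
phoneme_entries = {"Z": "ph:Z", "IH": "ph:IH", "K": "ph:K", "S": "ph:S"}
queue = playback_order(sample, word_entries, phoneme_entries)
```

In this sample the first word plays from the dictated recording, the second from a whole-word text-to-speech entry, and the third from four phoneme entries, all in text order.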
24. The method of claim 21, further comprising:
determining a plurality of words for which no corresponding user dictated audio input exists;
passing the plurality of words to a text-to-speech module; and
retrieving a text-to-speech entry for each of the plurality of words.
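The three steps of claim 24 can be sketched as a single batch lookup. The code below is illustrative only: `fetch_tts_entries`, the index set `has_dictation`, and the `tts_module` callable (assumed to take a list of words and return one entry per word, in order) are hypothetical names, not an interface described in the source.

```python
from typing import Callable, Dict, List, Set


def fetch_tts_entries(words: List[str],
                      has_dictation: Set[int],
                      tts_module: Callable[[List[str]], List[str]]) -> Dict[int, str]:
    """Map each word position lacking dictated audio to a TTS entry."""
    # Step 1: determine the words for which no dictated audio exists.
    missing = [(i, w) for i, w in enumerate(words) if i not in has_dictation]
    # Step 2: pass those words to the text-to-speech module as one batch.
    entries = tts_module([w for _, w in missing])
    if len(entries) != len(missing):
        raise ValueError("text-to-speech module returned a different number of entries")
    # Step 3: retrieve one entry per word, keyed by position in the text.
    return {i: entry for (i, _), entry in zip(missing, entries)}


calls: List[List[str]] = []


def fake_tts(batch: List[str]) -> List[str]:
    """Stand-in module that records each batch it receives."""
    calls.append(list(batch))
    return ["tts:" + w for w in batch]


result = fetch_tts_entries(["please", "call", "Dr.", "Smith"], {0, 1}, fake_tts)
```

Keying the result by position rather than by word string is a design choice of this sketch: the same word may be dictated in one place and typed in another.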
25. A computer configured for performing the method of claim 21.
26. A method for compiling and evaluating text within a document, said method comprising the steps of:
inputting dictated words into a document;
converting said dictated words into a first text set within said document by use of a voice recognition process;
storing said dictated words separately but linked to said first text set for later audio playback;
inputting non-audio text into said document as a second text set within said document, wherein said non-audio text is inputted by one or more of (a) typing the non-audio text into the document using a keyboard, (b) copying the non-audio text into the document using a mouse, and (c) handwriting text which is converted to the non-audio text using a handwriting recognition program module; and
playing back audio corresponding to said first and second text sets in an order corresponding to a placement of said first and second text sets within said document, wherein a first portion of said audio corresponding to said first text set is provided by playback of said stored dictated words, and a second portion of said audio corresponding to said second text set is provided by playback of a text-to-speech process.
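One way to picture "stored separately but linked" in claim 26 is a document whose text sets hold only an identifier into a separate audio store. The following Python sketch rests on that assumption; the `TextSet` and `Document` classes, the integer `audio_id` link, and the `tts` callable are hypothetical and are not drawn from the specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class TextSet:
    """A run of document text; audio_id links to stored dictation, if any."""
    text: str
    audio_id: Optional[int] = None  # None marks non-audio (typed, pasted, handwritten) text


class Document:
    def __init__(self) -> None:
        self.text_sets: List[TextSet] = []
        # Dictated audio lives apart from the text and is reached via audio_id.
        self.audio_store: Dict[int, bytes] = {}

    def insert_dictated(self, index: int, text: str, recording: bytes) -> None:
        """Store the recording separately and link it to the recognized text."""
        audio_id = len(self.audio_store)
        self.audio_store[audio_id] = recording
        self.text_sets.insert(index, TextSet(text, audio_id))

    def insert_non_audio(self, index: int, text: str) -> None:
        """Insert text that has no recording behind it."""
        self.text_sets.insert(index, TextSet(text))

    def playback(self, tts: Callable[[str], bytes]) -> List[bytes]:
        """Return audio in document order, mixing recordings and TTS output."""
        return [
            self.audio_store[ts.audio_id] if ts.audio_id is not None else tts(ts.text)
            for ts in self.text_sets
        ]


doc = Document()
doc.insert_dictated(0, "Dear John,", b"rec-1")
doc.insert_dictated(1, "see you soon.", b"rec-2")
doc.insert_non_audio(1, "thanks for the report;")  # typed between the dictated runs
stream = doc.playback(lambda text: b"tts:" + text.encode())
```

Because playback walks the text sets rather than the audio store, typed text inserted between two dictated runs is spoken at its place in the document, not in the order it was entered.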
Specification