INTELLIGENT TEXT-TO-SPEECH CONVERSION
Abstract
Techniques for improved text-to-speech processing are disclosed. The improved text-to-speech processing can convert text from an electronic document into an audio output that includes speech associated with the text as well as audio contextual cues. One aspect provides audio contextual cues to the listener when outputting speech (spoken text) pertaining to a document. The audio contextual cues can be based on an analysis of a document prior to a text-to-speech conversion. Another aspect can produce an audio summary for a file. The audio summary for a document can thereafter be presented to a user so that the user can hear a summary of the document without having to process the document to produce its spoken text via text-to-speech conversion.
53 Claims
1-33. (canceled)
34. A method for converting text to speech, the method comprising:
at an electronic device with a processor and memory storing one or more programs for execution by the processor:
parsing a document to identify a plurality of elements in the document;
associating a first element of the plurality of elements with a first markup tag and a second element of the plurality of elements with a second markup tag;
creating an announcement comprising a spoken description of context for the first element; and
based on the first markup tag and the second markup tag, generating audio that includes the announcement and a spoken form of text of the second element, wherein the announcement is spoken prior to the spoken form of the text of the second element.
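The steps recited in claim 34 can be illustrated with a minimal Python sketch. It is one possible reading of the claim language, not the implementation described in the specification. Several things are assumptions made only for illustration: the document is treated as HTML, the `ANNOUNCEMENTS` table and the names `Element`, `ElementCollector`, `parse_document`, and `build_speech_script` are hypothetical, and the "audio" is modeled as an ordered list of text segments that a speech synthesizer would later render.

```python
from dataclasses import dataclass
from html.parser import HTMLParser

# Hypothetical mapping from a markup tag to a spoken description of context.
ANNOUNCEMENTS = {
    "h1": "Heading.",
    "a": "Link.",
    "blockquote": "Quotation.",
}


@dataclass
class Element:
    tag: str   # markup tag associated with this element
    text: str  # text content of the element


class ElementCollector(HTMLParser):
    """Parse a document and pair each run of text with its enclosing tag."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self._open_tags = []

    def handle_starttag(self, tag, attrs):
        self._open_tags.append(tag)

    def handle_endtag(self, tag):
        if self._open_tags and self._open_tags[-1] == tag:
            self._open_tags.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self._open_tags:
            self.elements.append(Element(self._open_tags[-1], text))


def parse_document(markup):
    """Identify the elements in the document and their markup tags."""
    collector = ElementCollector()
    collector.feed(markup)
    collector.close()
    return collector.elements


def build_speech_script(elements):
    """Order the segments so an announcement precedes the text that follows it."""
    script = []
    for element in elements:
        announcement = ANNOUNCEMENTS.get(element.tag)
        if announcement is not None:
            script.append(("announcement", announcement))
        script.append(("speech", element.text))
    return script


if __name__ == "__main__":
    doc = "<h1>Quarterly Report</h1><p>Revenue rose.</p>"
    for kind, text in build_speech_script(parse_document(doc)):
        print(kind, text)
```

In this sketch the heading is the first element and the paragraph is the second. The announcement "Heading." is placed in the script ahead of the paragraph's text, which mirrors the ordering requirement in the final clause of the claim. A real system would pass each segment to a text-to-speech engine instead of printing it.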
44. A non-transitory computer-readable storage medium comprising instructions, which when executed by an electronic device, causes the electronic device to:
parse a document to identify a plurality of elements in the document;
associate a first element of the plurality of elements with a first markup tag and a second element of the plurality of elements with a second markup tag;
create an announcement comprising a spoken description of context for the first element; and
based on the first markup tag and the second markup tag, generate audio that includes the announcement and a spoken form of text of the second element, wherein the announcement is spoken prior to the spoken form of the text of the second element.
49. An electronic device, comprising:
one or more processors; and
memory storing one or more programs, the one or more programs including instructions, which when executed by the one or more processors, causes the one or more processors to:
parse a document to identify a plurality of elements in the document;
associate a first element of the plurality of elements with a first markup tag and a second element of the plurality of elements with a second markup tag;
create an announcement comprising a spoken description of context for the first element; and
based on the first markup tag and the second markup tag, generate audio that includes the announcement and a spoken form of text of the second element, wherein the announcement is spoken prior to the spoken form of the text of the second element.
Specification