Intelligent text-to-speech conversion
Abstract
Techniques for improved text-to-speech processing are disclosed. The improved text-to-speech processing can convert text from an electronic document into an audio output that includes speech associated with the text as well as audio contextual cues. One aspect provides audio contextual cues to the listener when outputting speech (spoken text) pertaining to a document. The audio contextual cues can be based on an analysis of a document prior to a text-to-speech conversion. Another aspect can produce an audio summary for a file. The audio summary for a document can thereafter be presented to a user so that the user can hear a summary of the document without having to process the document to produce its spoken text via text-to-speech conversion.
25 Claims
1. A method for converting text to speech, the method comprising:
at an electronic device with a processor and memory storing one or more programs for execution by the processor:
parsing a document to identify a plurality of text elements in the document to be converted to speech, wherein in the document, a first text element of the plurality of text elements is positioned before a second text element of the plurality of text elements;
determining, by the processor, an order in which the plurality of text elements are to be spoken, wherein the determined order comprises speaking the second text element before the first text element; and
converting the plurality of text elements to speech, wherein the speech is spoken in the determined order.
Dependent claims: 2, 3, 4, 5, 6, 7
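The three steps of claim 1 (parse, determine a spoken order that differs from document order, convert) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the patent: the `kind: text` input format, the `SPEAK_PRIORITY` table used as the reordering rule, and the names `TextElement`, `parse_document`, `determine_order`, and `convert_to_speech`. The conversion step is a stand-in that returns the text in spoken order rather than calling a real speech engine.

```python
from dataclasses import dataclass

# Hypothetical reordering rule: a lower number is spoken earlier.
# The claim only requires that some later element be spoken before an earlier one.
SPEAK_PRIORITY = {"title": 0, "summary": 1, "body": 2, "footnote": 3}


@dataclass(frozen=True)
class TextElement:
    kind: str      # assumed element type, e.g. "title" or "body"
    text: str      # the text to be spoken
    position: int  # index of the element in document order


def parse_document(raw: str) -> list:
    """Parse lines of the assumed form 'kind: text' into text elements."""
    elements = []
    for line in raw.splitlines():
        if ":" not in line:
            continue  # skip lines that do not match the assumed format
        kind, text = line.split(":", 1)
        elements.append(TextElement(kind.strip().lower(), text.strip(), len(elements)))
    return elements


def determine_order(elements: list) -> list:
    """Sort by priority first, then by document position to keep ties stable."""
    fallback = len(SPEAK_PRIORITY)  # unknown kinds are spoken last
    return sorted(elements, key=lambda e: (SPEAK_PRIORITY.get(e.kind, fallback), e.position))


def convert_to_speech(elements: list) -> list:
    """Stand-in for a speech engine: return the utterances in spoken order."""
    return [e.text for e in determine_order(elements)]


doc = (
    "summary: Quarterly results improved.\n"
    "title: Q3 Report\n"
    "body: Revenue rose in all regions."
)
elements = parse_document(doc)
spoken = convert_to_speech(elements)
print(spoken)
```

In this sample the summary is positioned before the title in the document, yet the title is spoken first, which is the reordering the claim describes.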
8. A method for converting text to speech, the method comprising:
at an electronic device with a processor and memory storing one or more programs for execution by the processor:
parsing a document to identify a subset of text to be converted to speech, the subset of text having a context;
creating an announcement comprising a spoken description of the context;
determining, by the processor, an order in which the announcement and a spoken form of the subset of text are to be spoken, wherein the determined order comprises speaking the announcement prior to the spoken form of the subset of text; and
generating audio that includes the spoken form of the subset of text and the announcement, wherein the announcement is spoken prior to the spoken form of the subset of text.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16
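Claim 8 can be sketched in the same spirit: identify a subset of text with a context, create an announcement describing that context, and place the announcement before the text in the output. The details below are assumptions for illustration only: treating `>`-prefixed lines as a "quoted text" context, the wording of the announcement, and the names `Segment`, `parse_segments`, `make_announcement`, and `generate_audio_script`. The output is a list of utterances standing in for generated audio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    context: str  # assumed context label, e.g. "quoted text" or "body"
    text: str     # the text to be spoken


def parse_segments(raw: str) -> list:
    """Group consecutive lines by context; '>'-prefixed lines count as quoted text."""
    segments = []
    for line in raw.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith(">"):
            context, text = "quoted text", stripped.lstrip("> ").strip()
        else:
            context, text = "body", stripped
        if segments and segments[-1].context == context:
            # Merge with the previous segment when the context is unchanged.
            previous = segments.pop()
            segments.append(Segment(context, previous.text + " " + text))
        else:
            segments.append(Segment(context, text))
    return segments


def make_announcement(context: str) -> str:
    """Create a spoken description of the context (wording is hypothetical)."""
    return f"Beginning of {context}."


def generate_audio_script(segments: list) -> list:
    """Emit each announcement immediately before the text it describes."""
    script = []
    for segment in segments:
        if segment.context != "body":  # assumption: plain body text needs no announcement
            script.append(make_announcement(segment.context))
        script.append(segment.text)
    return script


message = (
    "Thanks for the update.\n"
    "> The build passed.\n"
    "> Deploy is at noon.\n"
    "See you then."
)
script = generate_audio_script(parse_segments(message))
print(script)
```

A real system would pass each entry of `script` to a speech engine; the list form is used here only to make the ordering easy to inspect.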
17. A non-transitory computer-readable storage medium comprising instructions for causing one or more processors to:
parse a document to identify a plurality of text elements in the document to be converted to speech, wherein in the document, a first text element of the plurality of text elements is positioned before a second text element of the plurality of text elements;
determine, by the one or more processors, an order in which the plurality of text elements are to be spoken, wherein the determined order comprises speaking the second text element before the first text element; and
convert the plurality of text elements to speech, wherein the speech is spoken in the determined order.
18. An electronic device comprising:
one or more processors;
memory;
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for:
parsing a document to identify a plurality of text elements in the document to be converted to speech, wherein in the document, a first text element of the plurality of text elements is positioned before a second text element of the plurality of text elements;
determining, by the one or more processors, an order in which the plurality of text elements are to be spoken, wherein the determined order comprises speaking the second text element before the first text element; and
converting the plurality of text elements to speech, wherein the speech is spoken in the determined order.
Dependent claims: 19, 20, 21
22. An electronic device comprising:
one or more processors;
memory;
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for:
parsing a document to identify a subset of text to be converted to speech, the subset of text having a context;
creating an announcement comprising a spoken description of the context;
determining, by the one or more processors, an order in which the announcement and a spoken form of the subset of text are to be spoken, wherein the determined order comprises speaking the announcement prior to the spoken form of the subset of text; and
generating audio that includes the spoken form of the subset of text and the announcement, wherein the announcement is spoken prior to the spoken form of the subset of text.
Dependent claims: 23, 24, 25
Specification