Providing text to speech from digital content on an electronic device
Abstract
A method for providing text to speech from digital content in an electronic device is described. Digital content including a plurality of words and a pronunciation database is received. Pronunciation instructions are determined for the word using the digital content. Audio or speech is played for the word using the pronunciation instructions. As a result, the method provides text to speech on the electronic device based on the digital content.
24 Claims
1. A method for providing audio relating to digital content in an electronic device, comprising:
- receiving digital content comprising a plurality of words and a supplemental pronunciation database of specified pronunciations for a portion of the plurality of words;
- determining supplemental pronunciation instructions for a word of the plurality of words based at least in part on the supplemental pronunciation database;
- determining default pronunciation instructions for another word of the plurality of words based at least in part on default pronunciation instructions in a default pronunciation database accessible by the electronic device;
- determining that specified voice information used for synthesizing speech in a specified voice is specified for one or more of the plurality of words, wherein default voice information is used for synthesizing speech in a default voice in the absence of specified voice information; and
- synthesizing speech for the plurality of words using the supplemental pronunciation instructions, the default pronunciation instructions, and at least one of the specified voice or the default voice.
Dependent claims: 2, 3, 4, 5, 6, 7
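The per-word resolution recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `DigitalContent` class, the `plan_speech` function, the phoneme strings, and the voice names are all hypothetical, and falling back to the bare word when neither database has an entry is an assumption the claim does not address.

```python
from dataclasses import dataclass, field

# Hypothetical device-side defaults; phoneme strings are illustrative only.
DEFAULT_PRONUNCIATIONS = {"the": "DH AH", "wizard": "W IH Z ER D", "said": "S EH D"}
DEFAULT_VOICE = "default-voice"


@dataclass
class DigitalContent:
    words: list[str]
    # Supplemental pronunciations cover only a portion of the words.
    supplemental: dict[str, str] = field(default_factory=dict)
    # Maps a word index to a specified voice; other indices use the default voice.
    voices: dict[int, str] = field(default_factory=dict)


def plan_speech(content: DigitalContent) -> list[tuple[str, str, str]]:
    """Return (word, pronunciation, voice) triples in reading order."""
    plan = []
    for i, word in enumerate(content.words):
        key = word.lower()
        if key in content.supplemental:
            # Supplemental entry received with the content takes priority.
            pron = content.supplemental[key]
        else:
            # Otherwise use the default database (assumption: unknown words
            # fall back to the bare word itself).
            pron = DEFAULT_PRONUNCIATIONS.get(key, key)
        # Specified voice if present for this word, default voice otherwise.
        voice = content.voices.get(i, DEFAULT_VOICE)
        plan.append((word, pron, voice))
    return plan


content = DigitalContent(
    words=["The", "wizard", "Hermione", "said"],
    supplemental={"hermione": "HH ER M AY AH N IY"},
    voices={2: "narrator-f"},
)
for triple in plan_speech(content):
    print(triple)
```

The resulting triples would then be handed to whatever speech synthesizer the device provides; that step is outside this sketch.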
8. An electronic device that is configured to provide audio relating to digital content, the electronic device comprising:
- a default pronunciation database; and
- instructions stored in memory, the instructions being executable to:
  - receive digital content comprising a plurality of words and a supplemental pronunciation database that provides pronunciations for one or more of the plurality of words, wherein the supplemental pronunciation database is used with the digital content received in a same data structure as the supplemental pronunciation database and not with other digital content;
  - for a first word for which the supplemental pronunciation database includes pronunciation instructions, synthesize a first speech for the first word based at least in part on the pronunciation instructions in the supplemental pronunciation database;
  - for a second word for which the supplemental pronunciation database lacks pronunciation instructions, synthesize a second speech for the second word based at least in part on pronunciation instructions in the default pronunciation database;
  - for a third word for which a specified voice is specified, synthesize a third speech for the third word based at least in part on the specified voice; and
  - for a fourth word for which a specified voice is not specified, synthesize a fourth speech for the fourth word based at least in part on a default voice.
Dependent claims: 9, 10
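Claim 8 adds a scoping limitation: the supplemental database applies only to the content it arrived with, in the same data structure. The following Python sketch shows one way that scoping could look. `ContentPackage` and `Device` are hypothetical names, voice selection is left out for brevity, and the phoneme strings are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class ContentPackage:
    """Hypothetical container: content and its supplemental database travel together."""
    title: str
    words: tuple[str, ...]
    supplemental: dict[str, str] = field(default_factory=dict)


class Device:
    def __init__(self, default_db: dict[str, str]):
        # The default database is shared by all content on the device.
        self.default_db = default_db

    def pronounce(self, package: ContentPackage) -> list[str]:
        # Only the database bundled with this package is consulted first;
        # entries from any other package are never visible here.
        return [
            package.supplemental.get(w, self.default_db.get(w, w))
            for w in package.words
        ]


device = Device({"lead": "L IY D", "pipe": "P AY P"})
chemistry = ContentPackage("Chemistry", ("lead", "pipe"), {"lead": "L EH D"})
management = ContentPackage("Management", ("lead",))

print(device.pronounce(chemistry))   # supplemental entry overrides "lead"
print(device.pronounce(management))  # no supplemental entry, default is used
```

Because the supplemental dictionary lives on the package object rather than on the device, the same word can be pronounced differently in two items without either affecting the other.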
11. A server configured to enhance digital content, comprising:
- a database of digital content, wherein the digital content comprises a digital content item having a plurality of words;
- a default pronunciation database comprising default pronunciation instructions for synthesizing speech;
- specified voice information for synthesizing speech based at least in part on a specified voice;
- a supplemental pronunciation database comprising pronunciation instructions for synthesizing speech for one or more of the plurality of words, wherein the pronunciation instructions are different from the default pronunciation instructions; and
- a digital content enhancement module configured to generate enhanced digital content by appending the supplemental pronunciation database and the specified voice information to the digital content in a same data structure, such that sending of the enhanced digital content to a computing device causes the computing device to:
  - synthesize a first speech based at least in part on the supplemental pronunciation database for a first one of the one or more of the plurality of words which have pronunciations in the supplemental pronunciation database;
  - synthesize a second speech based at least in part on a default pronunciation database for a second one of the one or more of the plurality of words which do not have pronunciations in the supplemental pronunciation database;
  - synthesize a third speech based at least in part on the specified voice for a third one of the one or more of the plurality of words which are specified to be synthesized with the specified voice; and
  - synthesize a fourth speech based at least in part on a default voice for a fourth one of the one or more of the plurality of words for which a voice is not specified.
Dependent claims: 12
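The enhancement module of claim 11 appends the supplemental database and the voice information to the content in a single data structure. A rough Python sketch of that bundling step follows. The `enhance` function, the JSON layout, and the field names are assumptions made for illustration; the claim does not specify a serialization format. Filtering the supplemental entries down to words that appear in the item and differ from the default is likewise one possible reading, not a stated requirement.

```python
import json


def enhance(content_item: dict, supplemental_db: dict,
            voice_info: dict, default_db: dict) -> str:
    """Bundle content, supplemental pronunciations, and voice info into one JSON document."""
    words = {w.lower() for w in content_item["words"]}
    # Keep only entries for words in this item whose pronunciation
    # differs from the default database (an assumption of this sketch).
    relevant = {
        w: p for w, p in supplemental_db.items()
        if w in words and default_db.get(w) != p
    }
    enhanced = dict(content_item)  # shallow copy; the stored item is untouched
    enhanced["supplemental_pronunciations"] = relevant
    enhanced["voice_info"] = voice_info
    return json.dumps(enhanced, sort_keys=True)


item = {"title": "Plumbing Basics", "words": ["Lead", "pipe"]}
payload = enhance(
    item,
    supplemental_db={"lead": "L EH D", "pipe": "P AY P", "zinc": "Z IH NG K"},
    voice_info={"voice": "narrator", "word_indices": [0]},
    default_db={"lead": "L IY D", "pipe": "P AY P"},
)
print(payload)
```

The receiving device would parse this single payload and apply the supplemental and voice entries only to the item they were sent with.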
13. A non-transitory computer-readable medium comprising executable instructions for:
- receiving an electronic book comprising a plurality of words, a supplemental pronunciation database, and a specified voice;
- for a first word in the plurality of words that has pronunciation instructions included in the supplemental pronunciation database, synthesizing a first speech for the first word based at least in part on the pronunciation instructions from the supplemental pronunciation database;
- for a second word in the plurality of words that does not have pronunciation instructions included in the supplemental pronunciation database, synthesizing a second speech for the second word based at least in part on a default pronunciation database;
- for a third word in the plurality of words that is specified to be synthesized with the specified voice, synthesizing a third speech for the third word based at least in part on the specified voice; and
- for a fourth word in the plurality of words that is not specified to be synthesized with the specified voice, synthesizing a fourth speech for the fourth word based at least in part on a default voice.
Dependent claims: 14, 15, 16
17. A method for obtaining and rendering audio based on text in an electronic book (eBook), the method comprising:
- sending, from an eBook reader device, a request to download the eBook;
- receiving, at the eBook reader device, the eBook, a supplemental pronunciation database, and specified voice information for synthesizing speech in a specified voice;
- synthesizing a first speech for a first portion of text in the eBook based at least in part on a pronunciation from the supplemental pronunciation database for portions of text which have pronunciations in the supplemental pronunciation database;
- synthesizing a second speech for a second portion of text in the eBook based at least in part on a pronunciation from a default pronunciation database for portions of text which do not have pronunciations in the supplemental pronunciation database;
- synthesizing a third speech for a third portion of text in the eBook based at least in part on the specified voice for portions of text which are specified to be synthesized with the specified voice; and
- synthesizing a fourth speech for a fourth portion of text based at least in part on a default voice for portions of text which do not have any specified voice.
Dependent claims: 18, 19, 20, 21, 22, 23, 24
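Claim 17 works on portions of text rather than individual words. One way to picture the voice-assignment part is to split the downloaded text into portions by voice, as in the Python sketch below. Everything here is hypothetical: `fetch_ebook` is a local stand-in for the download request, and representing specified voice information as non-overlapping `(start, end, voice)` character spans is an assumption of the sketch, not something the claim recites.

```python
DEFAULT_VOICE = "default-voice"


def fetch_ebook(ebook_id: str) -> dict:
    """Stand-in for the download request; a real reader would contact a server."""
    return {
        "text": "Call me Ishmael. Some years ago.",
        # Received alongside the text, as in the claim.
        "supplemental": {"ishmael": "IH SH M EY L"},
        # Hypothetical voice spans: (start, end, voice) character offsets.
        "voice_spans": [(0, 16, "narrator-m")],
    }


def split_by_voice(text: str,
                   spans: list[tuple[int, int, str]]) -> list[tuple[str, str]]:
    """Split text into (portion, voice) pairs; uncovered text gets the default voice.

    Assumes the spans do not overlap.
    """
    portions, pos = [], 0
    for start, end, voice in sorted(spans):
        if pos < start:
            # Gap before this span has no specified voice.
            portions.append((text[pos:start], DEFAULT_VOICE))
        portions.append((text[start:end], voice))
        pos = end
    if pos < len(text):
        # Trailing text after the last span.
        portions.append((text[pos:], DEFAULT_VOICE))
    return portions


book = fetch_ebook("example-id")
for portion, voice in split_by_voice(book["text"], book["voice_spans"]):
    print(repr(portion), "->", voice)
```

Pronunciation lookup within each portion would proceed as for individual words, consulting the supplemental database first and the default database otherwise.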
Specification