Method and System for Enhancing a Speech Database
Abstract
A system, method and computer readable medium that enhances a speech database for speech synthesis is disclosed. The method may include labeling audio files in a primary speech database, identifying segments in the labeled audio files that have varying pronunciations based on language differences, identifying replacement segments in a secondary speech database, enhancing the primary speech database by substituting the identified secondary speech database segments for the corresponding identified segments in the primary speech database, and storing the enhanced primary speech database for use in speech synthesis.
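The substitution step summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Segment` record, its `accent` tag, the `needs_replacement` predicate, and the rule of matching segments by label are all assumptions made for illustration, and storing the enhanced database is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Segment:
    label: str    # unit label, e.g. the name of a half-phone (hypothetical field)
    accent: str   # hypothetical tag for the pronunciation variety
    audio: bytes  # placeholder standing in for waveform samples


def enhance(primary: List[Segment],
            secondary: List[Segment],
            needs_replacement: Callable[[Segment], bool]) -> List[Segment]:
    """Return a copy of `primary` in which flagged segments are swapped for
    same-label segments from `secondary`, when such a segment exists."""
    # Index the secondary database by label, keeping the first candidate per label.
    by_label: Dict[str, Segment] = {}
    for seg in secondary:
        by_label.setdefault(seg.label, seg)

    enhanced: List[Segment] = []
    for seg in primary:
        replacement = by_label.get(seg.label) if needs_replacement(seg) else None
        # Keep the original segment when it is not flagged or has no replacement.
        enhanced.append(replacement if replacement is not None else seg)
    return enhanced


# Toy data: the primary "r" is flagged and replaced; "a" is left untouched.
primary = [Segment("r", "de", b"p-r"), Segment("a", "de", b"p-a")]
secondary = [Segment("r", "en", b"s-r")]
enhanced = enhance(primary, secondary, lambda s: s.label == "r")
```

In this sketch the predicate stands in for whatever test decides that a primary segment does not meet the need of the text-to-speech process; the abstract gives pronunciation differences between languages as one such case.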
20 Claims
1. A method comprising:
receiving text as part of a text-to-speech process;
selecting a speech segment associated with the text, wherein the speech segment is selected from a primary speech database which has been modified by:
    identifying primary speech segments in the primary speech database which do not meet a need of the text-to-speech process, wherein the primary speech segments comprise one of half-phones, half-phonemes, demi-syllables, and polyphones;
    identifying replacement speech segments which satisfy the need in a secondary speech database; and
    enhancing the primary speech database by substituting, in the primary database, the primary speech segments with the replacement speech segments; and
generating speech corresponding to the text using the speech segment.
Dependent claims: 2, 3, 4, 5, 6, 7

8. A system comprising:
a processor; and
a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
    receiving text as part of a text-to-speech process;
    selecting a speech segment associated with the text, wherein the speech segment is selected from a primary speech database which has been modified by:
        identifying primary speech segments in the primary speech database which do not meet a need of the text-to-speech process, wherein the primary speech segments comprise one of half-phones, half-phonemes, demi-syllables, and polyphones;
        identifying replacement speech segments which satisfy the need in a secondary speech database; and
        enhancing the primary speech database by substituting, in the primary database, the primary speech segments with the replacement speech segments; and
    generating speech corresponding to the text using the speech segment.
Dependent claims: 9, 10, 11, 12, 13, 14

15. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
receiving text as part of a text-to-speech process;
selecting a speech segment associated with the text, wherein the speech segment is selected from a primary speech database which has been modified by:
    identifying primary speech segments in the primary speech database which do not meet a need of the text-to-speech process, wherein the primary speech segments comprise one of half-phones, half-phonemes, demi-syllables, and polyphones;
    identifying replacement speech segments which satisfy the need in a secondary speech database; and
    enhancing the primary speech database by substituting, in the primary database, the primary speech segments with the replacement speech segments; and
generating speech corresponding to the text using the speech segment.
Dependent claims: 16, 17, 18, 19, 20

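The runtime steps recited in the independent claims (receive text, select segments from the already modified primary database, generate speech) might look roughly like the following Python sketch. Everything named here is hypothetical: the toy `LEXICON`, the `text_to_units` front end, and the dictionary-based `enhanced_db` are stand-ins, and plain concatenation replaces the cost-based unit selection a real synthesizer would typically use.

```python
from typing import Dict, List

# Toy pronunciation lexicon (hypothetical): word -> sequence of unit labels.
LEXICON: Dict[str, List[str]] = {
    "hi": ["h", "ai"],
    "ray": ["r", "ei"],
}


def text_to_units(text: str) -> List[str]:
    """Map input text to unit labels using the toy lexicon."""
    units: List[str] = []
    for word in text.lower().split():
        units.extend(LEXICON[word])  # raises KeyError for unknown words
    return units


def generate_speech(text: str, enhanced_db: Dict[str, bytes]) -> bytes:
    """Select one stored segment per unit and concatenate the audio."""
    return b"".join(enhanced_db[unit] for unit in text_to_units(text))


# Assumed state after enhancement: the "r" entry holds a replacement segment
# taken from a secondary database (marked with "*" only to make it visible).
enhanced_db = {"h": b"H", "ai": b"AI", "r": b"R*", "ei": b"EI"}
speech = generate_speech("hi ray", enhanced_db)
```

The point of the sketch is only the ordering: the database is modified beforehand, and synthesis then reads from it without needing to know which segments were substituted.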
Specification