Method and system for enhancing a speech database
Abstract
A system, method, and computer-readable medium that enhance a speech database for speech synthesis are disclosed. The method may include labeling audio files in a primary speech database, identifying segments in the labeled audio files that have varying pronunciations based on language differences, identifying replacement segments in a secondary speech database, enhancing the primary speech database by substituting the identified secondary speech database segments for the corresponding identified segments in the primary speech database, and storing the enhanced primary speech database for use in speech synthesis.
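The substitution step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Segment` fields, the `(label, context)` matching key, the `fails_need` predicate, and the sample audio IDs are all hypothetical names chosen for illustration, and labeling and storage are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Segment:
    label: str               # unit label, e.g. a half-phone name (hypothetical scheme)
    context: str             # hypothetical context key, here the word the unit came from
    audio_id: str            # reference to the stored audio for this unit
    source: str = "primary"  # which database the unit came from


Key = Tuple[str, str]


def enhance(
    primary: List[Segment],
    secondary: List[Segment],
    fails_need: Callable[[Segment], bool],
) -> List[Segment]:
    """Return a copy of `primary` in which every segment that fails the need
    is swapped for a matching secondary segment, when one exists."""
    # Index acceptable secondary segments by (label, context); first match wins.
    candidates: Dict[Key, Segment] = {}
    for seg in secondary:
        if not fails_need(seg):
            candidates.setdefault((seg.label, seg.context), seg)

    enhanced: List[Segment] = []
    for seg in primary:
        replacement = candidates.get((seg.label, seg.context)) if fails_need(seg) else None
        # Keep the original segment when it is fine or no replacement is available.
        enhanced.append(replacement if replacement is not None else seg)
    return enhanced


# Assumed example data: one primary unit has been flagged as unsuitable.
primary = [
    Segment("t-left", "water", "p001"),
    Segment("t-right", "water", "p002"),
    Segment("ae-left", "cat", "p003"),
]
secondary = [
    Segment("t-right", "water", "s117", source="secondary"),
    Segment("ae-left", "cat", "s200", source="secondary"),
]
flagged = {"p002"}

enhanced = enhance(primary, secondary, lambda seg: seg.audio_id in flagged)
print([seg.audio_id for seg in enhanced])  # ['p001', 's117', 'p003']
```

Only the flagged unit is replaced. The unflagged `ae-left` unit stays in place even though the secondary database holds a match, which mirrors the idea of substituting only the segments identified as problematic.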
20 Claims
1. A method comprising:
- identifying, as part of a text-to-speech process, a primary speech database associated with a single language;
- identifying primary speech segments in the primary speech database which do not meet a need of the text-to-speech process, wherein the primary speech segments comprise at least one of half-phones, half-phonemes, demi-syllables, and polyphones;
- identifying replacement speech segments which satisfy the need in a secondary speech database of the single language; and
- enhancing the primary speech database by substituting, in the primary database, the primary speech segments with the replacement speech segments.
Dependent claims: 2, 3, 4, 5, 6, 7.
8. A non-transitory computer-readable storage medium having stored therein instructions which, when executed by a processor, cause the processor to perform a method comprising:
- identifying, as part of a text-to-speech process, a primary speech database associated with a single language;
- identifying primary speech segments in the primary speech database which do not meet a need of the text-to-speech process, wherein the primary speech segments comprise at least one of half-phones, half-phonemes, demi-syllables, and polyphones;
- identifying replacement speech segments which satisfy the need in a secondary speech database of the single language; and
- enhancing the primary speech database by substituting, in the primary database, the primary speech segments with the replacement speech segments.
Dependent claims: 9, 10, 11, 12, 13, 14.
15. A system comprising:
- a processor; and
- a computer-readable medium having stored therein instructions which, when executed by the processor, cause the processor to perform a method comprising:
  - identifying, as part of a text-to-speech process, a primary speech database associated with a single language;
  - identifying primary speech segments in the primary speech database which do not meet a need of the text-to-speech process, wherein the primary speech segments comprise at least one of half-phones, half-phonemes, demi-syllables, and polyphones;
  - identifying replacement speech segments which satisfy the need in a secondary speech database of the single language; and
  - enhancing the primary speech database by substituting, in the primary database, the primary speech segments with the replacement speech segments.
Dependent claims: 16, 17, 18, 19, 20.
Specification