Voice recording tool for creating database used in text to speech synthesis system
Abstract
A method records verbal expressions of a person for use in a vehicle navigation system. The vehicle navigation system has a database including a map and text describing street names and points of interest of the map. The method includes the steps of obtaining from the database text of a word having at least one syllable, analyzing the syllable with a greedy algorithm to construct at least one text phrase comprising each syllable, such that the number of phrases is substantially minimized, converting the text phrase to at least one corresponding phonetic symbol phrase, displaying to the person the phonetic symbol phrase, the person verbally expressing each phrase of the phonetic symbol phrase, and recording the verbal expression of each phrase of the phonetic symbol phrase.
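The greedy selection mentioned in the abstract can be illustrated with a small sketch. The patent text here does not spell out the algorithm, so what follows is a standard greedy set-cover heuristic offered under assumptions: the function name `select_phrases`, the tie-breaking rule, and the sample data are all hypothetical, and plain strings stand in for the syllable units each phrase contains.

```python
from typing import Dict, List, Set


def select_phrases(phrases: Dict[str, Set[str]]) -> List[str]:
    """Greedy cover: repeatedly pick the phrase that adds the most unseen units.

    `phrases` maps a phrase's text to the set of units it contains. The unit
    labels are placeholders (assumption), not the patent's actual encoding.
    """
    covered: Set[str] = set()
    remaining = dict(phrases)
    chosen: List[str] = []
    while remaining:
        # Largest number of new units wins; ties go to the alphabetically
        # first phrase so the result is deterministic (assumed rule).
        best = min(remaining, key=lambda p: (-len(remaining[p] - covered), p))
        gain = remaining[best] - covered
        if not gain:
            break  # no remaining phrase contributes anything new
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen


# Hypothetical data: four candidate phrases over four placeholder units.
candidates = {
    "main street": {"main", "street"},
    "main road": {"main", "road"},
    "oak street": {"oak", "street"},
    "oak road": {"oak", "road"},
}
script = select_phrases(candidates)
```

With this data two phrases are enough to cover all four units, so the remaining two are never shown to the speaker. A greedy pass like this does not guarantee the smallest possible script, which is consistent with the abstract's wording that the number of phrases is "substantially minimized".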
311 Citations
16 Claims
1. A method for recording verbal expressions of a person implemented in a concatenative speech synthesis system, the method comprising the following steps of:
designing a system with syllable-in-context constructs where all syllables of a language are represented, and if a syllable is bounded by vowel on either side, a phonetic context of the vowel is encoded;
creating a database of text phrases in which each text phrase is represented as a sequence of syllable-in-context constructs;
using a proprietary method to sort all of the text phrases based on numbers of new syllable-in-context constructs that each of the text phrases contains;
displaying selected phrases one at a time through a proprietary speech recording software program for a voice talent in order to produce speech data of the selected phrases; and
recording the verbal expression one text phrase at a time while checking a speech signal to be within a predetermined normal range so that it is not clipped or distorted at either side;
wherein the syllable includes a regular syllable having exactly one vowel preceded and/or followed by any number of consonants, and a super-syllable having two or more syllables with a vowel hiatus across a syllable boundary.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A method for recording verbal expressions of a person implemented in a concatenative speech synthesis system for use in a vehicle navigation system having a database including a map and text describing street names and points of interest of the map, the method comprising the following steps of:
designing a system with syllable-in-context constructs where all syllables of a language are represented, and if a syllable is bounded by vowel on either side, a phonetic context of the vowel is encoded;
creating a database of text phrases in which each text phrase is represented as a sequence of syllable-in-context constructs;
using a proprietary method to sort all of the text phrases based on numbers of new syllable-in-context constructs that each of the text phrases contains;
displaying selected phrases one at a time through a proprietary speech recording software program for a voice talent in order to produce speech data of the selected phrases; and
recording the verbal expression one text phrase at a time while checking a speech signal to be within a predetermined normal range so that it is not clipped or distorted at either side;
wherein the syllable includes a regular syllable having exactly one vowel preceded and/or followed by any number of consonants, and a super-syllable having two or more syllables with a vowel hiatus across a syllable boundary.
Dependent claims: 14, 15, 16
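The recording step in the independent claims requires the speech signal to stay within a "predetermined normal range" so that it is not clipped at either extreme. The claims do not give the range, so the following is only a sketch under assumptions: it treats the input as signed 16-bit PCM samples, and the function name `check_levels` and both threshold fractions are hypothetical placeholders.

```python
from typing import Sequence


def check_levels(
    samples: Sequence[int],
    full_scale: int = 32767,          # assumed: signed 16-bit PCM
    clip_fraction: float = 0.99,      # assumed upper bound of the normal range
    min_peak_fraction: float = 0.10,  # assumed lower bound of the normal range
) -> str:
    """Classify one recorded take as 'clipped', 'too_quiet', or 'ok'.

    Only the peak absolute sample value is inspected, so positive and
    negative excursions are treated the same way.
    """
    if not samples:
        return "too_quiet"
    peak = max(abs(s) for s in samples)
    if peak >= clip_fraction * full_scale:
        return "clipped"
    if peak < min_peak_fraction * full_scale:
        return "too_quiet"
    return "ok"


# Hypothetical takes for one displayed phrase.
good_take = [0, 1200, -8000, 15000]
hot_take = [0, 32767, -100]
status = check_levels(good_take)
```

A recording tool could run a check of this kind after each phrase and prompt the speaker to repeat any take that does not come back as "ok". A peak test is the simplest choice; it does not detect other kinds of distortion.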
Specification