Method and system for recorded word concatenation
Abstract
A method and system are provided for performing recorded word concatenation to create a natural-sounding sequence of words, numbers, phrases, sounds, etc. The method and system may include a tonal pattern identification unit that identifies tonal patterns, such as pitch accents, phrase accents and boundary tones, for utterances in a particular domain, such as telephone numbers, credit card numbers, the spelling of words, etc.; a script designer that designs a script for recording a string of words, numbers, sounds, etc., based on an appropriate rhythm and pitch range in order to obtain natural prosody for utterances in the particular domain and with minimum coarticulation between concatenative units; a script recorder that records a speaker's utterances of the domain strings; a recording editor that edits the recorded strings by marking the beginning and end of each word, number, etc. in the string and including or inserting pauses according to the tonal patterns; and a concatenation unit that concatenates the edited recording into a smooth and natural-sounding string of words, numbers, letters of the alphabet, etc., for audio output.
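The concatenation step described in the abstract can be illustrated with a minimal sketch. It assumes that each word was recorded and edited once per tonal context, and that a pause is inserted after each non-final group of words. The context labels (`mid`, `group_end`, `final`), the `Unit` type, the grouping scheme, and the representation of audio as a list of integer samples are all hypothetical choices made for illustration; the source names pitch accents, phrase accents, and boundary tones but does not define a label set or data format.

```python
from dataclasses import dataclass

# Hypothetical tonal-context labels (assumption, not from the source).
MID, GROUP_END, FINAL = "mid", "group_end", "final"


@dataclass(frozen=True)
class Unit:
    """One edited recording of a word, trimmed to its marked boundaries."""
    word: str
    context: str
    samples: tuple


def tonal_contexts(group_sizes):
    """Return one context label per word position for a grouped string."""
    labels = []
    for g, size in enumerate(group_sizes):
        is_last_group = g == len(group_sizes) - 1
        for i in range(size):
            if i < size - 1:
                labels.append(MID)
            else:
                labels.append(FINAL if is_last_group else GROUP_END)
    return labels


def concatenate(words, group_sizes, inventory, pause_samples):
    """Join the unit recorded for each (word, context) pair.

    A run of zero-valued samples stands in for the pause inserted
    after every group except the last.
    """
    if sum(group_sizes) != len(words):
        raise ValueError("group sizes must add up to the number of words")
    out = []
    for word, context in zip(words, tonal_contexts(group_sizes)):
        out.extend(inventory[(word, context)].samples)
        if context == GROUP_END:
            out.extend([0] * pause_samples)
    return out
```

For a ten-digit telephone number read in groups of three, three, and four digits, `tonal_contexts([3, 3, 4])` labels the third and sixth digits as group-final and the tenth as string-final, so each position draws on a recording made with the matching intonation.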
13 Claims
1. A method of recording speech sounds used for synthesizing speech, the method comprising:
receiving information identifying a particular domain, the domain having unique prosody characteristics and rhythm;
identifying words and tonal patterns associated with the particular domain;
designing a word script related to the particular domain by applying the identified words and tonal patterns;
recording speaker utterances of the designed word script; and
editing the recorded speaker utterances according to the particular domain tonal patterns.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method of synthesizing speech using speech units recorded from a script designed for a particular domain having an identifiable tonal pattern and rhythm, the script providing natural prosody for utterances in the particular domain and designed to minimize coarticulation, the recorded speech units being edited according to tonal patterns associated with the particular domain, the method comprising:
concatenating the edited recorded speech units into a string of words associated with the particular domain; and
outputting the concatenated string of words as synthesized speech.
Dependent claims: 10, 11, 12
13. A method of generating synthetic speech, the method comprising:
receiving information identifying a particular domain, the particular domain having unique prosody characteristics and rhythm;
identifying words and tonal patterns associated with the particular domain;
designing a word script related to the particular domain by applying the identified words and tonal patterns;
recording speaker utterances of the designed word script;
editing the recorded speaker utterances into speech units according to the particular domain tonal pattern, rhythm and natural prosody; and
concatenating the speech units into a string of words as synthesized speech within the particular domain.
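The script-design step recited in the claims can be sketched as follows. This is only one plausible reading: it assumes the goal is a set of recording strings in which every word of the domain vocabulary is spoken at every position of the grouped string, so that each word is captured with each positional intonation. The function names `design_script` and `coverage` and the rotation scheme are hypothetical, and the sketch does not attempt the coarticulation minimization mentioned in the abstract.

```python
def design_script(vocabulary, group_sizes):
    """Return one grouped word string per vocabulary item.

    Rotating the vocabulary by one word per string places every word
    at every position exactly once across the whole script.
    """
    total = sum(group_sizes)
    n = len(vocabulary)
    script = []
    for shift in range(n):
        words = [vocabulary[(shift + pos) % n] for pos in range(total)]
        # Split the flat string into the groups the speaker reads aloud.
        groups, start = [], 0
        for size in group_sizes:
            groups.append(words[start:start + size])
            start += size
        script.append(groups)
    return script


def coverage(script):
    """Map each (group index, position in group) slot to the words seen there."""
    seen = {}
    for groups in script:
        for g, group in enumerate(groups):
            for i, word in enumerate(group):
                seen.setdefault((g, i), set()).add(word)
    return seen
```

With the ten digits as the vocabulary and groups of three, three, and four, the script contains ten strings, and `coverage` confirms that each of the ten slots has been recorded with every digit.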
Specification