Defining atom units between phone and syllable for TTS systems
Abstract
A method for identifying common multiphone units to add to a unit inventory for a text-to-speech generator is disclosed. The common multiphone units are units that are larger than a phone, but smaller than a syllable. The method slices each syllable into a plurality of slices. These slices are then sorted and the frequency of each slice is determined. Those slices whose frequencies exceed a threshold are added to the unit inventory. The remaining slices are decomposed according to a predetermined set of rules to determine if they contain slices that should be added to the unit inventory.
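The procedure in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented method: the abstract does not say how syllables are sliced or what the decomposition rules are, so `slices_of` (every contiguous run of two or more phones that is shorter than the syllable) and `can_decompose` (a single split into parts that are each one phone or an existing unit) are hypothetical stand-ins, as are all function names.

```python
from collections import Counter


def slices_of(syllable):
    """Assumed slicing scheme: contiguous runs of 2+ phones, shorter than the syllable."""
    n = len(syllable)
    return [
        tuple(syllable[i:j])
        for i in range(n)
        for j in range(i + 2, n + 1)
        if j - i < n
    ]


def can_decompose(slice_, inventory):
    """Hypothetical rule: one split into two parts, each a single phone or a known unit."""
    def known(part):
        return len(part) == 1 or part in inventory

    return any(known(slice_[:k]) and known(slice_[k:]) for k in range(1, len(slice_)))


def build_inventory(syllables, threshold):
    """Return the set of multi-phone atom units for a list of syllables."""
    # Count how often each slice occurs across all syllables.
    counts = Counter(s for syl in syllables for s in slices_of(syl))
    # Slices whose frequency exceeds the threshold are treated as common units.
    inventory = {s for s, c in counts.items() if c > threshold}
    # Remaining slices are tried against the rules, shortest first;
    # a slice that cannot be decomposed is added as a unit itself.
    leftovers = sorted((s for s in counts if s not in inventory), key=lambda s: (len(s), s))
    for s in leftovers:
        if not can_decompose(s, inventory):
            inventory.add(s)
    return inventory


if __name__ == "__main__":
    # Toy lexicon: each syllable is a tuple of phone symbols (made-up data).
    toy = [("k", "ae", "t"), ("b", "ae", "t"), ("s", "ae", "t"), ("s", "p", "l", "i")]
    print(sorted(build_inventory(toy, threshold=1)))
```

Under these assumptions a rare two-phone slice always decomposes into its two phones, so only longer rare slices can end up in the inventory; a real rule set would likely be more selective.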
227 Citations
13 Claims
1. A method of developing a unit inventory for use by a text to speech system, comprising:
- identifying a list of phones for a target language;
- receiving a lexicon containing phonetic transcriptions of a plurality of words having a plurality of syllables;
- identifying a set of common multi-phone atom units for the lexicon by:
  - decomposing each syllable into a plurality of slices;
  - identifying non-common slices within the plurality of slices; and
  - decomposing the non-common slices according to a predetermined set of rules;
- adding the set of common multi-phone atom units to the unit inventory for the target language; and
- wherein if the predetermined rules are unable to decompose the non-common slice, then adding the slice to the unit inventory.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. An apparatus for generating speech from text, comprising:
- a unit inventory for storing a set of phoneme based atom units for at least one target speaker, said set of phoneme based atom units being a plurality of different sizes and including only units limited to sizes greater than a phone but less than a syllable;
- a text analyzer for obtaining a string of phonetic symbols representative of a text to be converted to speech; and
- a concatenation module for selecting stored phoneme-based atom units to generate speech corresponding to the text,
- wherein the set of atom units comprises atom units that are determined to be common multi-phonal units for the target language;
- wherein the set of atom units includes atom units that are not common to the target language, but were unable to be decomposed according to a predetermined set of rules to match an entry already in the unit inventory.
Dependent claims: 11, 12
13. A unit inventory for use in text-to-speech generation, comprising:
- a set of monophone units for a target language;
- a set of atom units sized between a phone and a syllable, for the target language;
- wherein the set of atom units comprises atom units that are determined to be common multiphonal units for the target language;
- wherein the set of atom units includes atom units that are not common to the target language, but were unable to be decomposed according to a predetermined set of rules to match an entry already in the unit inventory.
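To make the apparatus of claim 10 and the inventory of claim 13 more concrete, the following Python sketch shows one way a concatenation module might pick stored units for a phone string. The claims do not state a selection strategy, so the greedy longest-match approach, the monophone fallback, and the name `select_units` are all assumptions made for illustration; syllable boundaries and acoustic costs are ignored.

```python
def select_units(phones, atom_units, max_len=None):
    """Cover a phone sequence with units, preferring the longest stored atom unit.

    phones: list of phone symbols produced by a (hypothetical) text analyzer.
    atom_units: set of tuples of phones, each longer than one phone.
    """
    if max_len is None:
        max_len = max((len(u) for u in atom_units), default=1)
    selected = []
    i = 0
    while i < len(phones):
        # Try the longest candidate first, down to two phones.
        for size in range(min(max_len, len(phones) - i), 1, -1):
            candidate = tuple(phones[i:i + size])
            if candidate in atom_units:
                selected.append(candidate)
                i += size
                break
        else:
            # No multi-phone unit matches here: fall back to a monophone unit.
            selected.append((phones[i],))
            i += 1
    return selected


if __name__ == "__main__":
    units = {("ae", "t"), ("s", "p", "l")}  # made-up inventory
    print(select_units(["s", "p", "l", "ae", "t"], units))
    print(select_units(["k", "ae", "t"], units))
```

Greedy matching is chosen here only because it is short to write; a production system would more plausibly score alternative segmentations against each other.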
Specification