Computer method and apparatus for text-to-speech synthesizer dictionary reduction
Abstract
A computerized method and apparatus for reducing the size of a dictionary used in a text-to-speech synthesis system are provided. In an initial phase, the method and apparatus determine if entries in the dictionary, each containing a grapheme string and a corresponding phoneme string, can be fully matched by using at least one rule set used to synthesize words to phonemic data. If the entry can be fully matched using rule processing alone, the entry is indicated to be deleted from the dictionary. In a second phase, the method and apparatus determine if the entry, considered as a root word entry, is required in the dictionary in order to support phoneme synthesis of other entries containing the root word entry, and if so, the root word entry is indicated to be saved in the dictionary. If the other entries containing the root word entry can have correct phonemic data generated from a combination of the root word entry's phonemic data and phonemes generated from rule set processing, then the other entries are indicated to be deleted from the dictionary. After all words have been processed by phase one and/or phase two, the entries indicated to be saved are aggregated to form a reduced dictionary.
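The two phases described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical: `reduce_dictionary`, the per-letter `toy_rules` function, and the single suffix rule are illustrative stand-ins, not the patent's rule sets. The sketch also assumes that an entry marked as saved in phase two is kept even if phase one marked it for deletion, and that affix phonemes are simply concatenated to the root's phonemes.

```python
from typing import Callable, Dict, List, Set, Tuple

Dictionary = Dict[str, str]   # grapheme string -> phoneme string
AffixRule = Tuple[str, str]   # (suffix graphemes, suffix phonemes)


def reduce_dictionary(
    entries: Dictionary,
    rules_to_phonemes: Callable[[str], str],
    suffix_rules: List[AffixRule],
) -> Dictionary:
    """Return the entries that survive both phases (illustrative only)."""
    deleted: Set[str] = set()
    saved: Set[str] = set()

    # Phase one: mark entries that the rule set alone reproduces exactly.
    for graphemes, phonemes in entries.items():
        if rules_to_phonemes(graphemes) == phonemes:
            deleted.add(graphemes)

    # Phase two: treat each entry as a root word. If root + affix yields
    # another entry exactly, save the root and mark the derived entry.
    for root, root_phonemes in entries.items():
        for affix_g, affix_p in suffix_rules:
            combo_g = root + affix_g
            combo_p = root_phonemes + affix_p
            if entries.get(combo_g) == combo_p:
                saved.add(root)
                deleted.add(combo_g)

    # Aggregation (assumption: "saved" takes priority over "deleted").
    return {g: p for g, p in entries.items() if g in saved or g not in deleted}


# Hypothetical toy rule set: one phoneme symbol per known letter.
LETTER_MAP = {"c": "k", "a": "&", "t": "t", "s": "s"}


def toy_rules(word: str) -> str:
    return "".join(LETTER_MAP.get(ch, "?") for ch in word)


entries = {"cat": "k&t", "cats": "k&ts", "one": "wVn", "ones": "wVnz", "two": "tu"}
reduced = reduce_dictionary(entries, toy_rules, [("s", "z")])
print(reduced)  # {'one': 'wVn', 'two': 'tu'}
```

In this toy run, "cat" and "cats" are dropped because the rules reproduce them, "ones" is dropped because "one" plus the suffix rule reproduces it, and "one" and "two" remain.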
29 Claims
-
1. A method for reducing the size of a dictionary used in a speech synthesis system having a set of rules for determining phonemes from graphemes, the dictionary containing a plurality of entries, each entry comprising a grapheme string and a corresponding phoneme string, the method comprising the steps of:
-
determining if a given entry in the dictionary can be fully matched by using rules of the rule set, and if so, indicating the entry to be deleted from the dictionary;
determining if the given entry is required in the dictionary in order to support other entries, and if so, indicating the given entry to be saved; and
aggregating the entries indicated as to be saved, to form a reduced dictionary therefrom;
wherein the given entry comprises a grapheme string and a corresponding phoneme string.
-
-
2. A method for reducing the size of a dictionary used in a speech synthesis system having a set of rules for determining phonemes from graphemes, the dictionary containing a plurality of entries, each entry comprising a grapheme string and a corresponding phoneme string, the method comprising the steps of:
-
for each entry in the dictionary, determining if the entry in the dictionary can be fully matched by using rules of the rule set, and if so, indicating the entry to be deleted from the dictionary; and
creating a reduced dictionary from the entries remaining after omitting any entries indicated as to be deleted;
wherein each entry comprises a grapheme string and a corresponding phoneme string.
generating a rule-based phoneme string for the grapheme string of the entry using rules in the rule set; and
determining if the rule-based phoneme string matches the corresponding phoneme string of the entry, and if so, indicating the entry to be deleted from the dictionary.
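The generate-and-compare step can be sketched in Python as follows. This is a rough illustration only: `TOY_RULES` is a hypothetical, context-free rule list invented for the example, and the greedy longest-pattern matching is an assumption, since the claims do not say how the rule set is applied.

```python
from typing import Dict, List, Tuple

# Hypothetical toy rule set: (grapheme pattern, phoneme output).
# Real letter-to-sound rules are usually context-sensitive; these are not.
TOY_RULES: List[Tuple[str, str]] = [
    ("sh", "S"),
    ("s", "s"),
    ("h", "h"),
    ("i", "I"),
    ("p", "p"),
    ("o", "A"),
]


def rule_based_phonemes(graphemes: str) -> str:
    """Greedy left-to-right conversion, trying longer patterns first."""
    ordered = sorted(TOY_RULES, key=lambda rule: len(rule[0]), reverse=True)
    out = []
    i = 0
    while i < len(graphemes):
        for pattern, phoneme in ordered:
            if graphemes.startswith(pattern, i):
                out.append(phoneme)
                i += len(pattern)
                break
        else:
            out.append("?")  # no rule covers this grapheme
            i += 1
    return "".join(out)


def mark_rule_matched(entries: Dict[str, str]) -> List[str]:
    """Return grapheme strings whose stored phonemes the rules reproduce."""
    return [g for g, p in entries.items() if rule_based_phonemes(g) == p]


sample = {"ship": "SIp", "shop": "SAp", "his": "hIz"}
print(mark_rule_matched(sample))  # ['ship', 'shop']
```

Here "his" stays in the dictionary because the toy rules produce "hIs" rather than the stored "hIz".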
-
-
4. The method of claim 3 wherein the step of determining if an entry in the dictionary can be fully matched, is performed for each entry in the dictionary starting with a first entry.
-
5. A method for reducing the size of a dictionary used in a speech synthesis system having a set of rules for determining phonemes from graphemes, the dictionary containing a plurality of entries, each entry comprising a grapheme string and a corresponding phoneme string, the method comprising the steps of:
-
for each entry in the dictionary, determining if the entry in the dictionary can be fully matched by using rules of the rule set, and if so, indicating the entry to be deleted from the dictionary including the steps of:
generating a rule-based phoneme string for the grapheme string of the entry using rules in the rule set; and
determining if the rule-based phoneme string matches the corresponding phoneme string of the entry, and if so, indicating the entry to be deleted from the dictionary;
creating a reduced dictionary from the entries remaining after omitting any entries indicated as to be deleted;
providing an affix rule set for the speech synthesis system, the affix rule set for determining phonemes from beginning and ending graphemes of character strings; and
before generating a rule-based phoneme string, checking if any affix rule from the affix rule set matches a portion of the grapheme string of the entry, and if so, skipping to a next entry in the dictionary for processing;
wherein the step of determining if an entry in the dictionary can be fully matched, is performed for each entry in the dictionary starting with a first entry.
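A minimal Python sketch of the affix pre-check in claim 5 follows. The prefix and suffix lists are hypothetical, and the sketch assumes that "matches a portion of the grapheme string" means the entry begins with a prefix or ends with a suffix while leaving a non-empty remainder. The claim itself does not spell this out.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical affix graphemes. The phoneme halves of the affix rules are
# omitted because only the grapheme side is needed for this check.
PREFIXES: List[str] = ["un", "re"]
SUFFIXES: List[str] = ["ing", "s"]


def matches_affix(graphemes: str) -> bool:
    """True if a prefix or suffix matches and leaves a non-empty remainder."""
    for prefix in PREFIXES:
        if graphemes.startswith(prefix) and len(graphemes) > len(prefix):
            return True
    for suffix in SUFFIXES:
        if graphemes.endswith(suffix) and len(graphemes) > len(suffix):
            return True
    return False


def phase_one_candidates(entries: Dict[str, str]) -> Iterator[Tuple[str, str]]:
    """Yield only the entries that proceed to rule-based generation."""
    for graphemes, phonemes in entries.items():
        if matches_affix(graphemes):
            continue  # skip to the next entry
        yield graphemes, phonemes


sample = {"undo": "x", "cat": "y", "cats": "z", "s": "w"}
print(list(phase_one_candidates(sample)))  # [('cat', 'y'), ('s', 'w')]
```

Skipped entries are left for the root-word phase, where an affix rule combined with a root entry may account for them.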
-
-
9. A method for reducing the size of a dictionary used in a speech synthesis system, the dictionary containing a plurality of entries, each entry comprising a grapheme string and a corresponding phoneme string, the method comprising the steps of:
-
determining if a given entry is required in the dictionary in order to produce the phoneme string of another entry, and if so, indicating the given entry to be saved; and
creating a dictionary containing entries indicated to be saved; and
wherein the speech synthesis system includes an affix rule set containing affix rules for determining phonemes from beginning and ending graphemes of character strings, each affix rule having a grapheme portion and a corresponding phoneme portion; and
the step of determining if the given entry is required in the dictionary includes the steps of:
combining grapheme and phoneme strings of a root word entry in the dictionary with respective grapheme and phoneme portions of an affix rule of the affix rule set to form a grapheme combination and phoneme combination pair; and
determining if the grapheme combination and phoneme combination pair exists as a matching entry in the dictionary, and if so, indicating the root word entry to be saved in the dictionary and indicating the matching entry to be deleted.
combining the grapheme string of the root word entry with the grapheme portion of the affix rule to form the grapheme combination; and
combining the phoneme string of the root word entry with the phoneme portion of the affix rule to form the phoneme combination.
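The combining step can be sketched in Python as below. The `AffixRule` class, its `is_prefix` flag, and the phoneme symbols are hypothetical. The sketch assumes that a prefix rule is prepended and a suffix rule is appended on both the grapheme and phoneme sides, with no phonological adjustment at the boundary.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AffixRule:
    """Hypothetical affix rule with a grapheme and a phoneme portion."""
    graphemes: str
    phonemes: str
    is_prefix: bool


def combine(root_graphemes: str, root_phonemes: str, rule: AffixRule) -> Tuple[str, str]:
    """Form the (grapheme combination, phoneme combination) pair."""
    if rule.is_prefix:
        return rule.graphemes + root_graphemes, rule.phonemes + root_phonemes
    return root_graphemes + rule.graphemes, root_phonemes + rule.phonemes


print(combine("do", "du", AffixRule("un", "Vn", True)))       # ('undo', 'Vndu')
print(combine("walk", "wOk", AffixRule("ing", "IN", False)))  # ('walking', 'wOkIN')
```

The resulting pair is what would then be looked up in the dictionary to decide whether the root is saved and the derived entry deleted.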
-
-
13. The method of claim 12, wherein the step of determining if the grapheme combination and phoneme combination pair exists as a matching entry in the dictionary, includes the steps of:
-
determining if the grapheme combination exists as a matching grapheme string in an entry in the dictionary, and if so, obtaining the corresponding phoneme string as a matching phoneme string for the entry;
determining if the phoneme combination matches the matching phoneme string, and if so, indicating the root word entry to be saved in the dictionary and indicating the matching entry to be deleted in the dictionary.
-
-
14. The method of claim 13, wherein, before the step of determining if the phoneme combination matches the matching phoneme string, normalizing any lexical stress in the phoneme combination and the matching phoneme string.
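One plausible reading of stress normalization is to strip stress marks from both phoneme strings before comparing them. The Python sketch below assumes an ARPAbet-style notation in which a digit after a vowel marks stress. That notation is an assumption made for illustration, as the claim does not fix one, and the function names are hypothetical.

```python
import re

# Assumption: stress is marked ARPAbet-style by a digit after each vowel
# (0 = none, 1 = primary, 2 = secondary).
_STRESS = re.compile(r"[012]")


def normalize_stress(phonemes: str) -> str:
    """Strip stress digits so only the phoneme sequence is compared."""
    return _STRESS.sub("", phonemes)


def phonemes_match(combination: str, stored: str) -> bool:
    """Compare two phoneme strings while ignoring lexical stress."""
    return normalize_stress(combination) == normalize_stress(stored)


print(phonemes_match("P R OW1 T EH2 S T", "P R OW0 T EH1 S T"))  # True
print(phonemes_match("K AE1 T", "K AH1 T"))                      # False
```

Normalizing first lets a root-plus-affix combination match a stored entry whose stress pattern shifted when the affix was attached.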
-
15. The method of claim 11, further comprising the step of saving in a reduced dictionary the entries that have been indicated to be saved.
-
16. The method of claim 11, further comprising the step of deleting entries that have been indicated to be deleted from the dictionary.
-
17. The method of claim 11, wherein the entries in the dictionary are arranged according to length of grapheme string with the shortest grapheme string first.
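The ordering in claim 17 can be illustrated with a short Python sketch. The function name and sample entries are hypothetical. A likely motivation, stated here as an inference rather than something the claim says, is that root words tend to be shorter than their derived forms and so are reached first.

```python
from typing import Dict, List, Tuple


def order_by_grapheme_length(entries: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return entries sorted by grapheme-string length, shortest first.

    Python's sort is stable, so entries of equal length keep their
    original relative order.
    """
    return sorted(entries.items(), key=lambda item: len(item[0]))


sample = {"walking": "wOkIN", "walk": "wOk", "walks": "wOks"}
print([g for g, _ in order_by_grapheme_length(sample)])  # ['walk', 'walks', 'walking']
```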
-
18. The method of claim 11, wherein the steps of combining and determining if the grapheme combination and phoneme combination pair exists as a matching entry in the dictionary, are performed first with rules from the affix rule set for determining phonemes from beginning graphemes.
-
19. The method of claim 11, wherein the step of determining if the grapheme combination and phoneme combination pair exists as a matching entry in the dictionary includes the steps of:
-
determining if the grapheme combination exists as a matching grapheme string in an entry in the dictionary, and if so, obtaining the corresponding phoneme string as a matching phoneme string for the entry;
determining if the phoneme combination matches the matching phoneme string, and if so, indicating the root word entry to be saved in the dictionary and indicating the matching entry to be deleted in the dictionary.
-
-
20. The method of claim 19, wherein, before the step of determining if the phoneme combination matches the matching phoneme string is performed, normalizing any lexical stress in the phoneme combination and the matching phoneme string.
-
21. The method of claim 19, further comprising the step of saving in a reduced dictionary, entries that have been indicated to be saved.
-
22. A method for reducing the size of a dictionary used in a speech synthesis system having a set of rules for determining phonemes from graphemes, the dictionary containing a plurality of entries, each entry comprising a grapheme string and a corresponding phoneme string, the method comprising the steps of:
-
determining if a given entry in the dictionary can be fully matched by using rules of the rule set, and if so, indicating the entry to be deleted from the dictionary including the steps of:
generating a rule-based phoneme string for the grapheme string of the entry using rules in the rule set; and
determining if the rule-based phoneme string matches the corresponding phoneme string of the entry, and if so, indicating the entry to be deleted from the dictionary;
determining if the given entry is required in the dictionary in order to support other entries, and if so, indicating the given entry to be saved; and
aggregating the entries indicated as to be saved, to form a reduced dictionary therefrom.
providing in the speech synthesis system an affix rule set containing affix rules for determining phonemes from beginning and ending graphemes of character strings, each affix rule having a grapheme portion and a corresponding phoneme portion;
combining grapheme and phoneme strings of a root word entry from the dictionary with respective grapheme and phoneme portions of an affix rule of the affix rule set to form a grapheme combination and phoneme combination pair; and
determining if the grapheme combination and phoneme combination pair exists as a matching entry in the dictionary, and if so, indicating the root word entry to be saved in the dictionary, and, indicating the matching entry to be deleted from the dictionary.
-
-
24. The method of claim 23, wherein the steps of combining and determining if the grapheme combination and phoneme combination pair exists as a matching entry in the dictionary are performed respectively for the root entry with each affix rule in the affix rule set; and
the step of determining if an entry is required, is performed for each root word entry in the dictionary starting with a first root word entry.
-
25. The method of claim 23 further including the step of:
before generating a rule-based phoneme string, determining if any affix rule from the affix rule set matches a portion of the grapheme string of the entry, and if so, skipping to a next entry in the dictionary for processing.
-
26. The method of claim 23 further including the step of checking if the grapheme string of the entry is a homograph, and if so, skipping to a next entry in the dictionary for processing.
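The homograph check can be sketched in Python under one assumption: a grapheme string counts as a homograph when the dictionary holds more than one distinct phoneme string for it. The claim does not say how homographs are detected, so this definition, the function names, and the sample data are all illustrative. The dictionary is modelled as a list of pairs so that repeated grapheme strings are possible.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Entry = Tuple[str, str]  # (grapheme string, phoneme string)


def find_homographs(entries: List[Entry]) -> Set[str]:
    """Return grapheme strings that appear with more than one pronunciation."""
    pronunciations: Dict[str, Set[str]] = defaultdict(set)
    for graphemes, phonemes in entries:
        pronunciations[graphemes].add(phonemes)
    return {g for g, ps in pronunciations.items() if len(ps) > 1}


def skip_homographs(entries: List[Entry]) -> List[Entry]:
    """Return only the entries that would go on to further processing."""
    homographs = find_homographs(entries)
    return [(g, p) for g, p in entries if g not in homographs]


sample = [("lead", "lid"), ("lead", "lEd"), ("cat", "k&t")]
print(find_homographs(sample))  # {'lead'}
print(skip_homographs(sample))  # [('cat', 'k&t')]
```

Skipping homographs leaves both pronunciations in the dictionary, since a rule set that maps one grapheme string to one phoneme string cannot reproduce both.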
-
27. The method of claim 23, wherein the step of combining includes the steps of:
-
combining the grapheme string of the root word entry with the grapheme portion of the affix rule to form the grapheme combination; and
combining the phoneme string of the root word entry with the phoneme portion of the affix rule to form the phoneme combination.
-
-
28. The method of claim 27, wherein the step of determining if the grapheme combination and phoneme combination pair exists as a matching entry in the dictionary includes the steps of:
-
determining if the grapheme combination exists as a matching grapheme string in an entry in the dictionary, and if so, obtaining the corresponding phoneme string as a matching phoneme string for the entry;
determining if the phoneme combination matches the matching phoneme string, and if so, indicating the root word entry to be saved in the dictionary and indicating the matching entry to be deleted from the dictionary.
-
-
29. The method of claim 28, wherein, before the step of determining if the phoneme combination matches the matching phoneme string is performed, normalizing any lexical stress in the phoneme combination and the matching phoneme string.
Specification