Method and apparatus using decision trees to generate and score multiple pronunciations for a spelled word
First Claim
1. An apparatus for generating at least one phonetic pronunciation for an input sequence of letters selected from a predetermined alphabet, comprising:
- a memory for storing a plurality of letter-only decision trees corresponding to said alphabet,said letter-only decision trees having internal nodes representing yes-no questions about a given letter and its neighboring letters in a given sequence;
said memory further storing a plurality of mixed decision trees corresponding to said alphabet,said mixed decision trees having a first plurality of internal nodes representing yes-no questions about a given letter and its neighboring letters in said given sequence and having a second plurality of internal nodes representing yes-no questions about a phoneme and its neighboring phonemes in said given sequence,said letter-only decision trees and said mixed decision trees further having leaf nodes representing probability data that associates said given letter with a plurality of phoneme pronunciations;
a phoneme sequence generator coupled to said letter-only decision tree for processing an input sequence of letters and generating a first set of phonetic pronunciations corresponding to said input sequence of letters;
a score estimator coupled to said mixed decision tree for processing said first set to generate a second set of scored phonetic pronunciations, the scored phonetic pronunciations representing at least one phonetic pronunciation of said input sequence.
1 Assignment
0 Petitions
Accused Products
Abstract
The mixed decision tree includes a network of yes-no questions about adjacent letters in a spelled word sequence and also about adjacent phonemes in the phoneme sequence corresponding to the spelled word sequence. Leaf nodes of the mixed decision tree provide information about which phonetic transcriptions are most probable. Using the mixed trees, scores are developed for each of a plurality of possible pronunciations, and these scores can be used to select the best pronunciation as well as to rank pronunciations in order of probability. The pronunciations generated by the system can be used in speech synthesis and speech recognition applications as well as lexicography applications.
292 Citations
13 Claims
-
1. An apparatus for generating at least one phonetic pronunciation for an input sequence of letters selected from a predetermined alphabet, comprising:
-
a memory for storing a plurality of letter-only decision trees corresponding to said alphabet, said letter-only decision trees having internal nodes representing yes-no questions about a given letter and its neighboring letters in a given sequence; said memory further storing a plurality of mixed decision trees corresponding to said alphabet, said mixed decision trees having a first plurality of internal nodes representing yes-no questions about a given letter and its neighboring letters in said given sequence and having a second plurality of internal nodes representing yes-no questions about a phoneme and its neighboring phonemes in said given sequence, said letter-only decision trees and said mixed decision trees further having leaf nodes representing probability data that associates said given letter with a plurality of phoneme pronunciations; a phoneme sequence generator coupled to said letter-only decision tree for processing an input sequence of letters and generating a first set of phonetic pronunciations corresponding to said input sequence of letters; a score estimator coupled to said mixed decision tree for processing said first set to generate a second set of scored phonetic pronunciations, the scored phonetic pronunciations representing at least one phonetic pronunciation of said input sequence. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
-
Specification