Method for generating spelling-to-pronunciation decision tree
First Claim
1. A memory for storing spelling-to-pronunciation data for use in analyzing an input sequence, comprising:
- a decision tree data structure stored in said memory that defines a plurality of internal nodes and a plurality of leaf nodes, said internal nodes adapted for storing yes-no questions and said leaf nodes adapted for storing probability data;
a first plurality of said internal nodes being populated with letter questions about a given letter in an input sequence and its neighboring letters in said input sequence;
a second plurality of said internal nodes being populated with phoneme questions about a given phoneme in said input sequence and its neighboring phonemes in said input sequence;
said leaf nodes being populated with probability data that associates said given letter with a plurality of phoneme pronunciations such that said phoneme questions ultimately result in said phoneme pronunciations.
1 Assignment
0 Petitions
Accused Products
Abstract
Decision trees are used to store a series of yes-no questions that can be used to convert spelled-word letter sequences into pronunciations. Letter-only trees, having internal nodes populated with questions about letters in the input sequence, generate one or more pronunciations based on probability data stored in the leaf nodes of the tree. The pronunciations may then be improved by processing them using mixed trees which are populated with questions about letters in the sequence and also questions about phonemes associated with those letters. The mixed tree screens out pronunciations that would not occur in natural speech, thereby greatly improving the results of the letter-to-pronunciation transformation.
71 Citations
14 Claims
-
1. A memory for storing spelling-to-pronunciation data for use in analyzing an input sequence, comprising:
-
a decision tree data structure stored in said memory that defines a plurality of internal nodes and a plurality of leaf nodes, said internal nodes adapted for storing yes-no questions and said leaf nodes adapted for storing probability data;
a first plurality of said internal nodes being populated with letter questions about a given letter in an input sequence and its neighboring letters in said input sequence;
a second plurality of said internal nodes being populated with phoneme questions about a given phoneme in said input sequence and its neighboring phonemes in said input sequence;
said leaf nodes being populated with probability data that associates said given letter with a plurality of phoneme pronunciations such that said phoneme questions ultimately result in said phoneme pronunciations. - View Dependent Claims (2, 3, 4, 5, 6)
-
-
7. A method for processing spelling-to-pronunciation data, comprising the steps of:
-
providing a first set of yes-no questions about letters in an input sequence and their relationship to neighboring letters in said input sequence;
providing a second set of yes-no questions about phonemes in said input sequence and their relationship to neighboring phonemes in said input sequence;
providing a corpus of training data representing a plurality of different sets of pairs each pair containing a letter sequence and a phoneme sequence, said letter sequence selected from an alphabet;
using said first and second sets and said training data to generate decision trees for at least a portion of said alphabet, said decision trees each having a plurality of internal nodes and a plurality of leaf nodes;
populating said internal nodes with questions selected from said first and second sets; and
populating said leaf nodes with the probability data that associates said portion of said alphabet with a plurality of phoneme pronunciations based on said training data, such that said phoneme pronunciations result from internal nodes populated with questions selected from both said first and second sets. - View Dependent Claims (8, 9, 10, 11, 12, 13, 14)
-
Specification