Method and system for encoding pronunciation prefix trees
Abstract
A computer system for linearly encoding a pronunciation prefix tree. The pronunciation prefix tree has nodes such that each non-root and non-leaf node represents a phoneme and each leaf node represents a word formed by the phonemes represented by the non-leaf nodes in a path from the root node to the leaf node. Each leaf node has a probability associated with the word of the leaf node. The computer system creates a tree node dictionary containing an indication of the phonemes that compose each word. The computer system then orders the child nodes of each non-leaf node based on the highest probability of descendent leaf nodes of the child node. Then, for each non-leaf node, the computer system sets the probability of the non-leaf node to a probability based on the probability of its child nodes, and for each node, sets a factor of the node to the probability of the node divided by the probability of the parent node of the node. Finally, the computer system generates an encoded pronunciation entry for each leaf node of the pronunciation prefix tree. Each encoded pronunciation entry indicates the word represented by the leaf node and contains the factor of a nearest ancestor node with a factor other than 1.0.
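The encoding steps summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: a non-leaf node takes the maximum of its children's probabilities (the text says only "based on"), the root's parent is treated as having probability 1.0, the search for a factor below 1.0 starts at the leaf itself, and all probabilities are positive and distinct. The names `Node`, `build_tree`, `annotate`, `set_factors`, and `encode`, and the four-word lexicon, are hypothetical.

```python
class Node:
    """One node of the prefix tree: a phoneme, or a word when is_leaf is True."""
    def __init__(self, label, parent=None, prob=0.0, is_leaf=False):
        self.label, self.parent = label, parent
        self.prob, self.is_leaf = prob, is_leaf
        self.factor = 1.0
        self.children = []

def build_tree(lexicon):
    """Build the tree from {word: (phonemes, probability)}."""
    root = Node("<root>")
    for word, (phonemes, prob) in lexicon.items():
        node = root
        for ph in phonemes:
            child = next((c for c in node.children
                          if not c.is_leaf and c.label == ph), None)
            if child is None:
                child = Node(ph, parent=node)
                node.children.append(child)
            node = child
        # The word itself is a leaf hanging off its last phoneme node.
        node.children.append(Node(word, parent=node, prob=prob, is_leaf=True))
    return root

def annotate(node):
    """Order children by best descendant leaf and set non-leaf probabilities.

    Assumption: a non-leaf node takes the maximum of its children's
    probabilities, so sorting by child probability is sorting by the
    highest-probability descendant leaf. Requires a non-empty lexicon.
    """
    if node.is_leaf:
        return
    for child in node.children:
        annotate(child)
    node.children.sort(key=lambda c: c.prob, reverse=True)
    node.prob = node.children[0].prob

def set_factors(node):
    """factor = node probability / parent probability (root's parent taken as 1.0)."""
    parent_prob = node.parent.prob if node.parent else 1.0
    node.factor = node.prob / parent_prob
    for child in node.children:
        set_factors(child)

def encode(lexicon):
    """Return (tree_node_dictionary, entries) for the given lexicon."""
    tree_node_dictionary = {w: list(ph) for w, (ph, _) in lexicon.items()}
    root = build_tree(lexicon)
    annotate(root)
    set_factors(root)
    entries = []

    def visit(node):
        if node.is_leaf:
            n = node
            # Walk upward, starting at the leaf, to the first factor below 1.0
            # (stopping at the root if none is found).
            while n.parent is not None and n.factor >= 1.0:
                n = n.parent
            entries.append((node.label, n.factor))
        for child in node.children:
            visit(child)

    visit(root)
    return tree_node_dictionary, entries

# Hypothetical four-word lexicon with exactly representable probabilities.
lexicon = {
    "cat": (["k", "ae", "t"], 0.5),
    "cap": (["k", "ae", "p"], 0.125),
    "dog": (["d", "ao", "g"], 0.25),
    "do":  (["d", "uw"], 0.0625),
}
dictionary, entries = encode(lexicon)
print(entries)  # [('cat', 0.5), ('cap', 0.25), ('dog', 0.5), ('do', 0.25)]
```

In this sketch the entries come out in depth-first order, so each word's factor belongs to the node where its path branches off from the previous word's path. With tied probabilities the branching node could have a factor of exactly 1.0, which this simplified version does not handle.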
80 Citations
65 Claims
1. A method in a computer system for encoding a pronunciation prefix tree, the pronunciation prefix tree having a plurality of nodes, each non-root and non-leaf node representing a phoneme, each leaf node representing a word formed by the phonemes represented by the non-leaf nodes in a path from the root node to the leaf node, each leaf node having a probability, the method comprising:
- creating a tree node dictionary containing an indication of the phonemes that compose each word;
- ordering child nodes of each non-leaf node of the pronunciation prefix tree based on the highest probability of descendent leaf nodes of the child node;
- for each non-leaf node of the pronunciation prefix tree, setting the probability of the non-leaf node to a probability based on the probability of its child nodes;
- for each node of the pronunciation prefix tree, setting a factor of the node to the probability of the node divided by the probability of a parent node of the node; and
- generating an encoded pronunciation entry for each leaf node of the pronunciation prefix tree, the encoded pronunciation entry indicating the word represented by the leaf node and containing the factor of a nearest ancestor node with a factor less than 1.0.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10

11. A method in a computer system for linearly encoding a pronunciation prefix tree, the pronunciation prefix tree having a plurality of nodes, each non-root and non-leaf node representing a phoneme, each leaf node representing a word formed by the phonemes represented by the non-leaf nodes in a path from the root node to the leaf node, each leaf node having a probability, the method comprising:
- setting a probability of each non-leaf node to a probability based on the probability of its child nodes;
- setting a factor of each node to the probability of the node divided by the probability of the parent node of the node; and
- generating an encoded pronunciation entry for each leaf node of the pronunciation prefix tree, the encoded pronunciation entry indicating the word represented by the leaf node and containing the factor of a nearest ancestor node with a factor other than a predefined factor wherein the pronunciation prefix tree can be regenerated from the encoded pronunciation entries and a list of the phonemes that compose each word.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20, 21
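The factor defined in this claim can be written out as a short worked relation. The notation is introduced here for illustration only: $P(n)$ is the probability of node $n$, $f(n)$ its factor, and the root's (non-existent) parent is assumed to have probability 1, a convention the claim itself does not state.

```latex
f(n) = \frac{P(n)}{P(\mathrm{parent}(n))},
\qquad
\prod_{n \,\in\, \mathrm{path}(\mathrm{root} \to \ell)} f(n)
  = \frac{P(\mathrm{root})}{1} \cdot \frac{P(n_1)}{P(\mathrm{root})} \cdots \frac{P(\ell)}{P(n_k)}
  = P(\ell)
```

The product telescopes, so the factors along a path recover the leaf probability. If, as a further assumption, a non-leaf node takes the maximum of its children's probabilities, the highest-probability child always has $f = 1$, and only the nodes where a path branches away from a more probable sibling carry a factor below 1.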

22. A method in a computer system for decoding a linearly encoded pronunciation prefix tree into a pronunciation prefix tree, the linearly encoded pronunciation prefix tree having entries, each entry corresponding to a word and having a factor of a nearest ancestor node with a factor less than 1.0 in the pronunciation prefix tree, the linearly encoded pronunciation prefix tree having an associated listing of the phonemes of each word, the method comprising:
- for each entry, selecting the entry;
- retrieving the list of the phonemes of the word of the selected entry;
- adding a node to the pronunciation prefix tree corresponding to each phoneme in the retrieved list for which a node has not already been added; and
- setting the probability in each added node to the same probability that is the factor of the entry times the probability of the closest common ancestor node of the added nodes.
Dependent claims: 23, 24, 25, 26, 27, 28
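The decoding steps of this claim can be sketched in Python as follows. This is an illustrative reading, not the patented implementation. It assumes each entry is a `(word, factor)` pair, that entries arrive in the order the encoder produced them, that the first entry's factor is the root's probability (root's parent taken as 1.0), and that each word is stored as a leaf below its last phoneme node. The names `Node`, `decode`, and `leaf_probabilities`, and the sample entries, are hypothetical.

```python
class Node:
    """A rebuilt tree node; children are keyed by ("ph", phoneme) or ("word", word)."""
    def __init__(self, prob):
        self.prob = prob
        self.children = {}

def decode(entries, tree_node_dictionary):
    """Rebuild the prefix tree from (word, factor) entries and per-word phoneme lists."""
    root = None
    for word, factor in entries:
        path = [("ph", p) for p in tree_node_dictionary[word]] + [("word", word)]
        if root is None:
            # First entry: the root itself is newly added, so every node on
            # this path receives the entry's factor (root's parent taken as 1.0).
            root = Node(factor)
            added_prob = factor
        else:
            added_prob = None
        node = root
        for key in path:
            child = node.children.get(key)
            if child is None:
                if added_prob is None:
                    # `node` is the closest common ancestor of all nodes
                    # added for this entry.
                    added_prob = factor * node.prob
                child = Node(added_prob)
                node.children[key] = child
            node = child
    return root

def leaf_probabilities(node, out=None):
    """Collect {word: probability} from the leaves of a rebuilt tree."""
    out = {} if out is None else out
    for (kind, label), child in node.children.items():
        if kind == "word":
            out[label] = child.prob
        else:
            leaf_probabilities(child, out)
    return out

# Hypothetical encoded entries and phoneme listing.
entries = [("cat", 0.5), ("cap", 0.25), ("dog", 0.5), ("do", 0.25)]
phonemes = {
    "cat": ["k", "ae", "t"],
    "cap": ["k", "ae", "p"],
    "dog": ["d", "ao", "g"],
    "do": ["d", "uw"],
}
tree = decode(entries, phonemes)
print(leaf_probabilities(tree))
# {'cat': 0.5, 'cap': 0.125, 'dog': 0.25, 'do': 0.0625}
```

For example, when "cap" is decoded, the nodes for "k" and "ae" already exist, so only "p" and the word leaf are added, each with probability 0.25 × 0.5 = 0.125.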

29. A method in a computer system for linearly encoding a tree, the tree having a plurality of nodes, the tree having a root node and leaf nodes, each leaf node having a value, the method comprising:
- for each leaf node, generating a list of an identification of the nodes in a path from the root node to the leaf node;
- setting a value of each non-leaf node to a value based on the value of its child nodes;
- setting a factor of each node to the value of the node divided by the value of a parent node of the node; and
- generating an encoded entry for each leaf node of the tree, the encoded entry identifying the leaf node and containing the factor of a nearest ancestor node with a factor other than a predefined factor.
Dependent claims: 30, 31, 32, 33, 34, 35

36. A computer-readable medium containing instructions for causing a computer system to encode a pronunciation prefix tree, the pronunciation prefix tree having a plurality of nodes representing phonemes that compose words, each leaf node having a probability, by:
- for each leaf node, generating a list of phonemes that compose the words;
- setting a probability of each non-leaf node to a probability based on the probability of its child nodes;
- setting a factor of each node to the probability of the node divided by the probability of a parent node of the node; and
- generating an encoded pronunciation entry for each leaf node of the pronunciation prefix tree, the encoded pronunciation entry indicating the word represented by the leaf node and containing the factor of a nearest ancestor node with a factor other than a predefined factor.
Dependent claims: 37, 38, 39, 40, 41, 42, 43, 44, 45, 46

47. A computer system for encoding a tree, the tree having a plurality of nodes, the tree having a root node and leaf nodes, each leaf node having a value, the computer system comprising:
- a path listing with an identification of the nodes of each path from a root node to a leaf node;
- means for setting a factor of each node to a value of the node divided by a value of the parent node of the node; and
- means for generating an encoded entry for each leaf node of the tree, the encoded entry identifying the leaf node and containing the factor of a nearest ancestor node with a factor other than a predefined factor.
Dependent claims: 48, 49, 50, 51, 52, 53

54. A computer system for recognizing speech comprising:
- a linear encoder for linearly encoding a pronunciation prefix tree;
- a phoneme recognizer for receiving speech to be recognized and identifying phonemes that compose the received speech; and
- a recognizer for identifying the words to which the identified phonemes correspond using probabilities encoded in the linearly encoded pronunciation prefix tree.
Dependent claims: 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65
Specification