System and method for word-sense disambiguation by recursive partitioning
Abstract
A device and related methods for word-sense disambiguation during a text-to-speech conversion are provided. The device, for use with a computer-based system capable of converting text data to synthesized speech, includes an identification module for identifying a homograph contained in the text data. The device also includes an assignment module for assigning a pronunciation to the homograph using a statistical test constructed from a recursive partitioning of training samples, each training sample being a word string containing the homograph. The recursive partitioning is based on determining for each training sample an order and a distance of each word indicator relative to the homograph in the training sample. An absence of one of the word indicators in a training sample is treated as equivalent to the absent word indicator being more than a predefined distance from the homograph.
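The abstract's handling of a missing word indicator can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `indicator_offset`, the pre-tokenized word string, the constant `MAX_DISTANCE` (set to 5 here), and the choice to encode "too far or absent" as a single sentinel value are all hypothetical.

```python
from typing import List

MAX_DISTANCE = 5  # assumed value for the "predefined distance"


def indicator_offset(tokens: List[str], homograph: str, indicator: str,
                     max_distance: int = MAX_DISTANCE) -> int:
    """Return the signed offset of the nearest `indicator` from `homograph`.

    A negative value means the indicator precedes the homograph (order),
    and the magnitude is the number of word positions between them (distance).
    An absent indicator, or one farther away than `max_distance`, maps to the
    sentinel `max_distance + 1`, so absence is treated the same as "too far".
    """
    # Assumption: the sample contains the homograph; the first occurrence is used.
    h = tokens.index(homograph)
    offsets = [i - h for i, tok in enumerate(tokens) if tok == indicator and i != h]
    if not offsets:
        return max_distance + 1          # indicator absent
    nearest = min(offsets, key=abs)
    if abs(nearest) > max_distance:
        return max_distance + 1          # indicator present but too far away
    return nearest
```

For the tokens of "he played the bass guitar", the indicator "guitar" yields 1, "played" yields -2, and an absent indicator such as "fish" yields 6, one more than the assumed maximum distance. Because every sample receives a value, no training sample has to be discarded when an indicator is missing.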
24 Claims
1. A method of constructing a test for use in electronically disambiguating a homograph during a computer-based text-to-speech event, the method comprising:
using at least one processor to construct a decision tree for determining a pronunciation label for the homograph in an input word string, the decision tree comprising at least first and second nodes, the first node being a parent of the second node, wherein the at least one processor is configured to construct the decision tree at least in part by:
accessing a first set of training samples, each of the training samples comprising a word string that contains the homograph and a pronunciation label indicating a correct pronunciation of the homograph in the word string;
applying a plurality of decision rules to the first set of training samples, each of the plurality of decision rules partitioning the first set of training samples into at least two subsets of the first set of training samples;
for each one of the plurality of decision rules, computing a corresponding measure of impurity indicative of an extent to which each of the at least two subsets formed by applying the one of the plurality of decision rules contains training samples associated with different pronunciation labels, wherein the one of the plurality of decision rules, when applied to word strings in the first set of training samples, determines whether at least one selected word indicator is present in the word strings, and wherein at least one training sample in the first set of training samples is retained for computing the measure of impurity corresponding to the one of the plurality of decision rules even if the at least one selected word indicator is absent in the word string of the at least one training sample; and
selecting, for the first node of the decision tree, a decision rule from the plurality of decision rules based at least in part on the measures of impurity computed for the plurality of decision rules.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
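The rule-selection step recited in claim 1 can be illustrated with a short sketch. It assumes Gini impurity as the "measure of impurity" (the claim does not name a specific measure) and simple word-presence rules; the names `gini`, `split_impurity`, `select_rule`, and `has_word` are hypothetical and chosen only for this example.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Sample = Tuple[List[str], str]      # (tokens of the word string, pronunciation label)
Rule = Callable[[List[str]], bool]  # True if the selected word indicator is present


def gini(labels: List[str]) -> float:
    """Gini impurity of a label list: 0.0 when every label is the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(samples: List[Sample], rule: Rule) -> float:
    """Size-weighted impurity of the two subsets produced by `rule`.

    Samples whose word string lacks the indicator are not dropped: they
    form the "absent" subset and still contribute to the measure.
    """
    n = len(samples)
    if n == 0:
        return 0.0
    present = [label for tokens, label in samples if rule(tokens)]
    absent = [label for tokens, label in samples if not rule(tokens)]
    return (len(present) / n) * gini(present) + (len(absent) / n) * gini(absent)


def select_rule(samples: List[Sample], rules: Dict[str, Rule]) -> str:
    """Return the name of the rule with the lowest weighted impurity."""
    return min(rules, key=lambda name: split_impurity(samples, rules[name]))


def has_word(word: str) -> Rule:
    """Build a rule that tests whether `word` occurs in the word string."""
    return lambda tokens: word in tokens
```

A rule that separates the pronunciation labels cleanly scores 0.0 and is preferred over a rule that leaves mixed labels in either subset. Other impurity measures, such as entropy, could be substituted in `gini` without changing the selection logic.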
9. A system for constructing a test for use in electronically disambiguating a homograph during a computer-based text-to-speech event, the system comprising:
an input for receiving a plurality of training samples, each training sample comprising a word string containing the homograph and a pronunciation label indicating a correct pronunciation of the homograph in the word string; and
at least one computer coupled to the input to receive the plurality of training samples, the at least one computer programmed to construct a decision tree for determining a pronunciation label for the homograph in an input word string, the decision tree comprising at least first and second nodes, the first node being a parent of the second node, wherein the at least one computer is programmed to construct the decision tree at least in part by:
accessing a first set of training samples, each of the training samples comprising a word string that contains the homograph and a pronunciation label indicating a correct pronunciation of the homograph in the word string;
applying a plurality of decision rules to the first set of training samples, each of the plurality of decision rules partitioning the first set of training samples into at least two subsets of the first set of training samples;
for each one of the plurality of decision rules, computing a corresponding measure of impurity indicative of an extent to which each of the at least two subsets formed by applying the one of the plurality of decision rules contains training samples associated with different pronunciation labels, wherein the one of the plurality of decision rules, when applied to word strings in the first set of training samples, determines whether at least one selected word indicator is present in the word strings, and wherein at least one training sample in the first set of training samples is retained for computing the measure of impurity corresponding to the one of the plurality of decision rules even if the at least one selected word indicator is absent in the word string of the at least one training sample; and
selecting, for the first node of the decision tree, a decision rule from the plurality of decision rules based at least in part on the measures of impurity computed for the plurality of decision rules.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. At least one machine readable memory, having stored thereon a computer program having a plurality of code sections executable by at least one machine for causing the at least one machine to perform a computer-implemented method for constructing a test for use in disambiguating a homograph during a computer-based text-to-speech event, the method comprising steps of:
using at least one processor to construct a decision tree for determining a pronunciation label for the homograph in an input word string, the decision tree comprising at least first and second nodes, the first node being a parent of the second node, wherein the at least one processor is configured to construct the decision tree at least in part by:
accessing a first set of training samples, each of the training samples comprising a word string that contains the homograph and a pronunciation label indicating a correct pronunciation of the homograph in the word string;
applying a plurality of decision rules to the first set of training samples, each of the plurality of decision rules partitioning the first set of training samples into at least two subsets of the first set of training samples;
for each one of the plurality of decision rules, computing a corresponding measure of impurity indicative of an extent to which each of the at least two subsets formed by applying the one of the plurality of decision rules contains training samples associated with different pronunciation labels, wherein the one of the plurality of decision rules, when applied to word strings in the first set of training samples, determines whether at least one selected word indicator is present in the word strings, and wherein at least one training sample in the first set of training samples is retained for computing the measure of impurity corresponding to the one of the plurality of decision rules even if the at least one selected word indicator is absent in the word string of the at least one training sample; and
selecting, for the first node of the decision tree, a decision rule from the plurality of decision rules based at least in part on the measures of impurity computed for the plurality of decision rules.
Dependent claims: 18, 19, 20, 21, 22, 23, 24
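The parent and child nodes recited in the independent claims arise when the selection step is applied recursively to each subset. The sketch below shows one way such a tree could be grown and then used to assign a pronunciation label to a new word string. It is an assumption-laden illustration rather than the claimed method: `Node`, `build`, and `classify` are hypothetical names, Gini impurity stands in for the unspecified impurity measure, rules are limited to word-presence tests, and the training set is assumed to be non-empty.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple

Sample = Tuple[List[str], str]  # (tokens of the word string, pronunciation label)


@dataclass
class Node:
    label: str                         # majority pronunciation label at this node
    indicator: Optional[str] = None    # word tested at this node; None marks a leaf
    present: Optional["Node"] = None   # child for strings containing the indicator
    absent: Optional["Node"] = None    # child for strings lacking the indicator


def gini(labels: List[str]) -> float:
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0


def build(samples: List[Sample], indicators: List[str]) -> Node:
    """Recursively partition `samples` (assumed non-empty) into a tree."""
    labels = [label for _, label in samples]
    node = Node(label=Counter(labels).most_common(1)[0][0])
    if len(set(labels)) == 1:
        return node                    # pure subset: stop splitting
    best = None
    for word in indicators:
        yes = [s for s in samples if word in s[0]]
        no = [s for s in samples if word not in s[0]]  # absent samples are kept
        if not yes or not no:
            continue                   # this rule does not partition the samples
        score = (len(yes) * gini([lab for _, lab in yes])
                 + len(no) * gini([lab for _, lab in no])) / len(samples)
        if best is None or score < best[0]:
            best = (score, word, yes, no)
    if best is None:
        return node                    # no useful rule left: keep as a leaf
    _, node.indicator, yes, no = best
    node.present = build(yes, indicators)
    node.absent = build(no, indicators)
    return node


def classify(node: Node, tokens: List[str]) -> str:
    """Walk from the root to a leaf and return its pronunciation label."""
    while node.indicator is not None:
        node = node.present if node.indicator in tokens else node.absent
    return node.label
```

Each call to `build` selects the lowest-impurity rule for the current node and recurses on the two resulting subsets, so the root is the parent of the nodes built from its subsets. Recursion ends because every accepted split produces two strictly smaller, non-empty subsets.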
Specification