Discriminative syntactic word order model for machine translation
Abstract
A discriminatively trained word order model is used to identify a most likely word order from a set of word orders for target words translated from a source sentence. For each word order in the set, the discriminatively trained word order model uses features based on information in a source dependency tree and a target dependency tree and features based on the order of words in the word order. The discriminatively trained statistical model is trained by determining a translation metric for each of a set of N-best word orders for a set of target words. Each of the N-best word orders is projective with respect to a target dependency tree, and the N-best word orders are selected using a combination of an n-gram language model and a local tree order model.
17 Claims
1. A method comprising:
- forming a source dependency tree for a source sentence in a source language, the source dependency tree indicating the syntactic hierarchy of source words in the source sentence;
- forming a target dependency tree indicating the syntactic hierarchy of target words in a target language that are translations of source words in the source sentence;
- identifying a plurality of target word orders after forming the target dependency tree, where each target word order contains the words in the target dependency tree, wherein identifying a plurality of target word orders after forming the target dependency tree comprises restricting the word orders in the plurality of word orders to word orders that are projective with respect to the target dependency tree such that each word and the word's descendants in the target dependency tree form a contiguous subsequence in the word order;
- identifying N-best target word orders from the plurality of target word orders by determining a score for each target word order by combining a language model probability that provides a probability of the target words appearing in a surface form in that order and a local tree order model probability that provides a probability for positions of words assigned to child nodes relative to positions of words assigned to head nodes in the target dependency tree;
- using a discriminatively trained word order model to identify a most likely target word order from the N-best target word orders, wherein for each target word order in the N-best target word orders, the discriminatively trained word order model uses features based on information in the source dependency tree and the target dependency tree and features based on the order of words in the target word order.

Dependent claims: 2, 3, 4, 5
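The projectivity restriction and N-best selection recited in claim 1 can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the toy tree `TREE`, the bigram table `BIGRAM_LOGP`, the head/child table `P_CHILD_BEFORE`, the brute-force enumeration, and the weighted sum of log probabilities used to combine the two models.

```python
import heapq
import math
from itertools import permutations, product

# Hypothetical toy target dependency tree: head word -> child words.
TREE = {"ate": ["cat", "fish"], "cat": ["the"], "fish": [], "the": []}
ROOT = "ate"

def projective_orders(tree, node):
    """Yield every order of node's subtree in which each word and its
    descendants form one contiguous block (a projective order)."""
    child_options = [list(projective_orders(tree, child)) for child in tree[node]]
    for child_blocks in product(*child_options):
        # The head is one more block arranged among its children's blocks.
        blocks = [(node,)] + list(child_blocks)
        for arrangement in permutations(blocks):
            yield tuple(word for block in arrangement for word in block)

# Assumed toy bigram log probabilities; unseen bigrams get a flat penalty.
BIGRAM_LOGP = {("<s>", "the"): -0.5, ("the", "cat"): -0.4,
               ("cat", "ate"): -0.7, ("ate", "fish"): -0.6}
UNSEEN_LOGP = -5.0

def lm_logprob(order):
    """Log probability of the surface order under the toy bigram model."""
    words = ("<s>",) + order
    return sum(BIGRAM_LOGP.get(pair, UNSEEN_LOGP) for pair in zip(words, words[1:]))

# Assumed toy local tree order model: P(child precedes head) per (head, child).
P_CHILD_BEFORE = {("ate", "cat"): 0.9, ("ate", "fish"): 0.1, ("cat", "the"): 0.95}

def tree_order_logprob(tree, order):
    """Log probability of each child's position relative to its head."""
    pos = {word: i for i, word in enumerate(order)}
    total = 0.0
    for head, children in tree.items():
        for child in children:
            p_before = P_CHILD_BEFORE[(head, child)]
            total += math.log(p_before if pos[child] < pos[head] else 1.0 - p_before)
    return total

def n_best(tree, root, n, lm_weight=1.0, tree_weight=1.0):
    """Return the n highest-scoring projective orders as (score, order) pairs."""
    scored = ((lm_weight * lm_logprob(order)
               + tree_weight * tree_order_logprob(tree, order), order)
              for order in projective_orders(tree, root))
    return heapq.nlargest(n, scored)
```

With this toy data, the tree has 12 projective orders out of 24 permutations, and `n_best(TREE, ROOT, 3)` ranks `("the", "cat", "ate", "fish")` first. A practical system would presumably search the space rather than enumerate it, since the number of projective orders grows quickly with tree size.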
6. A method comprising:
- receiving a target dependency tree comprising target words translated from source words of a source dependency tree, the target dependency tree and the source dependency tree providing hierarchical relationships between words;
- forming a plurality of word orders for the target words after receiving the target dependency tree by restricting the word orders in the plurality of word orders to those word orders that are projective with respect to the received target dependency tree such that each word and the word's descendants in the target dependency tree form a contiguous subsequence in the word order;
- scoring the plurality of word orders for the target words to form word order scores, wherein scoring a word order in the plurality of word orders comprises:
  - determining an n-gram language model probability for the word order; and
  - determining a local tree order model probability that is based on information from the source dependency tree, the target dependency tree and the word order;
- using the word order scores to select a smaller set of word orders for the target words from the plurality of word orders for the target words;
- using the smaller set of word orders of the target words to discriminatively train a model for selecting orders of target words by:
  - determining a translation metric score for each word order in the smaller set of word orders;
  - selecting a word order from the smaller set of word orders based on the translation metric score; and
  - discriminatively training parameters of the model to optimize an objective function computed for the selected order by minimizing a negative log-likelihood of the selected order, wherein the log-likelihood comprises the log of a score for the selected word order over a sum of scores for each word order in the smaller set of word orders; and
- storing the parameters for the model on a computer-readable storage medium for use in ordering words in translations.

Dependent claims: 7, 8
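The training objective in claim 6 (the negative log of the selected order's score over the sum of scores for all orders in the smaller set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the score of an order is assumed to be `exp(weights · features)`, which is a log-linear form the claim does not spell out; the feature vectors and metric scores are made-up toy values; plain gradient descent on a single sentence stands in for whatever optimizer is actually used; and the names `nll_and_grad` and `train` are hypothetical.

```python
import math

def nll_and_grad(weights, feature_vectors, selected):
    """Negative log-likelihood of the selected word order, and its gradient.

    Assumes score(order) = exp(weights . features(order)), so the likelihood
    is that score divided by the sum of scores over the candidate set.
    """
    logits = [sum(w * f for w, f in zip(weights, fv)) for fv in feature_vectors]
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    loss = -math.log(probs[selected])
    # Gradient: expected feature vector minus the selected order's features.
    grad = [sum(p * fv[k] for p, fv in zip(probs, feature_vectors))
            - feature_vectors[selected][k]
            for k in range(len(weights))]
    return loss, grad

def train(feature_vectors, metric_scores, steps=200, learning_rate=0.5):
    """Pick the order with the best translation metric score, then fit
    weights by gradient descent on the negative log-likelihood."""
    selected = max(range(len(metric_scores)), key=metric_scores.__getitem__)
    weights = [0.0] * len(feature_vectors[0])
    for _ in range(steps):
        _, grad = nll_and_grad(weights, feature_vectors, selected)
        weights = [w - learning_rate * g for w, g in zip(weights, grad)]
    return weights, selected

# Toy N-best list for one sentence: one feature vector and one metric
# score (for example a sentence-level BLEU value) per candidate order.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
metric = [0.2, 0.8, 0.5]
weights, selected = train(features, metric)
```

With all weights at zero, the three candidates are equally likely and the loss equals `log(3)`. Training lowers the loss by shifting probability mass toward the candidate with the highest metric score. A full trainer would sum this loss over many sentences before the learned parameters are stored.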
9. A computer-readable storage medium having computer-executable instructions for performing steps comprising:
- receiving a target dependency tree for a target sentence having words ordered in a first word order, wherein the received target dependency tree is formed based on a machine translation from a source language to a target language;
- reordering words in the target sentence to identify a second word order through steps comprising:
  - while limiting consideration of word orders to those word orders that are projective with respect to the received target dependency tree, identifying a first set of possible word orders;
  - scoring each of the possible word orders in the set of possible word orders;
  - selecting N-best scoring word orders from the set of possible word orders;
  - determining likelihoods for each of the N-best scoring word orders using feature values derived from the received target dependency tree and the N-best scoring word orders; and
  - selecting one of the word orders from the N-best scoring word orders based on the likelihoods.

Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17
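The final two steps of claim 9, computing a likelihood for each N-best order from tree-derived feature values and selecting one order, might look like the sketch below. The three features, the weight values, and the function names are hypothetical choices for illustration only; the claim does not specify which features are used, and the softmax normalization is an assumption.

```python
import math

# Hypothetical toy target dependency tree: head word -> child words.
TREE = {"ate": ["cat", "fish"], "cat": ["the"], "fish": [], "the": []}

def extract_features(tree, order):
    """Illustrative feature values derived from the tree and a word order:
    children placed before their head, children placed after their head,
    and the total head-to-child distance in the surface order."""
    pos = {word: i for i, word in enumerate(order)}
    pairs = [(head, child) for head, children in tree.items() for child in children]
    before = sum(1 for head, child in pairs if pos[child] < pos[head])
    distance = sum(abs(pos[child] - pos[head]) for head, child in pairs)
    return [float(before), float(len(pairs) - before), float(distance)]

def likelihoods(weights, tree, candidates):
    """Likelihood of each candidate order, normalized over the N-best list."""
    logits = [sum(w * f for w, f in zip(weights, extract_features(tree, order)))
              for order in candidates]
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def rerank(weights, tree, candidates):
    """Select the candidate order with the highest likelihood."""
    probs = likelihoods(weights, tree, candidates)
    best_index = max(range(len(candidates)), key=probs.__getitem__)
    return candidates[best_index], probs

# Assumed N-best list and made-up weights standing in for trained parameters.
N_BEST = [("fish", "ate", "the", "cat"),
          ("the", "cat", "ate", "fish"),
          ("cat", "the", "ate", "fish")]
WEIGHTS = [1.0, 0.2, -0.5]
best_order, probs = rerank(WEIGHTS, TREE, N_BEST)
```

With these made-up weights, `best_order` is `("the", "cat", "ate", "fish")`. In the claimed method the weights would come from discriminative training rather than being set by hand.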
Specification