Discriminative syntactic word order model for machine translation

US 8,452,585 B2
Filed: 04/02/2008
Issued: 05/28/2013
Est. Priority Date: 06/21/2007
Status: Active Grant

Abstract

A discriminatively trained word order model is used to identify a most likely word order from a set of word orders for target words translated from a source sentence. For each set of word orders, the discriminatively trained word order model uses features based on information in a source dependency tree and a target dependency tree and features based on the order of words in the word order. The discriminatively trained statistical model is trained by determining a translation metric for each of a set of N-best word orders for a set of target words. Each of the N-best word orders is projective with respect to a target dependency tree, and the N-best word orders are selected using a combination of an n-gram language model and a local tree order model.
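
The N-best selection step described in the abstract can be illustrated with a minimal sketch. It assumes the two model scores are combined by adding log-probabilities with a tunable weight; the abstract only says the models are combined, so that form is an assumption. The toy bigram table, the `HEAD_OF` map, the constant `P_CHILD_BEFORE_HEAD`, and all function names are hypothetical stand-ins, not trained models or names from the patent.

```python
import heapq
import math

def n_best_orders(orders, lm_logprob, tree_order_logprob, n, weight=1.0):
    """Return the n highest-scoring word orders under a combined score.

    Assumption: the score is the language model log-probability plus a
    weighted local tree order model log-probability.
    """
    def score(order):
        return lm_logprob(order) + weight * tree_order_logprob(order)
    return heapq.nlargest(n, orders, key=score)

# Toy bigram language model (hypothetical values).
BIGRAM_LOGPROB = {
    ("the", "cat"): math.log(0.5),
    ("cat", "sleeps"): math.log(0.4),
}
UNSEEN_LOGPROB = math.log(0.01)

def toy_lm_logprob(order):
    # Sum bigram log-probabilities over adjacent word pairs.
    return sum(BIGRAM_LOGPROB.get(pair, UNSEEN_LOGPROB)
               for pair in zip(order, order[1:]))

# Toy local tree order model: each child precedes its head with a fixed
# probability (hypothetical simplification).
HEAD_OF = {"the": "cat", "cat": "sleeps"}  # child -> head
P_CHILD_BEFORE_HEAD = 0.9

def toy_tree_order_logprob(order):
    position = {word: i for i, word in enumerate(order)}
    total = 0.0
    for child, head in HEAD_OF.items():
        before = position[child] < position[head]
        total += math.log(P_CHILD_BEFORE_HEAD if before
                          else 1.0 - P_CHILD_BEFORE_HEAD)
    return total

# The four projective orders of the chain sleeps -> cat -> the.
candidates = [
    ["the", "cat", "sleeps"],
    ["sleeps", "the", "cat"],
    ["cat", "the", "sleeps"],
    ["sleeps", "cat", "the"],
]
top2 = n_best_orders(candidates, toy_lm_logprob, toy_tree_order_logprob, n=2)
print(top2)
```

With these toy numbers, the order that matches both the bigram table and the child-before-head preference ranks first.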

17 Claims

1. A method comprising:
- forming a source dependency tree for a source sentence in a source language, the source dependency tree indicating the syntactic hierarchy of source words in the source sentence;
  
  forming a target dependency tree indicating the syntactic hierarchy of target words in a target language that are translations of source words in the source sentence;
  
  identifying a plurality of target word orders after forming the target dependency tree, where each target word order contains the words in the target dependency tree, wherein identifying a plurality of target word orders after forming the target dependency tree comprises restricting the word orders in the plurality of word orders to word orders that are projective with respect to the target dependency tree such that each word and the word's descendants in the target dependency tree form a contiguous subsequence in the word order;
  
  identifying N-best target word orders from the plurality of target word orders by determining a score for each target word order by combining a language model probability that provides a probability of the target words appearing in a surface form in that order and a local tree order model probability that provides a probability for positions of words assigned to child nodes relative to positions of words assigned to head nodes in the target dependency tree;
  
  using a discriminatively trained word order model to identify a most likely target word order from the N-best target word orders, wherein for each target word order in the N-best target word orders, the discriminatively trained word order model uses features based on information in the source dependency tree and the target dependency tree and features based on the order of words in the target word order.
- Dependent claims (2, 3, 4, 5):
  - 2. The method of claim 1 wherein the features based on information in the source dependency tree and the target dependency tree use part of speech tags for words in the source dependency tree and part of speech tags for words in the target dependency tree.
  - 3. The method of claim 2 wherein the discriminatively trained word order model uses as features the language model probability of the target words appearing in a surface form in the target word order and a local tree order model probability that provides a probability for positions of words assigned to child nodes relative to positions of words assigned to head nodes in the target dependency tree.
  - 4. The method of claim 1 wherein the features based on the order of words in the target word order comprise features that indicate how words in the source sentence are positioned relative to each other where the words are translations of two contiguous target words in the target word order.
  - 5. The method of claim 1 wherein the discriminatively trained word order model is trained based on N-best target word orders identified by scoring each of a plurality of word orders based on a language model probability that provides a probability of the target words appearing in a surface form in that order and a local tree order model probability that provides a probability for positions of words assigned to child nodes relative to positions of words assigned to head nodes in a target dependency tree.
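
The projectivity restriction in claim 1 (each word and its descendants form a contiguous subsequence) can be illustrated with a short sketch. It is a brute-force enumeration for a toy tree, not the search procedure used in the patent; the `tree` dictionary layout (head mapped to a list of children) and the function name `projective_orders` are assumptions made for this example.

```python
from itertools import permutations, product

def projective_orders(tree, head):
    """Yield every order of the subtree rooted at `head` in which each word
    and its descendants occupy a contiguous span."""
    children = tree.get(head, [])
    # Recursively collect the projective orders of each child's subtree.
    child_orders = [list(projective_orders(tree, child)) for child in children]
    # Pick one order per child subtree, then arrange the head and the
    # child subtrees as indivisible blocks.
    for choice in product(*child_orders):
        blocks = [[head]] + [list(order) for order in choice]
        for arrangement in permutations(blocks):
            yield [word for block in arrangement for word in block]

# Hypothetical toy tree: "ate" heads "cat" and "fish"; "cat" heads "the".
tree = {"ate": ["cat", "fish"], "cat": ["the"]}
orders = list(projective_orders(tree, "ate"))
print(len(orders))  # 12 of the 24 possible permutations are projective
```

Because "the" must stay adjacent to "cat", an order such as "the ate cat fish" is never generated. Exhaustive enumeration grows quickly with tree size, so a practical system would need a more selective search.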

6. A method comprising:
- receiving a target dependency tree comprising target words translated from source words of a source dependency tree, the target dependency tree and the source dependency tree providing hierarchical relationships between words;
  
  forming a plurality of word orders for the target words after receiving the target dependency tree by restricting the word orders in the plurality of word orders to those word orders that are projective with respect to the received target dependency tree such that each word and the word's descendants in the target dependency tree form a contiguous subsequence in the word order;
  
  scoring the plurality of word orders for the target words to form word order scores, wherein scoring a word order in the plurality of word orders comprises:
  
  determining an n-gram language model probability for the word order; and
  
  determining a local tree order model probability that is based on information from the source dependency tree, the target dependency tree and the word order;
  
  using the word order scores to select a smaller set of word orders for the target words from the plurality of word orders for the target words;
  
  using the smaller set of word orders of the target words to discriminatively train a model for selecting orders of target words by:
  
  determining a translation metric score for each word order in the smaller set of word orders;
  
  selecting a word order from the smaller set of word orders based on the translation metric score; and
  
  discriminatively training parameters of the model to optimize an objective function computed for the selected order by minimizing a negative log-likelihood of the selected order, wherein the log-likelihood comprises the log of a score for the selected word order over a sum of scores for each word order in the smaller set of word orders; and
  
  storing the parameters for the model on a computer-readable storage medium for use in ordering words in translations.
- Dependent claims (7, 8):
  - 7. The method of claim 6 wherein the target words are translated from the source words using machine translation.
  - 8. The method of claim 7 wherein some target words may be incorrect translations of source words.
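
The training objective in claim 6 (the negative log of the selected order's score over the sum of scores for all orders in the smaller set) can be sketched as follows. The sketch assumes a log-linear score, exp(w · f), and plain gradient descent; the claim does not fix either choice. The feature vectors, the metric scores, and all function names are hypothetical.

```python
import math

def score(weights, features):
    # Assumed log-linear score: exp of the weighted feature sum.
    return math.exp(sum(w * f for w, f in zip(weights, features)))

def negative_log_likelihood(weights, feature_vectors, selected):
    """-log( score(selected) / sum of scores over the candidate set )."""
    scores = [score(weights, f) for f in feature_vectors]
    return -math.log(scores[selected] / sum(scores))

def gradient(weights, feature_vectors, selected):
    # Gradient of the loss: expected features minus selected features.
    scores = [score(weights, f) for f in feature_vectors]
    total = sum(scores)
    probs = [s / total for s in scores]
    grad = []
    for k in range(len(weights)):
        expected = sum(p * f[k] for p, f in zip(probs, feature_vectors))
        grad.append(expected - feature_vectors[selected][k])
    return grad

def train(feature_vectors, selected, steps=200, learning_rate=0.5):
    weights = [0.0] * len(feature_vectors[0])
    for _ in range(steps):
        grad = gradient(weights, feature_vectors, selected)
        weights = [w - learning_rate * g for w, g in zip(weights, grad)]
    return weights

# Toy candidate set: one feature vector and one translation metric score
# per word order (hypothetical values).
feature_vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
metric_scores = [0.8, 0.3, 0.5]

# Select the order with the best translation metric score.
selected = max(range(len(metric_scores)), key=metric_scores.__getitem__)

initial_loss = negative_log_likelihood([0.0, 0.0], feature_vectors, selected)
weights = train(feature_vectors, selected)
final_loss = negative_log_likelihood(weights, feature_vectors, selected)
print(initial_loss, final_loss)
```

With all weights at zero the three candidates are equally likely, so the initial loss is log 3; training lowers the loss by shifting probability toward the selected order. A real system would sum this loss over many training sentences.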

9. A computer-readable storage medium having computer-executable instructions for performing steps comprising:
- receiving a target dependency tree for a target sentence having words ordered in a first word order, wherein the received target dependency tree is formed based on a machine translation from a source language to a target language;
  
  reordering words in the target sentence to identify a second word order through steps comprising:
  
  while limiting consideration of word orders to those word orders that are projective with respect to the received target dependency tree, identifying a first set of possible word orders;
  
  scoring each of the possible word orders in the set of possible word orders;
  
  selecting N-best scoring word orders from the set of possible word orders;
  
  determining likelihoods for each of the N-best scoring word orders using feature values derived from the received target dependency tree and the N-best scoring word orders; and
  
  selecting one of the word orders from the N-best scoring word orders based on the likelihoods.
- Dependent claims (10, 11, 12, 13, 14, 15, 16, 17):
  - 10. The computer-readable storage medium of claim 9 further comprising identifying a source dependency tree for the source sentence.
  - 11. The computer-readable storage medium of claim 10 wherein scoring each of the possible word orders comprises using a combination of an n-gram language model and a local tree order model.
  - 12. The computer-readable storage medium of claim 11 wherein the local tree order model provides a score based on the source dependency tree, the target dependency tree and the word order of target words.
  - 13. The computer-readable storage medium of claim 12 wherein determining likelihoods for the N-best word orders comprises using feature weights that have been discriminatively trained based on word orders scored using a translation metric.
  - 14. The computer-readable storage medium of claim 13 wherein the feature values use values for part of speech features for target words.
  - 15. The computer-readable storage medium of claim 14 wherein the feature values use values for part of speech features for source words.
  - 16. The computer-readable storage medium of claim 15 wherein the feature values comprise the score from the combination of the n-gram language model and the local tree order model.
  - 17. The computer-readable storage medium of claim 16 wherein two contiguous target words in a word order of target words are respective translations of two source words in the source sentence and wherein the feature values comprise values for features that describe the position of the two source words relative to each other in the source sentence.
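
The final selection step in claims 9 to 17 can be sketched as a reranker over the N-best list. The sketch assumes a softmax over weighted feature sums and uses two kinds of feature values named in the dependent claims: the first-pass combined score (claim 16) and the relative source position of the words aligned to two contiguous target words (claim 17). The feature names, the weights, and the `alignment` map from target word to source position are hypothetical.

```python
import math

def displacement_features(order, alignment):
    """Count adjacent target word pairs whose aligned source words keep
    their relative order (monotone) or reverse it (swapped)."""
    feats = {"monotone": 0, "swapped": 0}
    for left, right in zip(order, order[1:]):
        if alignment[left] < alignment[right]:
            feats["monotone"] += 1
        else:
            feats["swapped"] += 1
    return feats

def likelihoods(n_best, base_scores, alignment, weights):
    """Return one normalized likelihood per word order in the N-best list."""
    logits = []
    for order, base in zip(n_best, base_scores):
        feats = displacement_features(order, alignment)
        feats["base_score"] = base  # first-pass combined model score
        logits.append(sum(weights[name] * value
                          for name, value in feats.items()))
    # Subtract the maximum before exponentiating for numerical stability.
    largest = max(logits)
    exps = [math.exp(logit - largest) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy inputs (hypothetical): target word -> position of its source word.
alignment = {"the": 0, "cat": 1, "sleeps": 2}
n_best = [["the", "cat", "sleeps"], ["sleeps", "the", "cat"]]
base_scores = [math.log(0.162), math.log(0.00045)]
weights = {"monotone": 0.5, "swapped": -0.5, "base_score": 1.0}

probs = likelihoods(n_best, base_scores, alignment, weights)
best = n_best[max(range(len(probs)), key=probs.__getitem__)]
print(best)
```

In a full system the weights would come from discriminative training against a translation metric, as claim 13 describes, rather than being set by hand.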

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Toutanova, Kristina Nikolova; Chang, Pi-Chuan
Primary Examiner(s): Saint Cyr, Leonard
Application Number: US12/061,313
Publication Number: US 20080319736A1
Time in Patent Office: 1,882 Days
Field of Search: None
US Class Current: 704/9
CPC Class Codes: G06F 40/44 Statistical methods, e.g. p...
