Method and apparatus for bilingual word alignment, method and apparatus for training bilingual word alignment model

  • US 7,827,027 B2
  • Filed: 02/23/2007
  • Issued: 11/02/2010
  • Est. Priority Date: 02/28/2006
  • Status: Expired due to Fees
First Claim

1. A method for bilingual word alignment by a processor executing instructions, comprising:

    training a bilingual word alignment model using a word-aligned labeled bilingual corpus;

    word-aligning a plurality of bilingual sentence pairs in an unlabeled bilingual corpus using said bilingual word alignment model;

    determining whether the word alignment of each of said plurality of bilingual sentence pairs is correct, and when the word alignment is correct, adding the bilingual sentence pair into the labeled bilingual corpus and removing the bilingual sentence pair from the unlabeled bilingual corpus;

    retraining the bilingual word alignment model using the expanded labeled bilingual corpus; and

    re-word-aligning the remaining bilingual sentence pairs in the unlabeled bilingual corpus using the retrained bilingual word alignment model,

said step of training a bilingual word alignment model comprising:

    training a forward bilingual word alignment model using the word-aligned labeled bilingual corpus; and

    training a backward bilingual word alignment model using the word-aligned labeled bilingual corpus,

said step of word-aligning a plurality of bilingual sentence pairs in an unlabeled bilingual corpus comprising:

    forward-word-aligning each of said plurality of bilingual sentence pairs using said forward bilingual word alignment model; and

    backward-word-aligning each of said plurality of bilingual sentence pairs using said backward bilingual word alignment model, and

said step of determining whether the word alignment of each of said plurality of bilingual sentence pairs is correct comprising:

    calculating an intersection set between the forward-word-aligning result and the backward-word-aligning result of the bilingual sentence pair;

    calculating a union set between the forward-word-aligning result and the backward-word-aligning result of the bilingual sentence pair; and

    determining, when the ratio of an element number of said intersection set to an element number of said union set is greater than a predetermined threshold, the word alignment of said bilingual sentence pair is correct.
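
The correctness test and the retraining loop recited in the claim can be illustrated with a minimal sketch. This is not the patentee's implementation. It assumes that an alignment is a set of (source index, target index) links, that backward links have already been flipped into that same orientation, and that a caller supplies a hypothetical `train` function returning a forward and a backward aligner. The names `agreement_ratio`, `is_alignment_correct`, and `bootstrap` are illustrative, and storing the intersection as the label of an accepted pair is an assumption the claim does not specify.

```python
from typing import Callable, List, Set, Tuple

Link = Tuple[int, int]               # (source word index, target word index)
Pair = Tuple[List[str], List[str]]   # (source tokens, target tokens)
Aligner = Callable[[Pair], Set[Link]]
Labeled = Tuple[Pair, Set[Link]]     # sentence pair plus its alignment links


def agreement_ratio(forward: Set[Link], backward: Set[Link]) -> float:
    """Return |intersection| / |union| of the two alignment results."""
    union = forward | backward
    if not union:
        # Assumption: a pair with no links at all counts as no agreement.
        return 0.0
    return len(forward & backward) / len(union)


def is_alignment_correct(forward: Set[Link], backward: Set[Link],
                         threshold: float) -> bool:
    """Accept the pair only when the ratio is strictly above the threshold."""
    return agreement_ratio(forward, backward) > threshold


def bootstrap(labeled: List[Labeled], unlabeled: List[Pair],
              train: Callable[[List[Labeled]], Tuple[Aligner, Aligner]],
              threshold: float):
    """One round: train, align, move accepted pairs, retrain, re-align the rest."""
    labeled = list(labeled)  # work on a copy of the labeled corpus
    forward, backward = train(labeled)

    remaining: List[Pair] = []
    for pair in unlabeled:
        f, b = forward(pair), backward(pair)
        if is_alignment_correct(f, b, threshold):
            # Assumption: keep the links both directions agree on.
            labeled.append((pair, f & b))
        else:
            remaining.append(pair)

    # Retrain on the expanded labeled corpus, then re-align what is left.
    forward, backward = train(labeled)
    realigned = [(pair, forward(pair), backward(pair)) for pair in remaining]
    return labeled, realigned
```

For example, a forward result `{(0, 0), (1, 1), (2, 2)}` and a backward result `{(0, 0), (1, 1), (2, 3)}` share two links out of four distinct links, so the ratio is 0.5. The pair is accepted at a threshold of 0.4 but not at 0.5, because the claim requires the ratio to be greater than the threshold.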
