Semi-supervised training for statistical word alignment

  • US 8,433,556 B2
  • Filed: 11/02/2006
  • Issued: 04/30/2013
  • Est. Priority Date: 11/02/2006
  • Status: Active Grant
First Claim

1. A method for aligning words in parallel segments, the method comprising:

    calculating a first probability distribution, utilizing a processor and a memory, according to a model estimate of word alignments within a first corpus comprising word-level unaligned parallel segments, the model estimate comprising an N-best list of one or more sub-models;

    modifying the model estimate according to the first probability distribution;

    discriminatively re-ranking one or more sub-models associated with the modified model estimate according to word-level annotated parallel segments; and

    calculating a second probability distribution of the word alignments within the first corpus according to the re-ranked sub-models associated with the modified model estimate;

    wherein discriminatively re-ranking one or more sub-models within the modified model estimate according to manual alignments further comprises:

    adding manual alignments to hypothesized alignments within the first corpus;

    comparing the manual alignments to the hypothesized alignments; and

    weighting the one or more sub-models according to the comparison; and

    wherein the comparing of the manual alignments to the hypothesized alignments comprises:

    comparing an updated weighting factor for each sub-model derived using the first corpus to randomly generated weighting factors; and

    selecting one of the updated weighting factor and the randomly generated weighting factor that generates a least amount of error.
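
The weight-selection step at the end of the claim can be illustrated with a minimal sketch. It is not the patented implementation: every function name is hypothetical, the sub-models are assumed to combine as a weighted sum of per-hypothesis feature values, the error measure is assumed to be 1 minus the F-measure of hypothesized links against manual links, and the random weights are assumed to be drawn uniformly from [-1, 1]. The sketch only shows how an updated weight vector might be compared with randomly generated ones and the lowest-error candidate kept.

```python
import random


def score(features, weights):
    """Weighted sum of sub-model feature values (assumed combination rule)."""
    return sum(w * f for w, f in zip(weights, features))


def best_alignment(nbest, weights):
    """Return the link set of the top-scoring hypothesis in an N-best list.

    nbest is a list of (links, features) pairs, where links is a set of
    (source_index, target_index) tuples.
    """
    return max(nbest, key=lambda hyp: score(hyp[1], weights))[0]


def alignment_error(hypothesis, manual):
    """Assumed error measure: 1 - F-measure of hypothesized vs. manual links."""
    if not hypothesis and not manual:
        return 0.0
    overlap = len(hypothesis & manual)
    if overlap == 0:
        return 1.0
    precision = overlap / len(hypothesis)
    recall = overlap / len(manual)
    return 1.0 - 2.0 * precision * recall / (precision + recall)


def corpus_error(weights, nbest_lists, manual_alignments):
    """Mean error of the top-ranked hypotheses over the annotated segments."""
    total = sum(
        alignment_error(best_alignment(nbest, weights), manual)
        for nbest, manual in zip(nbest_lists, manual_alignments)
    )
    return total / len(manual_alignments)


def select_weights(updated, nbest_lists, manual_alignments,
                   num_random=10, seed=0):
    """Keep whichever of the updated or random weight vectors has least error."""
    rng = random.Random(seed)
    candidates = [list(updated)]  # the updated vector is tried first
    for _ in range(num_random):
        candidates.append([rng.uniform(-1.0, 1.0) for _ in updated])
    return min(
        candidates,
        key=lambda w: corpus_error(w, nbest_lists, manual_alignments),
    )
```

Because the updated vector is always among the candidates, the selected weights never score worse on the annotated segments than the updated vector itself; the random candidates only give the search a chance to escape a poor update.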
