Methods and systems for alignment of parallel text corpora

  • US 9,047,275 B2
  • Filed: 05/04/2012
  • Issued: 06/02/2015
  • Est. Priority Date: 10/10/2006
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method of aligning fragments of a first text in a first language with corresponding fragments of a second text, which is a translation of the first text into a second language, comprising:

    preliminarily dividing the first and second texts into fragments;

    generating a hypothesis about correspondence between at least first fragment in the first text and at least second fragment in the second text;

    determining estimations reflecting correspondence between the first and the second fragments, wherein each estimation is based at least on a ratio between:

    (a) a number of words in at least one of the first segment or the second segment; and

    (b) a number of words in the first fragment that have a corresponding translation in the second fragment according to a normalized one-to-one dictionary;

    determining a degree of correspondence between the first and the second fragments based on the estimations, including adjusting the estimations by using weight coefficients selected on the basis of heuristics or training; and

    comparing the degree of correspondence to a predetermined threshold.
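
The scoring steps recited in the claim can be illustrated with a minimal sketch. This is not the patented implementation. It is only one possible reading, under these assumptions: fragments are already split into word lists, lowercasing stands in for normalization, the one-to-one dictionary is a plain mapping from a source word to a single target word, and the estimations are combined as a weighted sum. The function names, the weight values, and the threshold of 0.5 are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def estimations(src: List[str], tgt: List[str],
                dictionary: Dict[str, str]) -> Tuple[float, float]:
    """Return two ratios: matched words over source length and over target length.

    A source word counts as matched when its single dictionary translation
    occurs in the target fragment. Lowercasing is an assumed stand-in for
    the normalization mentioned in the claim.
    """
    src_norm = [w.lower() for w in src]
    tgt_norm = {w.lower() for w in tgt}
    matched = sum(1 for w in src_norm if dictionary.get(w) in tgt_norm)
    est_src = matched / len(src) if src else 0.0
    est_tgt = matched / len(tgt) if tgt else 0.0
    return est_src, est_tgt


def degree_of_correspondence(src: List[str], tgt: List[str],
                             dictionary: Dict[str, str],
                             weights: Sequence[float] = (0.5, 0.5)) -> float:
    """Combine the estimations as a weighted sum (hypothetical weights)."""
    est_src, est_tgt = estimations(src, tgt, dictionary)
    return weights[0] * est_src + weights[1] * est_tgt


def accept_hypothesis(src: List[str], tgt: List[str],
                      dictionary: Dict[str, str],
                      threshold: float = 0.5) -> bool:
    """Accept the alignment hypothesis if the degree reaches the assumed threshold."""
    return degree_of_correspondence(src, tgt, dictionary) >= threshold


# Toy English-to-French dictionary, for illustration only.
toy_dict = {"the": "le", "cat": "chat", "sleeps": "dort"}
good = accept_hypothesis(["The", "cat", "sleeps"], ["Le", "chat", "dort"], toy_dict)
bad = accept_hypothesis(["The", "cat", "sleeps"],
                        ["Le", "chien", "mange", "vite"], toy_dict)
```

In this toy run, `good` is accepted because all three source words have their translation in the target fragment, while `bad` is rejected because only one word matches. In the claim, the weight coefficients are selected on the basis of heuristics or training; the fixed values here are placeholders.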
