Building a translation lexicon from comparable, non-parallel corpora

  • US 8,234,106 B2
  • Filed: 10/08/2009
  • Issued: 07/31/2012
  • Est. Priority Date: 03/26/2002
  • Status: Expired due to Term
First Claim

1. A method for building a translation lexicon from non-parallel corpora by a machine translation system, the method comprising:

  • identifying identically spelled words in a first corpus and a second corpus, the first corpus including words in a first language and the second corpus including words in a second language, wherein the first corpus and the second corpus are non-parallel and are accessed by the machine translation system;

  • generating a seed lexicon by the machine translation system, the seed lexicon including identically spelled words; and

  • expanding the seed lexicon by the machine translation system by identifying possible translations of words in the first and second corpora using one or more clues.
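
The three steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The claim does not name a specific clue, so the sketch assumes spelling similarity as the single clue and scores it with Python's standard `difflib.SequenceMatcher`. The function names, the whitespace tokenization, and the 0.8 threshold are hypothetical choices made for illustration only.

```python
from difflib import SequenceMatcher


def vocabulary(corpus):
    """Lowercased set of whitespace-separated tokens (a simplifying assumption)."""
    return {word.lower() for word in corpus.split()}


def build_seed_lexicon(corpus_a, corpus_b):
    """Steps 1 and 2: pair words that are spelled identically in both corpora."""
    return {word: word for word in vocabulary(corpus_a) & vocabulary(corpus_b)}


def expand_lexicon(seed, corpus_a, corpus_b, threshold=0.8):
    """Step 3: add candidate translations using one clue.

    The clue here is spelling similarity, which is an assumption of this
    sketch; the claim only requires "one or more clues".
    """
    lexicon = dict(seed)
    remaining_a = sorted(vocabulary(corpus_a) - set(seed))
    remaining_b = sorted(vocabulary(corpus_b) - set(seed.values()))
    for word_a in remaining_a:
        best_word, best_score = None, threshold
        for word_b in remaining_b:
            score = SequenceMatcher(None, word_a, word_b).ratio()
            if score >= best_score:
                best_word, best_score = word_b, score
        if best_word is not None:
            lexicon[word_a] = best_word
    return lexicon


# Two small non-parallel texts: they share a topic but are not translations.
english = "the president visited the hotel near the station"
german = "der präsident sprach im hotel und an der station"

seed = build_seed_lexicon(english, german)
lexicon = expand_lexicon(seed, english, german)
```

With these inputs the seed lexicon holds the identically spelled words `hotel` and `station`, and the expansion step adds `president` paired with `präsident` because their spelling similarity exceeds the assumed threshold. A fuller system would combine several clues and would need to guard against identically spelled words that differ in meaning.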
