Building A Translation Lexicon From Comparable, Non-Parallel Corpora

  • US 20100042398A1
  • Filed: 10/08/2009
  • Published: 02/18/2010
  • Est. Priority Date: 03/26/2002
  • Status: Active Grant
First Claim

1. A method for building a translation lexicon from non-parallel corpora by a machine translation system, the method comprising:

  • identifying identically spelled words in a first corpus and a second corpus, the first corpus including words in a first language and the second corpus including words in a second language, wherein the first corpus and the second corpus are non-parallel and are accessed by the machine translation system;

  • generating a seed lexicon by the machine translation system, the seed lexicon including identically spelled words; and

  • expanding the seed lexicon by the machine translation system by identifying possible translations of words in the first and second corpora using one or more clues.
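
The three steps of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the `min_length` filter, the `threshold` value, and the sample sentences are all hypothetical, and spelling similarity (via Python's standard `difflib.SequenceMatcher`) is assumed here as the only clue, since the claim does not say which clues are used.

```python
from difflib import SequenceMatcher


def build_seed_lexicon(first_corpus, second_corpus, min_length=4):
    """Steps 1 and 2: collect identically spelled words as a seed lexicon.

    min_length is an assumed filter to skip short accidental matches.
    """
    first_vocab = {w.lower() for w in first_corpus}
    second_vocab = {w.lower() for w in second_corpus}
    shared = first_vocab & second_vocab
    return {w: w for w in shared if len(w) >= min_length}


def expand_lexicon(seed, first_corpus, second_corpus, threshold=0.8):
    """Step 3: expand the seed using one clue (assumed: similar spelling)."""
    lexicon = dict(seed)
    sources = sorted({w.lower() for w in first_corpus} - set(seed))
    targets = sorted({w.lower() for w in second_corpus} - set(seed.values()))
    if not targets:
        return lexicon
    for word in sources:
        # Score every remaining target word and keep the closest spelling.
        score, best = max(
            (SequenceMatcher(None, word, cand).ratio(), cand)
            for cand in targets
        )
        if score >= threshold:
            lexicon[word] = best
    return lexicon


# Two unrelated (non-parallel) toy sentences, tokenized on whitespace.
german = "das hotel hat ein restaurant und einen garten".split()
english = "the hotel has a restaurant and a garden".split()

seed = build_seed_lexicon(german, english)
lexicon = expand_lexicon(seed, german, english)
```

On this toy input the seed holds `hotel` and `restaurant`, and the expansion step adds the pair `garten` / `garden`. A fuller system would presumably combine several clues and work on much larger vocabularies; this sketch only shows the shape of the seed-then-expand loop.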

  • 2 Assignments