Method of computer-based automatic extraction of translation pairs of words from a bilingual text

  • US 5,907,821 A
  • Filed: 11/04/1996
  • Issued: 05/25/1999
  • Est. Priority Date: 11/06/1995
  • Status: Expired due to Fees
First Claim

1. A method for automatic extraction of a translation pair of words which comprises a word of a first language and a word of a second language corresponding thereto, comprising steps executed by a computer, the steps including:

    extracting a plurality of words of the first language occurring in a first text described in the first language, from first text data which represents said first text;

    extracting, in correspondence to each occurrent word of the first language, a set of a plurality of co-occurrent words of the first language for said each occurrent word of the first language, each co-occurrent word of the first language being a word which occurs in a neighborhood of at least one of a plurality of positions within said first text where said each occurrent word of the first language occurs, and which fulfills at the same time a first predetermined condition related to said each occurrent word of the first language, said extracting being done from said first text data;

    extracting a plurality of words of the second language occurring in a second text corresponding to the first text and described in the second language, from second text data which represents said second text;

    extracting, in correspondence to each occurrent word of the second language, a set of a plurality of co-occurrent words of the second language for said each occurrent word of the second language, each co-occurrent word of the second language being a word which occurs in a neighborhood of at least one of a plurality of positions within said second text where said each occurrent word of the second language occurs, and which fulfills at the same time a second predetermined condition related to said each occurrent word of the second language, said extracting being done from said second text data;

    calculating a correlation between each occurrent word of the first language and each occurrent word of the second language, said calculating being done based on said set of a plurality of co-occurrent words of the first language extracted in correspondence to said each occurrent word of the first language and said set of a plurality of co-occurrent words of the second language extracted in correspondence to said each occurrent word of the second language; and

    selecting, as a translation pair of words, at least one pair of words from a plurality of pairs of words, each pair of words comprising one of said plurality of occurrent words of the first language and one of said plurality of occurrent words of the second language, said at least one pair of words being a pair of words between which a correlation satisfies a predetermined condition related to a translation pair of words, said selecting being done based upon a plurality of pairs of words.
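
The steps recited in claim 1 can be illustrated with a minimal Python sketch. The claim does not say how the neighborhood, the "predetermined conditions", or the correlation are defined, so everything concrete here is an assumption made for illustration: a fixed token window for the neighborhood, a minimum co-occurrence count as the predetermined condition, a small hypothetical seed dictionary (`seed`) to make co-occurrent words of the two languages comparable, Jaccard overlap as the correlation, and a fixed threshold as the selection condition. All function and parameter names are hypothetical.

```python
from collections import Counter, defaultdict


def cooccurrence_sets(tokens, window=2, min_count=1):
    """For each word, collect the words seen within `window` positions of it.

    `min_count` is an assumed stand-in for the claim's "predetermined
    condition": a neighbor is kept only if it co-occurs at least that often.
    """
    counts = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] != word:
                counts[word][tokens[j]] += 1
    return {
        word: {nb for nb, n in ctr.items() if n >= min_count}
        for word, ctr in counts.items()
    }


def correlation(co1, co2, seed):
    """Assumed correlation: Jaccard overlap between the second-language
    co-occurrence set and the first-language set mapped through `seed`."""
    mapped = {seed[w] for w in co1 if w in seed}
    union = mapped | co2
    return len(mapped & co2) / len(union) if union else 0.0


def extract_pairs(tokens1, tokens2, seed, window=2, min_count=1, threshold=0.6):
    """Score every (first-language, second-language) word pair and keep
    those whose correlation meets the assumed threshold, best first."""
    sets1 = cooccurrence_sets(tokens1, window, min_count)
    sets2 = cooccurrence_sets(tokens2, window, min_count)
    pairs = []
    for w1, co1 in sets1.items():
        for w2, co2 in sets2.items():
            score = correlation(co1, co2, seed)
            if score >= threshold:
                pairs.append((w1, w2, score))
    return sorted(pairs, key=lambda p: -p[2])
```

Calling `extract_pairs` with two tokenized, mutually corresponding texts and a seed dictionary returns `(word1, word2, score)` tuples. Scoring every pair is quadratic in the vocabulary sizes, which is acceptable for a sketch but would need pruning on a real corpus.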
