METHODS FOR AUTOMATIC GENERATION OF PARALLEL CORPORA

  • US 20150248401A1
  • Filed: 12/31/2014
  • Published: 09/03/2015
  • Est. Priority Date: 02/28/2014
  • Status: Active Grant
First Claim

1. A computer implemented method comprising:

    receiving sets of item listings in a first language and sets of item listings in a second language, each of the item listings in the sets of item listings comprising one or more descriptions and metadata;

    collecting the metadata from the sets of item listings and aligning the sets of item listings using the metadata;

    mapping the aligned sets of item listings from the first language to the second language for each of the sets of item listings;

    fetching the descriptions of the mapped aligned sets of item listings and measuring the structural similarity of the fetched descriptions of the mapped aligned sets of item listings to assess whether mapped aligned sets of item listings are likely to be translations of each other, and

    for pairs of mapped aligned sets of item listings having structurally similar descriptions, forming the descriptions of the mapped aligned sets of item listings into respective sentences in the first language and in the second language.
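
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and the claim does not specify any of the details used here. The `Listing` class, the `product_id` metadata key, the regex sentence splitter, and the similarity test (equal sentence counts plus a bounded length ratio) are all assumptions made for illustration. Alignment and mapping are collapsed into one metadata lookup for brevity.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Listing:
    """Hypothetical item listing: a description plus free-form metadata."""
    language: str
    description: str
    metadata: dict = field(default_factory=dict)


def align_by_metadata(first, second, key="product_id"):
    """Pair listings from the two languages that share a metadata value.

    The key name is an assumption; the claim only says metadata is used.
    """
    index = {l.metadata[key]: l for l in second if key in l.metadata}
    return [(a, index[a.metadata[key]])
            for a in first
            if a.metadata.get(key) in index]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def structurally_similar(desc_a, desc_b, max_length_ratio=2.0):
    """Assumed heuristic: same sentence count and comparable sentence lengths."""
    sents_a, sents_b = split_sentences(desc_a), split_sentences(desc_b)
    if not sents_a or len(sents_a) != len(sents_b):
        return False
    for x, y in zip(sents_a, sents_b):
        longer, shorter = max(len(x), len(y)), min(len(x), len(y))
        if longer > max_length_ratio * shorter:
            return False
    return True


def build_parallel_sentences(first, second, key="product_id"):
    """Return (first-language, second-language) sentence pairs."""
    pairs = []
    for a, b in align_by_metadata(first, second, key):
        if structurally_similar(a.description, b.description):
            pairs.extend(zip(split_sentences(a.description),
                             split_sentences(b.description)))
    return pairs
```

In this sketch, listing pairs whose descriptions differ in sentence count are discarded outright rather than partially aligned. That choice favors precision over the size of the resulting corpus, and a real system would likely use a richer similarity measure.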
