
TRANS-LINGUAL REPRESENTATION OF TEXT DOCUMENTS

  • US 20100324883A1
  • Filed: 06/19/2009
  • Published: 12/23/2010
  • Est. Priority Date: 06/19/2009
  • Status: Active Grant
First Claim

1. A method of creating a trans-lingual text representation comprising:

  • accepting first language data, wherein the first language data comprises a plurality of documents in a first language;

    accepting second language data, wherein the second language data comprises a plurality of documents in a second language, wherein each document in a second language is comparable to a corresponding document in the first language;

    creating a first document-term matrix from the first language data, comprising a plurality of rows, each of said rows corresponding to one of a plurality of documents in a first language;

    creating a second document-term matrix from the second language data, comprising a plurality of rows, each of said rows corresponding to one of a plurality of documents in a second language;

    applying an algorithm to the first matrix and the second matrix to produce a translingual text representation, wherein the translingual text representation comprises a plurality of vectors, each vector corresponding to either one row in the first document-term matrix or one row in the second document-term matrix, wherein the algorithm:

    minimizes the distance between pairs of translingual text representation vectors which correspond to a document in a first language and a document in a second language that is comparable to the document in the first language; and

    maximizes the distance between pairs of translingual text representation vectors which do not correspond to a document in a first language and a document in a second language that is comparable to the document in the first language.
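
The steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not name a specific algorithm, so everything here is an assumption made for illustration: the toy English/Spanish documents, the helper names `doc_term_matrix` and `fit_projection`, the regularizer `gamma`, and the choice of a generalized-eigenvector projection (in the spirit of oriented PCA). That projection favors directions where documents are spread apart overall, used here as a proxy for separating non-comparable pairs, while comparable pairs differ little.

```python
import numpy as np

def doc_term_matrix(docs, vocab):
    """Count matrix with one row per document and one column per term."""
    index = {term: j for j, term in enumerate(vocab)}
    m = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(docs):
        for tok in doc.split():
            m[i, index[tok]] += 1
    return m

def fit_projection(d1, d2, k=2, gamma=0.1):
    """Hypothetical helper: project both matrices into a shared k-dim space.

    Row i of d1 is assumed comparable to row i of d2.
    """
    n, v1 = d1.shape
    v2 = d2.shape[1]
    # Embed both languages in one joint vocabulary space.
    x1 = np.hstack([d1, np.zeros((n, v2))])
    x2 = np.hstack([np.zeros((n, v1)), d2])
    stacked = np.vstack([x1, x2])
    # "Signal": uncentered second-moment matrix of all documents.
    signal = stacked.T @ stacked / (2 * n)
    # "Noise": second moment of differences between comparable pairs,
    # regularized so that it is positive definite.
    diff = x1 - x2
    noise = diff.T @ diff / n + gamma * np.eye(v1 + v2)
    # Solve signal @ v = lam * noise @ v by Cholesky whitening.
    chol_inv = np.linalg.inv(np.linalg.cholesky(noise))
    lam, y = np.linalg.eigh(chol_inv @ signal @ chol_inv.T)
    top = np.argsort(lam)[::-1][:k]
    proj = chol_inv.T @ y[:, top]
    return x1 @ proj, x2 @ proj

# Toy comparable corpus (assumed data): document i in each list is a pair.
docs_en = ["dog dog cat", "dog cat cat", "car car road", "car road road"]
docs_es = ["perro perro gato", "perro gato gato",
           "coche coche camino", "coche camino camino"]
d_en = doc_term_matrix(docs_en, ["dog", "cat", "car", "road"])
d_es = doc_term_matrix(docs_es, ["perro", "gato", "coche", "camino"])

z_en, z_es = fit_projection(d_en, d_es)

# Pairwise distances between every English and every Spanish vector.
dists = np.linalg.norm(z_en[:, None, :] - z_es[None, :, :], axis=2)
paired = np.diag(dists)
unpaired = dists[~np.eye(len(docs_en), dtype=bool)]
print("mean paired distance:", paired.mean())
print("mean unpaired distance:", unpaired.mean())
```

On this toy data the comparable pairs land close together while the non-comparable pairs are, on average, farther apart. A real corpus would need term weighting, a far larger vocabulary, and a tuned `gamma`, none of which the claim text specifies.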
