
Text categorization based on co-classification learning from multilingual corpora

  • US 8,438,009 B2
  • Filed: 10/21/2010
  • Issued: 05/07/2013
  • Est. Priority Date: 10/22/2009
  • Status: Active Grant
First Claim

1. A method for enhancing a performance of a first classifier implemented on a computing device used for classifying a first subset of documents written in a first language, the method comprising:

    a) receiving, at the computing device, a second subset of documents written in a second language different than the first language, said second subset including substantially the same content as the first subset;

    b) running the first classifier over the first subset to generate a first classification;

    c) running a second classifier implemented on the computing device over the second subset to generate a second classification;

    d) reducing a training cost between the first and second classifications, including repeating steps b) and c) wherein each classifier updates its own classification in view of the classification generated by the other classifier until the training cost is set to a minimum;

    the reducing comprising applying at least one of a gradient based algorithm for reducing the training cost between classifications, and an analytical algorithm for finding an approximate solution that reduces classification losses to reduce the training cost between classifications; and

    e) outputting at least one of said first classification and said first classifier.
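
Steps b) through e) can be illustrated with a minimal sketch of the gradient-based alternative. The claim does not fix the classifier type or the exact form of the training cost, so everything specific here is an assumption made for illustration: two logistic classifiers, a cost equal to the sum of both log-losses plus a squared-disagreement penalty weighted by a hypothetical parameter `lam`, and the hypothetical names `predict`, `training_cost`, `gradient_step` and `co_train`. Each document pair is assumed to be already converted to numeric feature vectors, one per language.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, X):
    # Probability of the positive class for each row; the last weight is the bias.
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1]) for x in X]


def training_cost(p1, p2, y, lam):
    # Assumed cost: both log-losses plus a penalty when the two views disagree.
    eps = 1e-12
    total = 0.0
    for a, b, t in zip(p1, p2, y):
        total -= t * math.log(a + eps) + (1 - t) * math.log(1 - a + eps)
        total -= t * math.log(b + eps) + (1 - t) * math.log(1 - b + eps)
        total += lam * (a - b) ** 2
    return total / len(y)


def gradient_step(w, X, y, p_other, lam, lr):
    # One gradient step for one classifier, holding the other's output fixed.
    p = predict(w, X)
    grad = [0.0] * len(w)
    for x, pi, t, q in zip(X, p, y, p_other):
        # Log-loss term gives (p - t); disagreement term gives 2*lam*(p - q)*p*(1 - p).
        g = (pi - t) + 2.0 * lam * (pi - q) * pi * (1.0 - pi)
        for j, xj in enumerate(x):
            grad[j] += g * xj
        grad[-1] += g
    n = len(y)
    return [wj - lr * gj / n for wj, gj in zip(w, grad)]


def co_train(X1, X2, y, lam=1.0, lr=0.5, max_iter=500, tol=1e-6):
    # X1 and X2 hold features of the same documents in the first and second language.
    w1 = [0.0] * (len(X1[0]) + 1)
    w2 = [0.0] * (len(X2[0]) + 1)
    costs = []
    for _ in range(max_iter):
        p1 = predict(w1, X1)  # step b): first classification
        p2 = predict(w2, X2)  # step c): second classification
        costs.append(training_cost(p1, p2, y, lam))
        # Step d): stop once the cost no longer decreases noticeably.
        if len(costs) > 1 and abs(costs[-2] - costs[-1]) < tol:
            break
        # Each classifier updates in view of the other's current classification.
        w1 = gradient_step(w1, X1, y, p2, lam, lr)
        w2 = gradient_step(w2, X2, y, predict(w1, X1), lam, lr)
    return w1, w2, costs  # step e): the first classifier is w1
```

Calling `co_train(X1, X2, y)` on paired feature lists returns both weight vectors and the cost history; `predict(w1, X1)` then yields the first classification. The stopping tolerance stands in for "until the training cost is set to a minimum", which in practice means a local minimum of whatever cost is chosen.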
