Data classification methods using machine learning techniques

  • US 7,958,067 B2
  • Filed: 05/23/2007
  • Issued: 06/07/2011
  • Est. Priority Date: 07/12/2006
  • Status: Active Grant
First Claim

1. A method for classifying documents, comprising:

  • receiving at least one labeled seed document having a known confidence level of label assignment;

  • receiving unlabeled documents;

  • receiving at least one predetermined cost factor;

  • training a transductive classifier through iterative calculation using the at least one predetermined cost factor, the at least one seed document, and the unlabeled documents, wherein for each iteration of the calculations the cost factor is adjusted as a function of an expected label value;

  • after at least some of the iterations, storing confidence scores for the unlabeled documents; and

  • outputting identifiers of the unlabeled documents having the highest confidence scores to at least one of a user, another system, and another process.
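
The steps of the claim can be illustrated with a minimal Python sketch. The claim text does not name a particular classifier, so this sketch swaps in a cost-weighted logistic regression inside a simple self-training loop; it is an illustration under that assumption, not the patented method. The names `train_transductive`, `top_confident`, `cost_factor`, and the data layout are all hypothetical. The expected label of each unlabeled document is taken to be `tanh(score / 2)`, a value in [-1, 1], and the per-document cost is the predetermined cost factor scaled by the absolute expected label.

```python
import math


def _dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_transductive(seeds, unlabeled, cost_factor=1.0,
                       iterations=5, epochs=200, lr=0.1):
    """Illustrative self-training loop (an assumption, not the patent's algorithm).

    seeds:     list of (features, label in {-1.0, +1.0}, confidence in (0, 1])
    unlabeled: dict mapping document identifier -> features
    Returns the final expected labels and the confidence scores stored
    after each outer iteration.
    """
    dim = len(seeds[0][0])
    w, b = [0.0] * dim, 0.0
    expected = {doc_id: 0.0 for doc_id in unlabeled}  # expected labels in [-1, 1]
    history = []

    for _ in range(iterations):
        # Seed documents are weighted by their known label confidence.
        examples = [(x, y, cost_factor * conf) for x, y, conf in seeds]
        # Unlabeled documents get a cost adjusted by the expected label value.
        for doc_id, x in unlabeled.items():
            e = expected[doc_id]
            if e != 0.0:
                examples.append((x, 1.0 if e > 0 else -1.0, cost_factor * abs(e)))

        # Cost-weighted logistic regression fitted by batch gradient descent.
        for _ in range(epochs):
            grad_w, grad_b = [0.0] * dim, 0.0
            for x, y, cost in examples:
                g = -cost * y * _sigmoid(-y * (_dot(w, x) + b))
                grad_w = [gw + g * xi for gw, xi in zip(grad_w, x)]
                grad_b += g
            w = [wi - lr * gw for wi, gw in zip(w, grad_w)]
            b -= lr * grad_b

        # Re-estimate expected labels and store confidence scores.
        expected = {doc_id: math.tanh(0.5 * (_dot(w, x) + b))
                    for doc_id, x in unlabeled.items()}
        history.append({doc_id: abs(e) for doc_id, e in expected.items()})

    return expected, history


def top_confident(scores, k):
    # Identifiers of the k documents with the highest confidence scores.
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Calling `train_transductive(seeds, unlabeled)` and then `top_confident(history[-1], k)` would return the identifiers to pass on to a user or another process. Scaling the cost by the absolute expected label means documents the current model is unsure about contribute little to the next iteration, which is one plausible reading of "adjusted as a function of an expected label value".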

  • 8 Assignments