Data classification using machine learning techniques

  • US 8,719,197 B2
  • Filed: 04/19/2011
  • Issued: 05/06/2014
  • Est. Priority Date: 07/12/2006
  • Status: Active Grant
First Claim

1. A system for classifying documents, comprising:

  • a memory; and

    a processor in communication with the memory, the processor being configured to process at least some instructions stored in the memory, wherein the memory stores computer executable program code comprising instructions for:

    receiving at least one labeled seed document having a known confidence level of label assignment;

    receiving unlabeled documents;

    receiving at least one predetermined cost factor;

    training a transductive classifier through iterative calculation using the at least one predetermined cost factor, the at least one seed document, and the unlabeled documents, wherein for each iteration of the calculations the cost factor is adjusted as a function of an expected label value;

    after at least some of the iterations, storing confidence scores for the unlabeled documents; and

    outputting identifiers of the unlabeled documents having the highest confidence scores to at least one of a user, another system, and another process.
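The steps recited in claim 1 can be illustrated with a small sketch. It is not the patented implementation: the claim does not specify a model, so a weighted logistic regression stands in for the transductive classifier, and the rule "cost = cost factor × |expected label|" is only one assumed way of adjusting the cost factor as a function of an expected label value. All names (`train_transductive`, `top_identifiers`, `inner_steps`, `lr`) are hypothetical.

```python
import math


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_transductive(seeds, unlabeled, cost_factor,
                       iterations=10, inner_steps=20, lr=0.5):
    """Illustrative stand-in for the claimed training loop.

    seeds:      list of (features, label in {-1, +1}, confidence in [0, 1])
    unlabeled:  dict mapping document identifier -> features
    Returns (weights, history), where history[i] holds the confidence
    scores stored after iteration i.
    """
    dim = len(seeds[0][0])
    w = [0.0] * dim
    expected = {doc_id: 0.0 for doc_id in unlabeled}  # expected label in [-1, 1]
    history = []
    for _ in range(iterations):
        # Seed documents are weighted by their known label confidence.
        examples = list(seeds)
        for doc_id, x in unlabeled.items():
            e = expected[doc_id]
            if e != 0.0:
                # Assumed adjustment: the cost of an unlabeled document
                # scales with the magnitude of its expected label value.
                examples.append((x, 1.0 if e > 0 else -1.0, cost_factor * abs(e)))
        # A few gradient steps on the cost-weighted logistic loss.
        for _ in range(inner_steps):
            grad = [0.0] * dim
            for x, y, cost in examples:
                g = -y * cost * _sigmoid(-y * _dot(w, x))
                for j in range(dim):
                    grad[j] += g * x[j]
            w = [wj - lr * gj for wj, gj in zip(w, grad)]
        # Re-estimate expected labels and store confidence scores.
        for doc_id, x in unlabeled.items():
            expected[doc_id] = 2.0 * _sigmoid(_dot(w, x)) - 1.0
        history.append({doc_id: abs(e) for doc_id, e in expected.items()})
    return w, history


def top_identifiers(scores, k):
    # Identifiers of the k unlabeled documents with the highest confidence.
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Calling `train_transductive` with a couple of seed documents and a dictionary of unlabeled feature vectors, then passing the last entry of `history` to `top_identifiers`, yields the identifiers that would be output to a user, another system, or another process.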

  • 9 Assignments