
Classifying documents using multiple classifiers

  • US 9,104,972 B1
  • Filed: 03/24/2014
  • Issued: 08/11/2015
  • Est. Priority Date: 03/13/2009
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method comprising:

  • computing multiple respective scores for each document in a collection of documents, one score from each of a plurality of D distinct classifiers, wherein each classifier computes a respective score representing a likelihood that the document has a property P, wherein each classifier has a respective lower threshold a_j, wherein documents having a score less than a_j are unlikely to have the property P, and wherein each classifier has a respective upper threshold b_j, wherein documents having a score greater than b_j are likely to have the property P;

    determining, for each respective classifier, a plurality of intervals between a_j and b_j for the classifier;

    determining, for each document in the collection of documents, a combination of intervals I_11 to I_DK according to which interval of the plurality of intervals each respective score for the document belongs;

    determining, for each combination of intervals I_j1 to I_jK, whether any documents in the collection of documents have a corresponding combination of intervals I_11 to I_DK;

    selecting no more than M documents for each combination of intervals I_j1 to I_jK for which at least one document in the collection has the corresponding combination of intervals; and

    training a multiple classifier model for the D distinct classifiers using each selected document.
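
The selection steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration, because the claim does not specify them: the intervals between each classifier's lower and upper threshold are taken to be K equal-width intervals, scores outside the thresholds are clamped into the first or last interval, the first M documents encountered are kept for each occupied combination, and the names `interval_index` and `select_training_documents` are hypothetical. The final training step is not shown; the selected documents would be handed to whatever training routine is used for the multiple classifier model.

```python
from collections import defaultdict


def interval_index(score, lower, upper, k):
    """Return the index (0..k-1) of the interval between lower and upper
    that contains score. Equal-width intervals and clamping of out-of-range
    scores are assumptions, not requirements of the claim."""
    if score <= lower:
        return 0
    if score >= upper:
        return k - 1
    width = (upper - lower) / k
    return min(int((score - lower) / width), k - 1)


def select_training_documents(scores, thresholds, k, m):
    """Select at most m documents per occupied combination of intervals.

    scores:     dict mapping a document id to its list of D classifier scores
    thresholds: list of D (lower, upper) threshold pairs, one per classifier
    """
    buckets = defaultdict(list)
    for doc_id, doc_scores in scores.items():
        # One interval index per classifier forms the document's combination.
        combo = tuple(
            interval_index(s, lower, upper, k)
            for s, (lower, upper) in zip(doc_scores, thresholds)
        )
        buckets[combo].append(doc_id)

    # Only combinations that hold at least one document appear in buckets.
    selected = []
    for combo in sorted(buckets):
        selected.extend(buckets[combo][:m])  # assumption: keep the first m
    return selected


# Hypothetical data: D = 2 classifiers, K = 3 intervals, M = 1 document.
example_thresholds = [(0.2, 0.8), (0.1, 0.9)]
example_scores = {
    "doc1": [0.25, 0.15],
    "doc2": [0.30, 0.20],
    "doc3": [0.75, 0.85],
    "doc4": [0.50, 0.50],
}
selected = select_training_documents(example_scores, example_thresholds, k=3, m=1)
```

In this example `doc1` and `doc2` fall into the same combination of intervals, so only one of them is kept when M is 1. Capping each combination in this way limits how much any densely populated region of the score space contributes to the training set.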
