Detecting spam email using multiple spam classifiers

  • US 7,882,192 B2
  • Filed: 08/14/2009
  • Issued: 02/01/2011
  • Est. Priority Date: 01/04/2005
  • Status: Expired due to Fees
First Claim

1. A method of detecting whether a first e-mail is undesirable, the method comprising:

    inputting the first e-mail to each of a plurality of constituent spam classifiers;

    obtaining at least one score from each of the plurality of constituent spam classifiers indicating the degree to which the first e-mail is deemed spam;

    obtaining a combined spam score from a combined spam classifier that takes as input the at least one score from the plurality of constituent spam classifiers, the combined spam classifier being computed automatically in accordance with a false-positive vs. false-negative tradeoff; and

    identifying the first e-mail as an undesirable e-mail if the combined spam score indicates that the first e-mail is undesirable;

    wherein the step of computing the combined spam classifier comprises:

    compiling a labeled e-mail corpus consisting of a plurality of e-mails that have been labeled according to the degree to which the plurality of e-mails are deemed to be spam;

    computing scores of the plurality of constituent spam classifiers on each e-mail in the labeled e-mail corpus;

    establishing a set of one or more sample false-positive vs. false-negative tradeoffs;

    analyzing, for each sample false-positive vs. false-negative tradeoff, the computed scores of the plurality of constituent spam classifiers on each e-mail in the labeled e-mail corpus to compute a set of combined spam classifiers, each of which best achieves a corresponding sample false-positive vs. false-negative tradeoff;

    selecting a false-positive vs. false-negative tradeoff; and

    computing, from the false-positive vs. false-negative tradeoff, a set of sample false-positive vs. false-negative tradeoffs, and a set of corresponding best combined classifiers, a best combined classifier for the false-positive vs. false-negative tradeoff, and wherein the false-positive vs. false-negative tradeoffs are specified by penalty functions, and the combined spam classifier associated with a given penalty function is computed by an optimization procedure that yields the combined spam classifier for which the value of the given penalty function is minimal on the labeled e-mail corpus.
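
The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation, and everything specific in it is an assumption made for illustration: the combined classifier is taken to be a weighted sum of constituent scores compared against a threshold, the penalty function is a linear cost on false positives and false negatives, the optimization procedure is an exhaustive grid search, the final selection step is reduced to picking the nearest sample tradeoff, and the labeled corpus is made-up toy data. All function and variable names are hypothetical.

```python
from itertools import product

def penalty(fp, fn, fp_cost, fn_cost):
    # Assumed penalty function: linear cost on the two error counts.
    return fp_cost * fp + fn_cost * fn

def combined_score(weights, scores):
    # Assumed combiner: weighted sum of the constituent classifier scores.
    return sum(w * s for w, s in zip(weights, scores))

def is_undesirable(scores, weights, threshold):
    # An e-mail is flagged when its combined score reaches the threshold.
    return combined_score(weights, scores) >= threshold

def count_errors(weights, threshold, corpus):
    fp = fn = 0
    for scores, is_spam in corpus:
        flagged = is_undesirable(scores, weights, threshold)
        if flagged and not is_spam:
            fp += 1          # legitimate mail flagged as spam
        elif not flagged and is_spam:
            fn += 1          # spam that slipped through
    return fp, fn

def fit_combined_classifier(corpus, fp_cost, fn_cost, steps=10):
    # Grid search for the (weights, threshold) pair with minimal penalty
    # on the labeled corpus. Returns (penalty, weights, threshold).
    n = len(corpus[0][0])
    grid = [i / steps for i in range(steps + 1)]
    best = None
    for raw in product(grid, repeat=n):
        total = sum(raw)
        if total == 0:
            continue
        weights = tuple(w / total for w in raw)  # normalize to sum to 1
        for threshold in grid:
            fp, fn = count_errors(weights, threshold, corpus)
            p = penalty(fp, fn, fp_cost, fn_cost)
            if best is None or p < best[0]:
                best = (p, weights, threshold)
    return best

def select_classifier(table, fp_cost, fn_cost):
    # Simplification: pick the sample tradeoff with the closest cost ratio.
    ratio = fp_cost / fn_cost
    key = min(table, key=lambda s: abs(s[0] / s[1] - ratio))
    return table[key]

# Toy labeled corpus (fabricated for illustration): each entry holds the
# scores of two constituent classifiers and a spam label.
corpus = [
    ((0.9, 0.8), True),
    ((0.7, 0.4), True),
    ((0.6, 0.9), True),
    ((0.2, 0.1), False),
    ((0.4, 0.3), False),
    ((0.1, 0.5), False),
]

# Sample tradeoffs as (false-positive cost, false-negative cost) pairs.
sample_tradeoffs = [(1, 1), (5, 1), (20, 1)]
table = {s: fit_combined_classifier(corpus, *s) for s in sample_tradeoffs}

# Select a tradeoff and classify a new e-mail's constituent scores.
_, weights, threshold = select_classifier(table, fp_cost=4, fn_cost=1)
print(is_undesirable((0.8, 0.7), weights, threshold))
```

In this sketch, raising the false-positive cost relative to the false-negative cost pushes the search toward classifiers that rarely flag legitimate mail, which is one way to read the claim's "false-positive vs. false-negative tradeoff". A real system would likely replace the grid search with a proper optimizer and interpolate between sample tradeoffs instead of choosing the nearest one.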
