Learning framework for online applications

  • US 20090187987A1
  • Filed: 01/23/2008
  • Published: 07/23/2009
  • Est. Priority Date: 01/23/2008
  • Status: Active Grant
First Claim

1. A computer implemented method for detecting spam messages, comprising:

    determining a first stage probability of whether a received message is a spam message, wherein the first stage probability is determined by evaluating the received message in relation to a subset of test messages, wherein each subset test message in the subset of test messages was previously identified as either valid or spam;

    receiving an indication that a first stage classifier is unsure, based on the first stage probability, as to whether the received message is a spam message;

    determining that the first stage probability is greater than a lower limit for combining probabilities and is less than an upper limit for combining probabilities, wherein the lower limit for combining probabilities indicates a probability value below which the first stage probability will not be combined with a second stage probability to determine whether the received message is a spam message, and wherein the upper limit for combining probabilities indicates a probability value above which the received message is marked as a spam message without combining the first stage probability with the second stage probability;

    determining a second stage probability of whether the received message is a spam message, wherein the second stage probability is determined by evaluating the received message in relation to a subset-specific master set of test messages, which includes the subset of test messages, wherein each subset-specific master set test message in the subset-specific master set of test messages was previously identified as either valid or spam;

    computing a combined probability based on the first stage probability and the second stage probability;

    determining that the combined probability is greater than a threshold probability at which a threshold classification ratio is highest, wherein the classification ratio comprises a ratio of correctly identified spam messages over incorrectly identified spam messages.
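The steps recited in the claim can be read as a two-stage cascade. The following Python sketch is illustrative only and is not the patented implementation. Several details are assumptions that the claim does not specify: the names `classify_message` and `best_threshold` are hypothetical, the two classifiers are passed in as plain functions returning a probability, the combination rule is a simple average, a first stage probability at or below the lower limit is treated as a valid message, and a classification ratio with zero incorrectly identified spam messages is treated as infinite.

```python
from typing import Callable, Iterable, Sequence, Tuple

Classifier = Callable[[str], float]  # returns P(spam) in [0, 1]


def average(p1: float, p2: float) -> float:
    # Assumed combination rule; the claim only says "based on" both values.
    return (p1 + p2) / 2.0


def classify_message(
    message: str,
    first_stage: Classifier,   # trained on the subset of test messages
    second_stage: Classifier,  # trained on the subset-specific master set
    lower: float,              # lower limit for combining probabilities
    upper: float,              # upper limit for combining probabilities
    threshold: float,          # threshold applied to the combined value
    combine: Callable[[float, float], float] = average,
) -> bool:
    """Return True if the message is marked as spam."""
    p1 = first_stage(message)
    if lower < p1 < upper:
        # First stage is unsure: consult the second stage and combine.
        p2 = second_stage(message)
        return combine(p1, p2) > threshold
    if p1 >= upper:
        # Marked as spam without combining (boundary handling is assumed).
        return True
    # At or below the lower limit: not combined; assumed to be valid.
    return False


def best_threshold(
    scored: Sequence[Tuple[float, bool]],  # (combined probability, is_spam)
    candidates: Iterable[float],
) -> float:
    """Pick the candidate threshold with the highest classification ratio,
    i.e. correctly identified spam over incorrectly identified spam."""

    def ratio(t: float) -> float:
        correct = sum(1 for p, is_spam in scored if p > t and is_spam)
        incorrect = sum(1 for p, is_spam in scored if p > t and not is_spam)
        if incorrect == 0:
            # Assumed convention to avoid division by zero.
            return float("inf") if correct else 0.0
        return correct / incorrect

    return max(candidates, key=ratio)
```

In this sketch the second stage classifier is only invoked when the first stage probability falls strictly between the two limits, which mirrors the claim's condition for combining the probabilities. The threshold would typically be chosen once on labeled messages with `best_threshold` and then passed to `classify_message`.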

  • 9 Assignments