
Systems and methods for providing a spam database and identifying spam communications

  • US 9,407,463 B2
  • Filed: 07/11/2011
  • Issued: 08/02/2016
  • Est. Priority Date: 07/11/2011
  • Status: Active Grant
First Claim

1. A computer-implemented system of identifying an incoming e-mail as a spam e-mail, the system comprising:

  • at least one processor;

    a spam database which stores a plurality of known spam e-mails;

    a hardware server which performs offline processing, the offline processing comprising:

    accessing a known spam e-mail from the spam database;

    creating a first set of tokens from the known spam e-mail;

    calculating a first total as a number of tokens in the first set of tokens;

    storing the first set of tokens and the first total;

    storing a third count for each known spam e-mail stored in the spam database, wherein the third count represents a number of times the incoming e-mail was identified as spam based on an easy signature computed using the first set of tokens and the first total corresponding to the known spam e-mail;

    computing an average count between the first count and a third count based on a minimum of the first count and the third count, the third count being a count of the unique token in the third set of tokens for a predetermined time period; and

    removing the known spam e-mail from the spam database when the average count is less than a predetermined threshold;

    a client which performs online processing, the online processing comprising:

    receiving the incoming e-mail;

    creating a second set of tokens from the incoming e-mail;

    calculating a second total as a number of tokens in the second set of tokens;

    accessing the first set of tokens and the first total corresponding to one of the plurality of known spam e-mails in the spam database;

    determining a number of common tokens based on a minimum of a first count and a second count, the first count being a count of each unique token in the first set of tokens and the second count being a count of each unique token in the second set of tokens;

    computing an easy signature as a ratio of the number of common tokens and the sum of the first total and the second total; and

    designating the incoming e-mail as spam when the easy signature exceeds a predetermined threshold; and

    wherein when the easy signature does not exceed the predetermined threshold, the server determines whether there are additional known spam e-mails in the spam database; and

    the client designates the incoming e-mail as not spam when there are no additional known spam e-mails in the spam database.
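
For illustration only, the online comparison recited in the claim can be sketched in Python. This is one literal reading of the claim language, not the patentee's implementation: the tokenizer, the function names (`tokenize`, `precompute`, `easy_signature`, `is_spam`), and the threshold value of 0.4 are all assumptions that do not appear in the claim. The sketch takes the "number of common tokens" to be the sum, over each unique token, of the minimum of its counts in the two e-mails, and the "easy signature" to be that number divided by the sum of the two token totals.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphanumeric word tokens. The claim does not
    # specify how tokens are created.
    return re.findall(r"[a-z0-9']+", text.lower())


def precompute(known_spam):
    # Offline step: for each known spam e-mail, store its per-token counts
    # (the "first set of tokens") and the number of tokens (the "first total").
    index = []
    for body in known_spam:
        tokens = tokenize(body)
        index.append((Counter(tokens), len(tokens)))
    return index


def easy_signature(first_counts, first_total, second_counts, second_total):
    # Common tokens: for each unique token in the known spam e-mail, take
    # the minimum of its count there and its count in the incoming e-mail.
    common = sum(min(count, second_counts[token])
                 for token, count in first_counts.items())
    denominator = first_total + second_total
    # Ratio of common tokens to the sum of both totals; guard empty input.
    return common / denominator if denominator else 0.0


def is_spam(incoming, index, threshold=0.4):
    # Online step: tokenize the incoming e-mail, then compare it against
    # each known spam e-mail until one exceeds the (hypothetical) threshold.
    tokens = tokenize(incoming)
    second_counts, second_total = Counter(tokens), len(tokens)
    for first_counts, first_total in index:
        score = easy_signature(first_counts, first_total,
                               second_counts, second_total)
        if score > threshold:
            return True
    # No remaining known spam e-mails to compare against: not spam.
    return False
```

Under this reading, two identical e-mails score exactly 0.5, because every token is common while the denominator counts both totals, so a workable threshold would have to lie below 0.5. The offline pruning of rarely matched spam e-mails is left out of the sketch.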
