
Identifying undesired email messages having attachments

  • US 20050091321A1
  • Filed: 10/14/2003
  • Published: 04/28/2005
  • Est. Priority Date: 10/14/2003
  • Status: Active Grant
First Claim

1. A method comprising the steps of:

  • (A) receiving an email message from a simple mail transfer protocol (SMTP) server, the email message comprising:

    (A1) a 32-bit string indicative of the length of the email message;

    (A2) a text body;

    (A3) an SMTP email address;

    (A4) a domain name corresponding to the SMTP email address;

    (A5) an attachment;

    (B) tokenizing the text body to generate tokens representative of words in the text;

    (C) tokenizing the SMTP email address to generate a token representative of the SMTP email address;

    (D) tokenizing the domain name to generate a token that is representative of the domain name;

    (E) tokenizing the attachment to generate a token that is representative of the attachment, the tokenizing step comprising the steps of:

    (E1) generating a 128-bit MD5 hash of the attachment;

    (E2) appending the 32-bit string to the generated MD5 hash to produce a 160-bit number; and

    (E3) UUencoding the 160-bit number to generate the token representative of the attachment;

    (F) determining a probability value for each of the generated tokens;

    (G) selecting a predefined number of interesting tokens, the interesting tokens being the generated tokens having the greatest non-neutral probability values;

    (H) performing a Bayesian analysis on the selected interesting tokens to generate a spam probability; and

    (I) categorizing the email message as a function of the generated spam probability.
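
Steps (E1) through (E3) and (F) through (I) can be illustrated with a minimal Python sketch. It is not taken from the patent, and several details are assumptions the claim does not specify: the 32-bit length is packed big-endian, "non-neutral" is read as distance from a neutral value of 0.5, the Bayesian analysis uses the common naive-Bayes combination prod(p) / (prod(p) + prod(1 - p)), and the 0.9 threshold is arbitrary. All function names (attachment_token, select_interesting, spam_probability, categorize) are hypothetical.

```python
import binascii
import hashlib
import math
import struct


def attachment_token(attachment: bytes, message_length: int) -> str:
    """Steps (E1)-(E3): MD5 hash plus 32-bit length, UUencoded."""
    digest = hashlib.md5(attachment).digest()          # (E1) 128 bits = 16 bytes
    # Assumption: the 32-bit length string is an unsigned big-endian integer.
    length_field = struct.pack(">I", message_length)   # 32 bits = 4 bytes
    combined = digest + length_field                   # (E2) 160 bits = 20 bytes
    # (E3) b2a_uu emits one uuencoded line; drop only the trailing newline,
    # because a space is a valid uuencode character.
    return binascii.b2a_uu(combined).rstrip(b"\n").decode("ascii")


def select_interesting(probs: dict[str, float], count: int,
                       neutral: float = 0.5) -> list[float]:
    """Step (G): keep the `count` probabilities farthest from neutral.

    Assumption: "greatest non-neutral" means largest distance from 0.5.
    """
    ranked = sorted(probs.values(), key=lambda p: abs(p - neutral), reverse=True)
    return ranked[:count]


def spam_probability(interesting: list[float]) -> float:
    """Step (H): combine token probabilities under a naive-Bayes assumption."""
    spam = math.prod(interesting)
    ham = math.prod(1.0 - p for p in interesting)
    if spam + ham == 0.0:
        return 0.5  # contradictory certain evidence; fall back to neutral
    return spam / (spam + ham)


def categorize(probability: float, threshold: float = 0.9) -> str:
    """Step (I): classify using a hypothetical fixed threshold."""
    return "spam" if probability >= threshold else "not spam"
```

In this sketch the attachment token is treated like any other token: its probability from step (F) would be looked up in the same table as the word, address, and domain tokens before step (G) selects the most interesting ones.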
