Method for recognizing spam email

  • US 7,475,118 B2
  • Filed: 02/03/2006
  • Issued: 01/06/2009
  • Est. Priority Date: 02/03/2006
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method comprising:

  • creating a learned database for storing a plurality of email paths by;

    training a starting set of sorted email messages comprising spam and non-spam messages;

    storing a starting spam score for each IP address stored in the learned database, wherein spam scores indicate a likelihood that an email received is spam;

    combining portions of the IP addresses as they are stored;

    aggregating IP addresses, based on domain ownership;

    updating the learned database by receiving votes from users receiving emails, wherein each vote indicates whether the user regards the email to be spam or non-spam;

    after evaluating each address starting with the most recent, accumulating a weighted average, and giving more weight to exact database matches than to those that were obtained only from other nearby addresses;

    receiving an email message comprising a plurality of packets, delivery-path information comprising an email message header comprising received lines, and at least one recipient for the email message;

    analyzing the received lines in the email message header, comprising;

    extracting from the received lines a list of IP addresses and mail domains through which the email purportedly passed;

    comparing the IP addresses with the learned database of delivery paths comprising IP addresses along each delivery path;

    determining a network path for the email using one or more elements of the delivery path information;

    applying a credibility function to the network path followed by the email message, comprising;

    considering each node in the network path separately;

    determining a preliminary credibility for each node, comprising counting the frequency of messages of each classification that were previously sent by each node;

    using that preliminary credibility, and the credibility of one or more other nodes in the path, to determine the credibility of that node by examining the nodes from most recent to earliest and assigning each node a credibility no better than that of the previously examined node;

    wherein a node with insufficient history for an adequate count in the counting step is given low credibility;

    applying a relationship function to the network path followed by the email message;

    comparing the network path with a plurality of prior email paths;

    determining a measure of similarity between the path of the email received and one or more of the plurality of prior email paths;

    determining a spam score for the email message received, based on the measure of similarity;

    detecting and eliminating fake information, and providing a score for the message as a whole; and

    not forwarding the email message to the at least one recipient when the email message is determined to comprise spam.
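
As an informal illustration only, three of the steps recited above (extracting IP addresses from the received lines, assigning each node a credibility no better than that of the previously examined node, and accumulating a weighted average that favors exact database matches) might be sketched as follows. This is not the patentee's implementation. All function names, the `MIN_COUNT` and `LOW_CREDIBILITY` values, the use of the non-spam fraction as the preliminary credibility, the treatment of a shared /24 prefix as a "nearby" address, and the 1.0 versus 0.5 weights are assumptions chosen for the example.

```python
import re
from email import message_from_string

# Assumed convention: the relay IP appears in square brackets in each Received line.
IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

MIN_COUNT = 5          # assumed minimum history for an "adequate count"
LOW_CREDIBILITY = 0.1  # assumed value given to nodes with insufficient history


def extract_path(raw_message):
    """Return the IPs found in the Received lines, most recent hop first."""
    msg = message_from_string(raw_message)
    path = []
    for line in msg.get_all("Received", []):
        match = IPV4_IN_BRACKETS.search(line)
        if match:
            path.append(match.group(1))
    return path


def preliminary_credibility(spam_count, ham_count):
    """Credibility from message counts; low when history is insufficient."""
    total = spam_count + ham_count
    if total < MIN_COUNT:
        return LOW_CREDIBILITY
    return ham_count / total


def path_credibility(path, history):
    """Walk the path from most recent to earliest, capping each node's
    credibility at that of the previously examined node."""
    capped = []
    ceiling = 1.0
    for ip in path:
        spam_count, ham_count = history.get(ip, (0, 0))
        cred = min(preliminary_credibility(spam_count, ham_count), ceiling)
        capped.append(cred)
        ceiling = cred
    return capped


def lookup(ip, db):
    """Return (spam_score, weight): exact matches weigh more than nearby ones."""
    if ip in db:
        return db[ip], 1.0
    prefix = ip.rsplit(".", 1)[0]  # assumed notion of "nearby": same /24
    nearby = [s for addr, s in db.items() if addr.rsplit(".", 1)[0] == prefix]
    if nearby:
        return sum(nearby) / len(nearby), 0.5
    return None


def message_score(path, db):
    """Weighted average spam score over the path, or None if nothing matched."""
    weighted_sum = 0.0
    total_weight = 0.0
    for ip in path:  # most recent hop first
        hit = lookup(ip, db)
        if hit is not None:
            score, weight = hit
            weighted_sum += score * weight
            total_weight += weight
    return weighted_sum / total_weight if total_weight else None
```

For example, with a history of `{"192.0.2.10": (1, 9), "198.51.100.7": (0, 20)}` (spam count, non-spam count), `path_credibility(["192.0.2.10", "198.51.100.7"], history)` yields `[0.9, 0.9]`: the earlier hop's preliminary credibility of 1.0 is capped at the 0.9 of the more recent hop, which is the only hop the receiving server observed directly.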
