Method for recognizing spam email

US 7,475,118 B2
Filed: 02/03/2006
Issued: 01/06/2009
Est. Priority Date: 02/03/2006
Status: Expired due to Fees

First Claim

1. A computer-implemented method comprising:

creating a learned database for storing a plurality of email paths by;

training a starting set of sorted email messages comprising spam and non-spam messages;

storing a starting spam score for each IP address stored in the learned database, wherein spam scores indicate a likelihood that an email received is spam;

combining portions of the IP addresses as they are stored;

aggregating IP addresses, based on domain ownership;

updating the learned database by receiving votes from users receiving emails, wherein each vote indicates whether the user regards the email to be spam or non-spam;

after evaluating each address starting with the most recent, accumulating a weighted average, and giving more weight to exact database matches than to those that were obtained only from other nearby addresses;

receiving an email message comprising a plurality of packets, delivery-path information comprising an email message header comprising received lines, and at least one recipient for the email message;

analyzing the received lines in the email message header, comprising;

extracting from the received lines a list of IP addresses and mail domains through which the email purportedly passed;

comparing the IP addresses with the learned database of delivery paths comprising IP addresses along each delivery path;

determining a network path for the email using one or more elements of the delivery path information;

applying a credibility function to the network path followed by the email message, comprising;

considering each node in the network path separately;

determining a preliminary credibility for each node, comprising counting the frequency of messages of each classification that were previously sent by each node;

using that preliminary credibility, and the credibility of one or more other nodes in the path, to determine the credibility of that node by examining the nodes from most recent to earliest and assigning each node a credibility no better than that of the previously examined node;

wherein a node with insufficient history for an adequate count in the counting step is given low credibility;

applying a relationship function to the network path followed by the email message;

comparing the network path with a plurality of prior email paths;

determining a measure of similarity between the path of the email received and one or more of the plurality of prior email paths;

determining a spam score for the email message received, based on the measure of similarity;

detecting and eliminating fake information, and providing a score for the message as a whole; and

not forwarding the email message to the at least one recipient when the email message is determined to comprise spam.
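
The header-analysis and credibility steps recited in the claim can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the names `extract_path`, `preliminary_credibility`, and `path_credibility`, the constants `MIN_HISTORY` and `LOW_CREDIBILITY`, the use of the non-spam fraction as the preliminary credibility, and the bracketed-IPv4 pattern are all hypothetical choices made for the example.

```python
import re
from email import message_from_string

# Hypothetical constants chosen for illustration; the claim does not specify values.
MIN_HISTORY = 5        # fewer prior messages than this counts as "insufficient history"
LOW_CREDIBILITY = 0.1  # credibility given to a node with insufficient history

# Assumption: relays record the peer address as a bracketed IPv4 literal.
IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def extract_path(raw_message):
    """Return the IPv4 addresses found in Received lines, most recent hop first."""
    msg = message_from_string(raw_message)
    path = []
    # Received headers are prepended by each relay, so header order is
    # already "most recent to earliest".
    for line in msg.get_all("Received", []):
        match = IPV4.search(line)
        if match:
            path.append(match.group(1))
    return path


def preliminary_credibility(counts):
    """counts is (spam_count, non_spam_count) previously sent by a node."""
    spam, ham = counts
    total = spam + ham
    if total < MIN_HISTORY:
        return LOW_CREDIBILITY
    # Assumption: credibility is modeled as the fraction of non-spam messages.
    return ham / total


def path_credibility(path, history):
    """Examine nodes from most recent to earliest; no node may score
    better than the node examined just before it."""
    result = []
    cap = 1.0
    for ip in path:
        cred = min(preliminary_credibility(history.get(ip, (0, 0))), cap)
        result.append((ip, cred))
        cap = cred
    return result
```

Capping each node by the previously examined one reflects the idea that an earlier hop is only as trustworthy as the later hops that reported it, since Received lines added before an untrusted relay may have been forged.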

Abstract

A method includes steps of receiving an email message comprising a plurality of packets and delivery-path information; determining a path for the email using the delivery-path information; comparing the path with a plurality of prior email paths; determining a measure of similarity between the path of the email received and one or more of the plurality of prior email paths; and determining a spam score for the email received, based on the measure of similarity. Other embodiments include a computer readable medium comprising computer code for performing the above function and an information processing system including a processor configured (i.e., hard-wired or programmed) to perform the method.
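
The comparison of a received path against prior paths might be sketched as follows. This is one possible reading, not the method disclosed in the specification: the function names, the per-hop weights (1.0 for an exact address match, 0.5 for a match on the first three octets only), and the similarity-weighted average used as the spam score are assumptions made for the example.

```python
def prefix24(ip):
    """Return the first three octets of a dotted IPv4 address."""
    return ip.rsplit(".", 1)[0]


def hop_similarity(a, b):
    """Exact matches weigh more than matches on a nearby address."""
    if a == b:
        return 1.0
    if prefix24(a) == prefix24(b):
        return 0.5  # hypothetical weight for a "nearby" address
    return 0.0


def path_similarity(path, prior):
    """Compare two paths hop by hop, most recent hop first."""
    if not path or not prior:
        return 0.0
    total = sum(hop_similarity(a, b) for a, b in zip(path, prior))
    # Dividing by the longer length penalizes paths of different lengths.
    return total / max(len(path), len(prior))


def spam_score(path, prior_paths):
    """prior_paths is a list of (path, is_spam) pairs.

    Returns the similarity-weighted fraction of spam among prior paths,
    or None when no prior path resembles the received one.
    """
    weighted = 0.0
    weight_sum = 0.0
    for prior, is_spam in prior_paths:
        w = path_similarity(path, prior)
        weighted += w * (1.0 if is_spam else 0.0)
        weight_sum += w
    if weight_sum == 0:
        return None
    return weighted / weight_sum
```

A score near 1.0 means the received path mostly resembles paths previously labeled as spam; returning `None` when nothing matches leaves the decision to other evidence rather than guessing.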

Current Assignee: Trend Micro Inc.
Original Assignee: International Business Machines Corporation
Inventors: Rajan, Vadakkedathu Thomas; Wegman, Mark N.; Leiba, Barry; Ossher, Joel; Segal, Richard
Primary Examiner(s): Meky, Moustafa M
Application Number: US11/347,492
Publication Number: US 20070185960A1
Time in Patent Office: 1,068 Days
Field of Search: 709/200-206, 709/217-227
US Class Current: 709/206
CPC Class Codes: G06Q 10/107 Computer-aided management o...
