Group Based Spam Classification
Abstract
An e-mail filter is used to classify received e-mails so that some of the classes may be filtered, blocked, or marked. The e-mail filter may include a classifier that can classify an e-mail as belonging to a particular class and an e-mail grouper that can detect substantially similar, but possibly not identical, e-mails. The e-mail grouper determines groups of substantially similar e-mails in an incoming e-mail stream. For each group, the classifier determines whether one or more test e-mails from the group belongs to the particular class. The classifier then designates the class to which the other e-mails in the group belong based on the results for the test e-mails.
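The grouping step described in the abstract can be sketched as follows. The abstract does not say how substantially similar e-mails are detected, so the word-shingle and Jaccard-similarity approach here is an assumed stand-in, and the names `shingles`, `jaccard`, `group_emails`, and the `threshold` value are hypothetical.

```python
from typing import List, Set, Tuple


def shingles(text: str, k: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of k-word shingles of a message body."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: Set[Tuple[str, ...]], b: Set[Tuple[str, ...]]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_emails(bodies: List[str], threshold: float = 0.5) -> List[List[int]]:
    """Greedy grouping (an assumption, not the patented method): each body
    joins the first group whose first member is similar enough, otherwise
    it starts a new group. Returns lists of indices into `bodies`."""
    groups: List[List[int]] = []
    representatives: List[Set[Tuple[str, ...]]] = []
    for idx, body in enumerate(bodies):
        signature = shingles(body)
        for group, rep in zip(groups, representatives):
            if jaccard(signature, rep) >= threshold:
                group.append(idx)
                break
        else:
            # No existing group matched, so this body seeds a new group.
            groups.append([idx])
            representatives.append(signature)
    return groups
```

Comparing each message only against a group's first member keeps the sketch short. A production e-mail grouper would more likely use a signature or hashing scheme that scales to a large incoming stream.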
22 Claims
1. A method of classifying e-mails, the method comprising:
receiving e-mails that include a header portion and a content portion;
storing the received e-mails;
analyzing the content portions of the stored e-mails to cluster the e-mails into multiple groups of substantially duplicate e-mails;
storing the multiple groups of substantially duplicate e-mails;
selecting one or more test e-mails from at least one of the stored groups;
determining a class for the one or more test e-mails; and
classifying at least one non-test e-mail in the at least one stored group based on the determined class of the one or more test e-mails.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computer-usable medium having a computer program embodied thereon for classifying e-mails, the computer program comprising instructions for causing a computer to perform the following operations:
receive e-mails that include a header portion and a content portion;
store the received e-mails;
analyze the content portions of the stored e-mails to cluster the stored e-mails into multiple groups of substantially duplicate e-mails;
store the multiple groups of substantially duplicate e-mails;
select one or more test e-mails from at least one of the stored groups;
determine a class for the one or more test e-mails; and
classify at least one non-test e-mail in the at least one stored group based on the determined class of the one or more test e-mails.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
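The last three steps of the independent claims (select test e-mails, determine their class, classify the remaining e-mails in the group) can be illustrated with a minimal sketch. The claims do not fix how test e-mails are chosen or how several test results are combined, so the random sampling and majority vote below are assumptions, and `classify_groups`, `toy_classifier`, and `sample_size` are hypothetical names.

```python
import random
from collections import Counter
from typing import Callable, Dict, List


def classify_groups(
    emails: Dict[str, str],
    groups: List[List[str]],
    classify: Callable[[str], str],
    sample_size: int = 1,
    seed: int = 0,
) -> Dict[str, str]:
    """Classify a few test e-mails per group and give every e-mail in the
    group the majority class of those tests.

    `emails` maps an e-mail id to its content portion; `groups` holds lists
    of ids that were already clustered as substantially duplicate."""
    rng = random.Random(seed)
    labels: Dict[str, str] = {}
    for group in groups:
        if not group:
            continue  # nothing to sample from an empty group
        test_ids = rng.sample(group, min(sample_size, len(group)))
        votes = Counter(classify(emails[eid]) for eid in test_ids)
        group_class = votes.most_common(1)[0][0]
        # Propagate the determined class to test and non-test e-mails alike.
        for eid in group:
            labels[eid] = group_class
    return labels


def toy_classifier(body: str) -> str:
    """Stand-in classifier for illustration only: a single keyword check."""
    return "spam" if "cheap" in body.lower() else "non-spam"
```

With `sample_size=1`, only one e-mail per group is passed to the classifier and the others inherit its class, which is where the saving in classification work comes from.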
Specification