Group based spam classification
Abstract
An e-mail filter is used to classify received e-mails so that some of the classes may be filtered, blocked, or marked. The e-mail filter may include a classifier that can classify an e-mail as belonging to a particular class and an e-mail grouper that can detect substantially similar, but possibly not identical, e-mails. The e-mail grouper determines groups of substantially similar e-mails in an incoming e-mail stream. For each group, the classifier determines whether one or more test e-mails from the group belong to the particular class. The classifier then designates the class to which the other e-mails in the group belong based on the results for the test e-mails.
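The group-then-classify flow described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names (`classify_groups`, `toy_classifier`), the dictionary layout of an e-mail, and the majority vote over test e-mails are all assumptions made for the example. The abstract only says the group's class is based on the results for the test e-mails.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Optional

Email = Dict[str, object]  # hypothetical layout: {"id", "header", "content"}


def toy_classifier(content: str) -> str:
    """Stand-in classifier (assumption): flags one fixed phrase as spam."""
    return "spam" if "free money" in content.lower() else "non-spam"


def classify_groups(
    groups: List[List[Email]],
    classify: Callable[[str], str],
    tests_per_group: int = 1,
    rng: Optional[random.Random] = None,
) -> Dict[object, str]:
    """Label every e-mail in a group from the class of a few test e-mails."""
    rng = rng or random.Random(0)
    labels: Dict[object, str] = {}
    for group in groups:
        if not group:
            continue
        # Select test e-mails from the group.
        tests = rng.sample(group, min(tests_per_group, len(group)))
        # Run the (expensive) classifier only on the test e-mails' content.
        votes = Counter(classify(str(e["content"])) for e in tests)
        # Assumption: a simple majority vote decides the group's class.
        group_class = votes.most_common(1)[0][0]
        # Propagate that class to every e-mail in the group.
        for e in group:
            labels[e["id"]] = group_class
    return labels


groups = [
    [
        {"id": 1, "header": "From: a@example.com", "content": "Win free money now"},
        {"id": 2, "header": "From: b@example.com", "content": "Win free money now!"},
    ],
    [{"id": 3, "header": "From: c@example.com", "content": "Meeting at noon"}],
]
labels = classify_groups(groups, toy_classifier)
```

The point of the design is that the classifier runs once per group (or a few times, if `tests_per_group` is raised) rather than once per e-mail, which matters when a stream contains many near-identical messages.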
26 Claims
1. A method of classifying e-mails, the method comprising:
receiving a plurality of e-mails that each include a respective header portion and content portion;
storing the received plurality of e-mails;
analyzing, using at least one processor, the content portions of the stored plurality of e-mails to cluster the plurality of e-mails into multiple groups of substantially duplicate e-mails;
storing the multiple groups of substantially duplicate e-mails;
selecting a test e-mail from a group of the stored multiple groups;
determining a class for the test email based on the content portion of the test e-mail; and
classifying a non-test e-mail in the group based on the determined class of the test e-mail.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 23, 24
12. A non-transitory computer-usable storage medium having a computer program embodied thereon for classifying e-mails, the computer program comprising instructions for causing at least one processor to perform the following operations:
receive a plurality of e-mails that each include a respective header portion and content portion;
store the received plurality of e-mails;
analyze the content portions of the stored plurality of e-mails to cluster the stored plurality of e-mails into multiple groups of substantially duplicate e-mails;
store the multiple groups of substantially duplicate e-mails;
select a test e-mail from a group of the stored multiple groups;
determine a class for the test e-mail based on the content portion of the test e-mail; and
classify a non-test e-mail in the group based on the determined class of the test e-mail.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 25, 26
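The independent claims recite clustering e-mails into groups of substantially duplicate e-mails by analyzing their content portions, without naming a technique at this point. As one possible illustration only, the sketch below groups messages by Jaccard similarity over word shingles. The shingle size, the 0.5 threshold, the greedy first-match grouping, and all function names are assumptions chosen for the example, not details taken from the claims.

```python
from typing import FrozenSet, List, Tuple

Shingles = FrozenSet[Tuple[str, ...]]


def shingles(text: str, k: int = 3) -> Shingles:
    """Return the set of k-word shingles of a content portion."""
    words = text.lower().split()
    if len(words) < k:
        return frozenset({tuple(words)})
    return frozenset(tuple(words[i:i + k]) for i in range(len(words) - k + 1))


def jaccard(a: Shingles, b: Shingles) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_near_duplicates(
    contents: List[str], threshold: float = 0.5, k: int = 3
) -> List[List[int]]:
    """Greedily assign each message to the first group whose representative
    (the group's first member) is at least `threshold` similar to it."""
    groups: List[Tuple[Shingles, List[int]]] = []
    for idx, text in enumerate(contents):
        sig = shingles(text, k)
        for rep, members in groups:
            if jaccard(sig, rep) >= threshold:
                members.append(idx)
                break
        else:
            # No existing group is similar enough: start a new one.
            groups.append((sig, [idx]))
    return [members for _, members in groups]


contents = [
    "buy cheap pills online today from our store",
    "buy cheap pills online today from our shop",
    "lunch meeting moved to noon on friday",
]
result = group_near_duplicates(contents)
```

Here the first two messages share five of seven distinct shingles (similarity of about 0.71) and land in one group, while the third shares none and forms its own. Comparing every message against every group representative is quadratic in the worst case, so a real mail stream would likely need a hashed or sketch-based signature instead.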
Specification