Detecting spam e-mail using similarity calculations
Abstract
A method for detecting undesirable e-mails is disclosed. The method includes collecting a plurality of undesirable e-mails, arranging the plurality of undesirable e-mails into a plurality of groups and generating, for each group, at least one token, thereby producing a plurality of tokens for the plurality of undesirable e-mails. The method further includes receiving a first e-mail and generating at least one token for the first e-mail. The method further includes causing a comparison of the at least one token for the first e-mail with at least one of the plurality of tokens for the plurality of undesirable e-mails and identifying the first e-mail as an undesirable e-mail if the at least one token for the first e-mail matches any of the plurality of tokens for the plurality of undesirable e-mails.
75 Citations
40 Claims
1. A method for detecting undesirable e-mail, the method comprising:
collecting a plurality of undesirable e-mails;
arranging the plurality of undesirable e-mails into a plurality of groups;
generating, for each group, at least one token, thereby producing a plurality of tokens for the plurality of undesirable e-mails;
receiving a first e-mail;
generating at least one token for the first e-mail;
causing a comparison of the at least one token for the first e-mail with at least one of the plurality of tokens for the plurality of undesirable e-mails; and
identifying the first e-mail as an undesirable e-mail if the at least one token for the first e-mail matches any of the plurality of tokens for the plurality of undesirable e-mails.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
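The steps recited in claim 1 can be illustrated with a minimal Python sketch. The claim does not say how groups are formed or how tokens are generated, so everything specific here is an assumption made for illustration: tokens are truncated SHA-1 hashes of three-word shingles, groups are formed greedily by Jaccard similarity against the first member of each group, and a group's tokens are those shared by all of its members. The function names (`tokens_for`, `group_emails`, `group_tokens`, `is_undesirable`) and the `threshold` parameter are hypothetical.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles of a message (assumed token source)."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def tokens_for(text, k=3):
    """Hash each shingle to a short hex token (assumption: 16 hex chars of SHA-1)."""
    return {hashlib.sha1(s.encode("utf-8")).hexdigest()[:16] for s in shingles(text, k)}


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def group_emails(emails, threshold=0.5):
    """Greedy grouping: add each e-mail to the first group whose first member is similar enough."""
    groups = []  # each group is a list of token sets
    for text in emails:
        toks = tokens_for(text)
        for group in groups:
            if jaccard(toks, group[0]) >= threshold:
                group.append(toks)
                break
        else:
            groups.append([toks])
    return groups


def group_tokens(groups):
    """Collect, for each group, the tokens shared by every member of that group."""
    known = set()
    for group in groups:
        known |= set.intersection(*group)
    return known


def is_undesirable(email, known_tokens):
    """Flag the e-mail if any of its tokens matches a known token."""
    return not tokens_for(email).isdisjoint(known_tokens)
```

In this sketch, `group_tokens(group_emails(collected))` builds the token set once from the collected undesirable e-mails, and `is_undesirable(first_email, known)` is then called for each incoming message. Matching on any single token follows the wording of the claim; a practical filter would probably require more evidence before flagging a message.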
18. An information processing system for detecting undesirable e-mail, comprising:
a memory for collecting a plurality of undesirable e-mails;
a receiver for receiving a first e-mail; and
a processor configured for:
arranging the plurality of undesirable e-mails into a plurality of groups;
generating, for each group, at least one token, thereby producing a plurality of tokens for the plurality of undesirable e-mails;
generating at least one token for the first e-mail;
causing a comparison of the at least one token for the first e-mail with at least one of the plurality of tokens for the plurality of undesirable e-mails; and
identifying the first e-mail as an undesirable e-mail if the at least one token for the first e-mail matches any of the plurality of tokens for the plurality of undesirable e-mails.
Dependent claims: 19, 20, 21, 22, 23, 24, 25
26. A computer readable medium including computer instructions for detecting undesirable e-mail, the computer instructions including instructions for:
collecting a plurality of undesirable e-mails;
arranging the plurality of undesirable e-mails into a plurality of groups;
generating, for each group, at least one token, thereby producing a plurality of tokens for the plurality of undesirable e-mails;
receiving a first e-mail;
generating at least one token for the first e-mail;
causing a comparison of the at least one token for the first e-mail with at least one of the plurality of tokens for the plurality of undesirable e-mails; and
identifying the first e-mail as an undesirable e-mail if the at least one token for the first e-mail matches any of the plurality of tokens for the plurality of undesirable e-mails.
27. A method for detecting undesirable e-mail, the method comprising:
collecting a plurality of desirable and undesirable e-mails;
generating at least one token for the plurality of desirable and undesirable e-mails;
receiving a first e-mail;
generating at least one token for the first e-mail;
causing a comparison of the at least one token for the first e-mail with at least one of the plurality of tokens for the plurality of desirable or undesirable e-mails; and
identifying the first e-mail as a desirable or undesirable e-mail based on the result of the comparison between at least one token for the first e-mail with at least one of the plurality of tokens for the plurality of desirable or undesirable e-mails.
Dependent claims: 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39
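Claim 27 differs from claim 1 in that tokens are generated for both desirable and undesirable e-mails, and the first e-mail is labelled from the result of the comparison. The following Python sketch shows one way this could look. The claim does not define the decision rule, so the rule used here (label by whichever side shares more tokens with the message) is an assumption, as are the two-word tokens, the `"unknown"` label for ties, and the function names.

```python
def tokens_for(text, k=2):
    """Return k-word tokens for a message (assumed token scheme)."""
    words = text.lower().split()
    if not words:
        return set()
    return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def build_token_sets(desirable, undesirable):
    """Generate one token set for the desirable and one for the undesirable e-mails."""
    ham = set().union(*(tokens_for(t) for t in desirable))
    spam = set().union(*(tokens_for(t) for t in undesirable))
    return ham, spam


def classify(email, ham_tokens, spam_tokens):
    """Label the e-mail by which token set it overlaps more (assumed rule)."""
    toks = tokens_for(email)
    spam_hits = len(toks & spam_tokens)
    ham_hits = len(toks & ham_tokens)
    if spam_hits > ham_hits:
        return "undesirable"
    if ham_hits > spam_hits:
        return "desirable"
    return "unknown"  # tie or no overlap; not a category named in the claim
```

Keeping a desirable token set alongside the undesirable one gives the comparison something to weigh against, which may reduce false positives on phrases that are common in legitimate mail.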
40. A method for detecting undesirable e-mail, the method comprising:
collecting a plurality of undesirable e-mails;
generating at least one token for the plurality of undesirable e-mails, thereby producing a plurality of tokens for the plurality of undesirable e-mails;
generating a weight associated with each of the plurality of tokens, wherein a weight is based on token length;
receiving a first e-mail;
generating at least one token for the first e-mail;
causing a comparison of the at least one token for the first e-mail with at least one of the plurality of tokens for the plurality of undesirable e-mails; and
identifying the first e-mail as an undesirable e-mail if the at least one token for the first e-mail matches any of the plurality of tokens for the plurality of undesirable e-mails.
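Claim 40 adds a weight for each token, based on token length. The claim does not state how length is measured or how the weight is used, so the Python sketch below makes two labelled assumptions: a token is a word n-gram whose weight is its word count, and the weights of matched tokens are summed into a score reported alongside the match decision. The decision itself still follows the claim wording (a match on any token). The names `weighted_tokens` and `check` are hypothetical.

```python
def tokens_for(text, sizes=(2, 3, 4)):
    """Return word n-grams of several sizes, so that tokens differ in length."""
    words = text.lower().split()
    return {
        " ".join(words[i:i + n])
        for n in sizes
        for i in range(len(words) - n + 1)
    }


def weighted_tokens(undesirable):
    """Map each token to a weight; here the weight is the token's word count (assumption)."""
    return {tok: len(tok.split()) for text in undesirable for tok in tokens_for(text)}


def check(email, weights):
    """Return (matched, score): matched if any token is known, score sums matched weights."""
    matched = weights.keys() & tokens_for(email)
    score = sum(weights[t] for t in matched)
    return bool(matched), score
```

Weighting by length reflects the intuition that a long token shared with a known undesirable e-mail is less likely to match by coincidence than a short one. A filter built on this sketch could compare the score against a threshold instead of flagging on any single match, although that goes beyond what the claim recites.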
Specification