AUTOMATIC BOTNET SPAM SIGNATURE GENERATION
Abstract
A framework may be used for generating URL signatures to identify botnet spam and membership. The framework may take a set of unlabeled emails as input that are grouped based on URLs contained within the emails. The framework may return a set of spam URL signatures and a list of corresponding botnet host IP addresses by analyzing the URLs within the emails that are contained within the groups. Each URL signature may be in the form of either a complete URL string or a URL regular expression. The signatures may be used to identify spam emails launched from botnets, while the knowledge of botnet host identities can help filter other spam emails also sent by them.
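The first stage described above, taking unlabeled emails and grouping them by the URLs they contain, can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `emails` record layout (`sender_ip`, `body`), the `URL_PATTERN` expression, and the function names are all assumptions made for this example, and grouping uses the URL host name as the domain.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Assumed, deliberately simple pattern for http(s) URLs in plain-text bodies.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_urls(body):
    """Return every http(s) URL in an email body, with trailing punctuation stripped."""
    return [u.rstrip(".,;:!?)") for u in URL_PATTERN.findall(body)]


def group_by_domain(emails):
    """Map each URL host name to the indices of emails and the URLs that mention it."""
    groups = defaultdict(lambda: {"emails": set(), "urls": set(), "sender_ips": set()})
    for idx, email in enumerate(emails):
        for url in extract_urls(email["body"]):
            domain = (urlsplit(url).hostname or "").lower()
            if domain:
                groups[domain]["emails"].add(idx)
                groups[domain]["urls"].add(url)
                groups[domain]["sender_ips"].add(email["sender_ip"])
    return dict(groups)


# Hypothetical input; addresses come from the documentation-only IP ranges.
emails = [
    {"sender_ip": "192.0.2.10", "body": "Deals at http://shop.example.com/p/1234?id=9."},
    {"sender_ip": "198.51.100.7", "body": "Visit http://shop.example.com/p/5678?id=3 now"},
    {"sender_ip": "203.0.113.5", "body": "Notes: https://docs.example.org/readme"},
]
groups = group_by_domain(emails)
```

Keeping the sender IP addresses alongside each group is what later allows a matched signature to be tied back to a list of candidate botnet hosts.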
20 Claims
1. A system for generating uniform resource locator (URL) signatures to identify botnet spam and membership, comprising:
a URL preprocessor that extracts a plurality of URLs from a plurality of input emails and groups the input emails into a plurality of URL groups according to their corresponding domains;
a group selector that selects the URL groups in accordance with a predetermined feature; and
a regular expression generator that determines a signature representative of the URLs contained within a botnet spam.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computer-implemented method for generating uniform resource locator (URL) signatures to identify botnet spam and membership, comprising:
extracting a plurality of URLs from a plurality of received emails;
grouping the emails into a plurality of groups according to a domain specified by the extracted URLs;
selecting the groups in accordance with a sending time burstiness or a distribution of an internet protocol (IP) address space of the emails within the groups; and
generating a signature representative of URLs contained within a botnet spam in accordance with the sending time burstiness or distribution of the IP address space to identify emails as being botnet spam.
Dependent claims: 13, 14, 15, 16, 17
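The selection step in claim 12 names two features, sending time burstiness and the distribution of the IP address space, without fixing how they are measured in the text shown here. The sketch below is one possible reading under stated assumptions: burstiness is taken as "all emails in the group were sent within a time window", and address-space distribution is approximated by counting distinct /16 IPv4 prefixes. The thresholds, the `send_times` and `sender_ips` fields, and the function names are hypothetical.

```python
import ipaddress


def is_bursty(send_times, max_window_seconds):
    """True when every email in the group was sent within the given window."""
    return max(send_times) - min(send_times) <= max_window_seconds


def distinct_prefixes(sender_ips, prefix_len=16):
    """Count distinct IPv4 prefixes as a rough proxy for address-space spread."""
    return len({
        ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False) for ip in sender_ips
    })


def select_groups(groups, max_window_seconds=5 * 86400, min_prefixes=3, min_emails=3):
    """Return domains whose groups look bursty or widely distributed.

    All three thresholds are illustrative assumptions, not values from the claims.
    """
    selected = []
    for domain, group in groups.items():
        if len(group["send_times"]) < min_emails:
            continue  # too few emails to judge either feature
        bursty = is_bursty(group["send_times"], max_window_seconds)
        spread = distinct_prefixes(group["sender_ips"]) >= min_prefixes
        if bursty or spread:
            selected.append(domain)
    return selected


# Hypothetical groups; send times are Unix-style seconds.
candidate_groups = {
    "shop.example.com": {
        "send_times": [0, 600, 1200, 1800],
        "sender_ips": ["192.0.2.10", "198.51.100.7", "203.0.113.5", "192.0.2.99"],
    },
    "news.example.org": {
        "send_times": [0, 30 * 86400, 60 * 86400],
        "sender_ips": ["192.0.2.1", "192.0.2.2", "192.0.2.3"],
    },
}
selected = select_groups(candidate_groups)
```

In this example the first group is sent within half an hour from three distinct prefixes and is selected, while the second is spread over two months from a single prefix and is not.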
18. A computer-implemented method for generating a spam signature to identify botnet spam and membership, comprising:
grouping a plurality of emails into a plurality of groups according to a domain specified by a plurality of uniform resource locators (URLs) within the emails;
iteratively selecting the groups in accordance with a sending time burstiness or a distribution of an internet protocol (IP) address space of the emails within the groups;
generating URL based signatures or regular expression based signatures for a set of URLs belonging to a same domain; and
outputting the URL based signature and a regular expression based signature to a spam filter.
Dependent claims: 19, 20
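To make the distinction between a URL based signature and a regular expression based signature concrete, the following sketch derives one or the other from a set of URLs in the same domain. It is a simplified stand-in, not the generation procedure of the patent: URLs are split on a fixed, assumed set of delimiters, identical segments are kept as literals, and segments that differ are replaced by a character class with a length range. The names `TOKEN_SPLIT`, `char_class`, and `generate_signature` are invented for this example.

```python
import re

# Assumed delimiter set; the capturing group keeps delimiters in the split output,
# so text segments sit at even indices and delimiters at odd indices.
TOKEN_SPLIT = re.compile(r"([/.?=&_-])")


def char_class(segments):
    """Pick the narrowest ASCII character class that covers all given segments."""
    joined = "".join(segments)
    for cls in ("[0-9]", "[a-z]", "[a-zA-Z]", "[a-zA-Z0-9]"):
        if re.fullmatch(cls + "+", joined):
            return cls
    return r"[^/.?=&_-]"  # fallback: anything that is not a delimiter


def generate_signature(urls):
    """Return an escaped complete URL, a regular expression, or None if structures differ."""
    urls = sorted(set(urls))
    if len(urls) == 1:
        return re.escape(urls[0])  # complete-URL signature
    rows = [TOKEN_SPLIT.split(u) for u in urls]
    if len({len(r) for r in rows}) != 1:
        return None  # different number of segments: no common template
    parts = []
    for i, column in enumerate(zip(*rows)):
        if len(set(column)) == 1:
            parts.append(re.escape(column[0]))
        elif i % 2 == 1:
            return None  # delimiters disagree: no common template
        else:
            lo, hi = min(map(len, column)), max(map(len, column))
            quant = f"{{{lo}}}" if lo == hi else f"{{{lo},{hi}}}"
            parts.append(char_class(column) + quant)
    return "".join(parts)


# Hypothetical URLs sharing one domain and one path template.
spam_urls = [
    "http://shop.example.com/p/1234?id=9",
    "http://shop.example.com/p/5678?id=3",
    "http://shop.example.com/p/42?id=7",
]
signature = generate_signature(spam_urls)
```

A spam filter could then test each URL in an incoming email with `re.fullmatch(signature, url)`. Bounding the variable segments by character class and length keeps the expression specific to the observed template instead of matching every URL in the domain.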
Specification