Method and system for detecting DGA-based malware
First Claim
1. A method for detecting a domain generation algorithm (DGA), comprising:
obtaining, from an electronic database, a plurality of non-existent (NX) domain names comprising a top-level domain (TLD), a second-level domain (2LD), and a third-level domain (3LD);
clustering, utilizing a name-based clustering module, a portion of the plurality of NX domain names based on at least one of n-gram features (NGF), entropy-based features (EBF), and structural domain features (SDF);
wherein the TLD, 2LD, and 3LD are all utilized by the name-based clustering module;
clustering, utilizing a graph clustering module, another portion of the plurality of NX domain names based on groups of assets that queried the NX domain names;
associating, utilizing a daily clustering correlation module, one or more NX domain names from the name based clustering model with one or more NX domain names from the graph clustering model;
responsive to the daily clustering, associating, utilizing a temporal clustering correlation module, one or more NX domain names from different clusters based on a rolling window of two consecutive epochs; and
determining whether a DGA that generated the clustered NX domain is unknown.
Abstract
System and method for detecting a domain generation algorithm (DGA), comprising: performing processing associated with clustering, utilizing a name-based features clustering module accessing information from an electronic database of NX domain information, the randomly generated domain names based on the similarity in the make-up of the randomly generated domain names; performing processing associated with clustering, utilizing a graph clustering module, the randomly generated domain names based on the groups of assets that queried the randomly generated domain names; performing processing associated with determining, utilizing a daily clustering correlation module and a temporal clustering correlation module, which clustered randomly generated domain names are highly correlated in daily use and in time; and performing processing associated with determining the DGA that generated the clustered randomly generated domain names.
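The name-based clustering described above relies on the make-up of each domain name. As a rough illustration only, the following Python sketch computes a few entropy-based, n-gram, and structural features per domain level. The function names, the feature set, and the naive label split (which assumes a single-label TLD) are assumptions made for this example and are not taken from the patent.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Shannon entropy, in bits per character, of a label's characters."""
    if not label:
        return 0.0
    counts = Counter(label)
    total = len(label)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def char_ngrams(label: str, n: int) -> Counter:
    """Count the character n-grams that occur in a label."""
    return Counter(label[i:i + n] for i in range(len(label) - n + 1))


def split_levels(domain: str) -> dict:
    """Split a domain into TLD, 2LD and 3LD labels.

    Assumption: the TLD is a single label (no public-suffix handling).
    """
    labels = domain.lower().rstrip(".").split(".")
    return {
        "tld": labels[-1] if len(labels) >= 1 else "",
        "2ld": labels[-2] if len(labels) >= 2 else "",
        "3ld": labels[-3] if len(labels) >= 3 else "",
    }


def name_features(domain: str) -> dict:
    """Build an illustrative feature dictionary for one NX domain name."""
    levels = split_levels(domain)
    # Structural feature: number of dot-separated levels in the name.
    feats = {"num_levels": len(domain.rstrip(".").split("."))}
    for name, label in levels.items():
        feats[f"{name}_length"] = len(label)                            # structural
        feats[f"{name}_entropy"] = shannon_entropy(label)               # entropy-based
        feats[f"{name}_distinct_bigrams"] = len(char_ngrams(label, 2))  # n-gram
    return feats
```

Feature vectors of this kind could then be passed to any standard clustering routine; which algorithm the patented system uses is not stated in this section.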
18 Claims
1. A method for detecting a domain generation algorithm (DGA), comprising:
obtaining, from an electronic database, a plurality of non-existent (NX) domain names comprising a top-level domain (TLD), a second-level domain (2LD), and a third-level domain (3LD);
clustering, utilizing a name-based clustering module, a portion of the plurality of NX domain names based on at least one of n-gram features (NGF), entropy-based features (EBF), and structural domain features (SDF);
wherein the TLD, 2LD, and 3LD are all utilized by the name-based clustering module;
clustering, utilizing a graph clustering module, another portion of the plurality of NX domain names based on groups of assets that queried the NX domain names;
associating, utilizing a daily clustering correlation module, one or more NX domain names from the name based clustering model with one or more NX domain names from the graph clustering model;
responsive to the daily clustering, associating, utilizing a temporal clustering correlation module, one or more NX domain names from different clusters based on a rolling window of two consecutive epochs; and
determining whether a DGA that generated the clustered NX domain is unknown.
Dependent claims: 2, 3, 4, 5, 6, 13, 14, 15
7. A system for detecting a domain generation algorithm (DGA), comprising:
a non-transitory device comprising a processor to:
obtain, from an electronic database, a plurality of non-existent (NX) domain names comprising a top-level domain (TLD), a second-level domain (2LD), and a third-level domain (3LD);
cluster, utilizing a name-based clustering module, a portion of the plurality of NX domain names based on at least one of n-gram features (NGF), entropy-based features (EBF), and structural-domain features (SDF);
wherein the TLD, 2LD, and 3LD are all utilized by the name-based clustering module;
cluster, utilizing a graph clustering module, another portion of the plurality of NX domain names based on groups of assets that queried the NX domain names;
associate, utilizing a daily clustering correlation module, one or more NX domain names from the name based clustering model with one or more NX domain names from the graph clustering model;
responsive to the daily clustering, associate, utilizing a temporal clustering correlation module, one or more NX domain names from different clusters based on a rolling window of two consecutive epochs; and
determine whether a DGA that generated the clustered NX domain names is unknown.
Dependent claims: 8, 9, 10, 11, 12, 16, 17, 18
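The asset-based clustering and the rolling window of two consecutive epochs recited in the claims can be pictured with a small Python sketch. It is a simplified stand-in, not the claimed graph clustering module: domains are grouped greedily by Jaccard similarity of the asset sets that queried them. The function names, the `(asset, domain)` query format, and the 0.5 threshold are all assumptions made for this example.

```python
from collections import defaultdict


def assets_by_domain(queries):
    """Map each NX domain to the set of assets (hosts) that queried it."""
    mapping = defaultdict(set)
    for asset, domain in queries:
        mapping[domain].add(asset)
    return dict(mapping)


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_by_assets(domain_assets, threshold=0.5):
    """Greedy grouping: a domain joins the first group whose seed asset set
    is similar enough, otherwise it starts a new group.

    Assumption: a deliberately simple substitute for graph clustering.
    """
    groups = []  # list of (seed_assets, member_domains)
    for domain in sorted(domain_assets):
        assets = domain_assets[domain]
        for seed, members in groups:
            if jaccard(seed, assets) >= threshold:
                members.append(domain)
                break
        else:
            groups.append((assets, [domain]))
    return [members for _, members in groups]


def rolling_pairs(epochs):
    """Return consecutive epoch pairs, i.e. a rolling window of two epochs."""
    return list(zip(epochs, epochs[1:]))
```

For example, with a hypothetical one-day query log, domains queried by the same hosts end up in the same group, and clusters from each pair returned by `rolling_pairs` could then be compared across epochs.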
Specification