System and method for malware detection using multidimensional feature clustering
Abstract
Methods and systems for malware detection, which detect malware by identifying the Command and Control (C&C) communication between the malware and its remote host, and which distinguish between communication transactions that carry C&C communication and transactions of innocent traffic. Fine-granularity features, which are present in the transactions and are indicative of whether the transactions are exchanged with malware, are examined. A feature comprises an aggregated statistical property of one or more features of the transactions, such as an average, sum, median or variance, or any suitable function or transformation of the features.
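By way of illustration only, the aggregated statistical properties named in the abstract (average, sum, median, variance) could be computed as in the following minimal sketch. The function name `aggregate_features`, the choice of response size as the raw feature, and the sample values are assumptions made for this example and are not taken from the patent.

```python
from statistics import mean, median, pvariance


def aggregate_features(values):
    """Summarise one raw per-transaction feature over a group of transactions.

    `values` holds one number per transaction (hypothetical raw feature).
    """
    return {
        "sum": sum(values),
        "average": mean(values),
        "median": median(values),
        # Population variance; the abstract does not say which variant is meant.
        "variance": pvariance(values),
    }


# Hypothetical response sizes in bytes for four transactions from one host.
response_sizes = [512, 498, 530, 505]
features = aggregate_features(response_sizes)
print(features)
```

Each aggregated value can then serve as one coordinate of the tuple that represents a transaction, or a group of transactions, in the multi-dimensional feature space.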
19 Claims
1. A method, comprising:
specifying, by at least one hardware processor, multiple features, which are present in communication transactions conducted between computers in a computer network and which are indicative of whether the transactions are exchanged with malicious software;
representing, by the at least one hardware processor, a plurality of malware transactions by respective elements in a multi-dimensional space, whose dimensions are spanned respectively by the features, so as to form a plurality of clusters of the elements, wherein each transaction is represented by a respective tuple in the multi-dimensional space and different families of malware transactions correspond to different clusters of the plurality of clusters;
receiving, by at least one hardware interface operatively coupled to the at least one hardware processor, a new input communication transaction conducted between computers in the computer network; and
identifying, by the at least one hardware processor, whether the new input communication transaction is malicious by at least:
representing, by the at least one hardware processor, the new input transaction as a new element tuple in the multi-dimensional space;
measuring, by the at least one hardware processor, respective distance metrics between the new element of the multi-dimensional space and each cluster of the plurality of clusters; and
evaluating, by the at least one hardware processor, a criterion with respect to the distance metrics, wherein evaluating the criterion with respect to the distance metrics comprises:
defining a classification criterion that identifies hybrid malware comprising different code sections taken from at least two of the different malware families associated with at least two different clusters of the plurality of clusters, and applying the defined criterion to the measured respective distance metrics between the new element of the multi-dimensional space and the at least two different clusters.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. Apparatus, comprising:
an interface, which is configured to receive communication transactions held in a computer network; and
a hardware processor operatively coupled to the hardware interface, which is configured to hold a specification of multiple features, which are present in the communication transactions conducted between computers and which are indicative of whether the transactions are exchanged with malicious software, to represent a plurality of malware transactions by respective elements in a multi-dimensional space, whose dimensions are spanned respectively by the features, so as to form a plurality of clusters of the elements, wherein each transaction is represented by a respective tuple in the multi-dimensional space and different families of malware transactions correspond to different clusters of the plurality of clusters, and upon the at least one hardware interface receiving a new input communication transaction conducted between computers in the computer network, the at least one processor is configured to identify whether the new input communication transaction is malicious by at least:
representing the new input transaction as a new element tuple in the multi-dimensional space;
measuring respective distance metrics between the new element of the multi-dimensional space and each cluster of the plurality of clusters; and
evaluating a criterion with respect to the distance metrics, wherein evaluating the criterion with respect to the distance metrics comprises:
defining a classification criterion that identifies hybrid malware comprising different code sections taken from at least two of the different malware families associated with at least two different clusters of the plurality of clusters, and applying the defined criterion to the measured respective distance metrics between the new element of the multi-dimensional space and the at least two different clusters.
Dependent claims: 15, 16, 17, 18, 19
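The identification steps recited in claims 1 and 14 (representing a new transaction as a tuple, measuring a distance to each cluster, and applying a criterion that flags hybrid malware) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the claims do not fix a distance metric or a specific criterion, so Euclidean distance to the cluster centroid and a single distance threshold are assumed here, and the family names, feature values, and function names are hypothetical.

```python
import math


def centroid(points):
    """Mean point of a cluster of equal-length feature tuples."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def distances_to_clusters(element, clusters):
    """Distance from the new element to each cluster.

    Assumption: Euclidean distance to the centroid stands in for the
    unspecified "distance metric" of the claims.
    """
    return {name: math.dist(element, centroid(pts)) for name, pts in clusters.items()}


def classify(element, clusters, threshold):
    """Apply an assumed criterion to the measured distances.

    Close to two or more family clusters -> "hybrid";
    close to exactly one -> "malicious"; otherwise "unmatched".
    """
    dists = distances_to_clusters(element, clusters)
    near = sorted(name for name, d in dists.items() if d <= threshold)
    if len(near) >= 2:
        return "hybrid", near
    if len(near) == 1:
        return "malicious", near
    return "unmatched", near


# Hypothetical malware families, each a cluster of two-feature tuples.
clusters = {
    "family_a": [(1.0, 1.0), (1.0, 3.0)],
    "family_b": [(5.0, 1.0), (5.0, 3.0)],
}

print(classify((3.0, 2.0), clusters, threshold=2.5))    # lies between both families
print(classify((1.0, 2.5), clusters, threshold=2.5))    # close to family_a only
print(classify((20.0, 20.0), clusters, threshold=2.5))  # far from every cluster
```

An element that falls within the threshold of two clusters at once is treated as a candidate hybrid of the corresponding families. Other criteria, such as a ratio between the two smallest distances, would fit the same claim language equally well.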
Specification