Robust representation of network traffic for detecting malware variations

  • US 10,187,412 B2
  • Filed: 11/19/2015
  • Issued: 01/22/2019
  • Est. Priority Date: 08/28/2015
  • Status: Active Grant
First Claim

1. A method comprising:

  • at a networking device, dividing network traffic records to create at least one group of network traffic records, the at least one group including network traffic records being associated with network communications between a computing device and a server for a predetermined period of time;

    generating a set of feature vectors, each feature vector of the set of feature vectors representing one of the network traffic records of the network communications included in the at least one group of network traffic records, wherein each feature vector comprises a predefined set of features extracted from one of the network traffic records;

    computing a self-similarity matrix for each feature of the predefined set of features using all feature vectors generated for the at least one group, each self-similarity matrix being a representation of one feature of the predefined set of features that is invariant to an increase or a decrease of values of the one feature across all of the feature vectors generated for the at least one group of network traffic records, each self-similarity matrix including a plurality of elements in rows and columns, wherein an (i, j)-th element of a self-similarity matrix corresponds to a distance between a feature value of an i-th network traffic record and a feature value of a j-th network traffic record;

    transforming each self-similarity matrix into a corresponding histogram to form a set of histograms, each histogram being a representation of the one feature that is invariant to a number of network traffic records in the at least one group of network traffic records;

    generating a cumulative feature vector based on the set of histograms, the cumulative feature vector being a cumulative representation of the predefined set of features of all network traffic records included in the at least one group of network traffic records;

    training a classifier based on the cumulative feature vector to produce a trained classifier;

    classifying, by the trained classifier, the at least one group as being malicious; and

    identifying a malware network communication between the computing device and the server utilizing the at least one classified group, wherein the cumulative feature vector enables detection of variations and modifications of the malware network communication.
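
The representation steps of the claim (self-similarity matrix per feature, histogram per matrix, cumulative feature vector) can be illustrated with a minimal Python sketch. This is not the patented implementation. The function names, the use of absolute difference as the distance, the fixed bin count, and the per-feature `max_distances` values are all assumptions made for illustration, and the classifier training and classification steps are omitted.

```python
from typing import List, Sequence


def self_similarity_matrix(values: Sequence[float]) -> List[List[float]]:
    """Element (i, j) is the distance between the feature value of record i
    and that of record j. Absolute difference is an assumed distance; it is
    unchanged when every value is shifted by the same amount."""
    return [[abs(a - b) for b in values] for a in values]


def matrix_to_histogram(
    matrix: Sequence[Sequence[float]], bins: int, max_distance: float
) -> List[float]:
    """Normalized histogram of all matrix elements. Dividing by the element
    count makes the result independent of the number of records.
    max_distance is an assumed, positive, per-feature upper bound; larger
    distances are clamped into the last bin."""
    counts = [0] * bins
    total = 0
    for row in matrix:
        for distance in row:
            index = min(int(distance / max_distance * bins), bins - 1)
            counts[index] += 1
            total += 1
    return [count / total for count in counts] if total else [0.0] * bins


def cumulative_feature_vector(
    records: Sequence[Sequence[float]],
    max_distances: Sequence[float],
    bins: int = 4,
) -> List[float]:
    """Concatenate one histogram per feature into a single vector that
    describes the whole group of records."""
    vector: List[float] = []
    for feature_index, max_distance in enumerate(max_distances):
        values = [record[feature_index] for record in records]
        matrix = self_similarity_matrix(values)
        vector.extend(matrix_to_histogram(matrix, bins, max_distance))
    return vector


# Hypothetical group: each record is (bytes transferred, duration in seconds).
group = [[100.0, 2.0], [300.0, 4.0], [100.0, 8.0]]
max_distances = [400.0, 8.0]

baseline = cumulative_feature_vector(group, max_distances)
# Same traffic pattern with every byte count increased by 50.
shifted = cumulative_feature_vector([[b + 50.0, d] for b, d in group], max_distances)
# Same traffic pattern observed twice as many times.
doubled = cumulative_feature_vector(group + group, max_distances)

print(baseline == shifted, baseline == doubled)
```

In this sketch the vector stays the same when a feature is shifted by a constant or when the group contains more records of the same pattern, which mirrors the two invariance properties recited in the claim. The resulting vector would then be passed to whatever classifier is trained in the subsequent steps.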
