Static feature extraction from structured files

  • US 9,959,276 B2
  • Filed: 02/12/2016
  • Issued: 05/01/2018
  • Est. Priority Date: 01/31/2014
  • Status: Active Grant
First Claim

1. A computer-implemented method comprising:

  • receiving and accessing a plurality of structured files;

    parsing each structured file to discover corresponding code and data regions and to extract a plurality of corresponding code start points;

    extracting, for each structured file, at least one feature from such structured file by disassembling the code in the structured file using each of the plurality of corresponding code start points as a respective disassembly starting point and analyzing one or more code and data regions identified within the structured file, the extracting occurring statically while the structured file is not being executed, the features being at least one of (i) a first-order feature indicating whether a collection of import names is ordered lexicographically and being able to be derived into a higher-order feature, (ii) a checksum feature for a string of elements in the file compared to a checksum stored in a field in the file, or (iii) a Boolean feature that characterizes a set of timestamp fields from the file to represent whether or not the file relies upon various functionalities that did not exist at the time represented by the most recent time stamp;

    providing the extracted features from each of the plurality of structured files to a machine learning model, to determine classification of the features and place them into a malicious or benign category, wherein the provision of the extracted features from one of the plurality of structured files reduces subsequent misclassification of extracted features from the next one of the plurality of structured files by the machine learning model.
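
The three feature types recited in the claim can be illustrated with a minimal Python sketch. This is not the patented implementation. The `ParsedFile` record, its field names, and the use of CRC-32 as the checksum are assumptions made purely for illustration; the claim does not specify a parser, a checksum algorithm, or a data layout. The sketch assumes the parsing step has already run and computes the features statically, without executing the file.

```python
import zlib
from dataclasses import dataclass
from typing import List


@dataclass
class ParsedFile:
    """Hypothetical output of the parsing step (all field names are assumptions)."""
    import_names: List[str]            # import names in the order they appear in the file
    payload: bytes                     # the string of elements covered by the checksum
    stored_checksum: int               # checksum value read from a field in the file
    timestamps: List[int]              # timestamp fields found in the file (epoch seconds)
    functionality_dates: List[int]     # when each functionality the file relies on first existed


def imports_sorted(names: List[str]) -> bool:
    """Feature (i): True if the import names are in lexicographic order."""
    return all(a <= b for a, b in zip(names, names[1:]))


def checksum_matches(payload: bytes, stored: int) -> bool:
    """Feature (ii): compare a recomputed checksum with the stored one.

    CRC-32 is an assumption; a real file format defines its own algorithm.
    """
    return zlib.crc32(payload) == stored


def relies_on_future_functionality(timestamps: List[int], functionality_dates: List[int]) -> bool:
    """Feature (iii): True if any functionality postdates the most recent timestamp."""
    if not timestamps:
        return False
    latest = max(timestamps)
    return any(date > latest for date in functionality_dates)


def extract_features(parsed: ParsedFile) -> List[float]:
    """Build a numeric feature vector suitable as input to a classifier."""
    return [
        float(imports_sorted(parsed.import_names)),
        float(checksum_matches(parsed.payload, parsed.stored_checksum)),
        float(relies_on_future_functionality(parsed.timestamps, parsed.functionality_dates)),
    ]
```

The resulting vector is what would be provided to the machine learning model in the final step of the claim. Encoding each Boolean as a float is a design choice of this sketch, made so the features can be combined with numeric ones or derived into higher-order features.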
