
Generation and use of trained file classifiers for malware detection

  • US 10,068,187 B1
  • Filed: 05/31/2017
  • Issued: 09/04/2018
  • Est. Priority Date: 05/01/2017
  • Status: Active Grant
First Claim

1. A computing device comprising:

  • a memory configured to store instructions to generate a trained file classifier; and

    a processor configured to execute the instructions from the memory to perform operations comprising:

    accessing information identifying multiple files and identifying classification data for the multiple files, wherein the classification data indicates, for a particular file of the multiple files, whether the particular file includes malware;

    generating a feature vector representing the particular file of the multiple files, the feature vector including:

    zero-skip n-gram data indicating occurrences of adjacent characters in printable characters representing the particular file;

    skip n-gram data indicating occurrences of non-adjacent characters in the printable characters representing the particular file; and

    n-gram data indicating occurrences of groups of entropy indicators in a set of entropy indicators derived from file entropy data for the particular file, each entropy indicator of the set of entropy indicators having a value representing entropy of a corresponding chunk of the particular file;

    generating the trained file classifier using the feature vector and the classification data as supervised training data; and

    transmitting the trained file classifier to a remote computing device via a network, wherein the trained file classifier is executable by the remote computing device to restrict access to a file or to restrict execution of the file based on a classification result generated by execution of the trained file classifier.
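
The three feature-vector components recited in the claim can be illustrated with a minimal Python sketch. This is only one possible reading under stated assumptions, not the patented implementation: the printable range (ASCII 0x20 to 0x7E), the use of bigrams (n = 2), the 256-byte chunk size, the eight entropy bins, and all function names are hypothetical choices that the claim does not specify.

```python
import math
from collections import Counter


def printable_chars(data: bytes) -> str:
    """Keep only printable ASCII bytes (assumed range 0x20-0x7E)."""
    return "".join(chr(b) for b in data if 0x20 <= b <= 0x7E)


def skip_bigrams(text: str, skip: int) -> Counter:
    """Count character pairs separated by `skip` intervening characters.

    skip=0 gives zero-skip bigrams (adjacent characters);
    skip>=1 gives skip bigrams (non-adjacent characters).
    """
    step = skip + 1
    return Counter(text[i] + text[i + step] for i in range(len(text) - step))


def chunk_entropy(chunk: bytes) -> float:
    """Shannon entropy of a chunk in bits per byte (between 0 and 8)."""
    total = len(chunk)
    counts = Counter(chunk)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_indicators(data: bytes, chunk_size: int = 256, bins: int = 8) -> list:
    """Map each chunk's entropy to a small integer indicator.

    chunk_size and bins are assumed values chosen for illustration.
    """
    indicators = []
    for start in range(0, len(data), chunk_size):
        h = chunk_entropy(data[start:start + chunk_size])
        # Scale the 0..8 bit range onto 0..bins-1, clamping the top edge.
        indicators.append(min(int(h / 8.0 * bins), bins - 1))
    return indicators


def indicator_bigrams(indicators: list) -> Counter:
    """Count adjacent pairs of entropy indicators."""
    return Counter(zip(indicators, indicators[1:]))


def extract_features(data: bytes) -> dict:
    """Collect the three count tables that would feed a feature vector."""
    text = printable_chars(data)
    return {
        "zero_skip": skip_bigrams(text, skip=0),
        "skip_1": skip_bigrams(text, skip=1),
        "entropy": indicator_bigrams(entropy_indicators(data)),
    }
```

Calling `extract_features(open(path, "rb").read())` on a file yields three count tables. In a supervised setup such as the one the claim describes, these counts would be flattened into a fixed-length vector and paired with the file's malware label before training; that flattening step and the choice of classifier are left out of this sketch.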
