Generation and use of trained file classifiers for malware detection

  • US 10,062,038 B1
  • Filed: 05/31/2017
  • Issued: 08/28/2018
  • Est. Priority Date: 05/01/2017
  • Status: Active Grant
First Claim

1. A computing device comprising:

  • a memory configured to store instructions to generate a file classifier; and

    a processor configured to execute the instructions from the memory to perform operations comprising:

    accessing information identifying multiple files and identifying classification data for the multiple files, wherein the classification data indicates, for a particular file of the multiple files, whether the particular file includes malware;

    generating a sequence of entropy indicators for each of the multiple files, each entropy indicator of the sequence of entropy indicators for the particular file corresponding to a chunk of the particular file;

    generating n-gram vectors for the multiple files, wherein an n-gram vector for the particular file indicates occurrences of groups of entropy indicators in the sequence of entropy indicators for the particular file; and

    generating and storing a file classifier using the n-gram vectors and the classification data as supervised training data, wherein the supervised training data includes a plurality of n-gram vectors for each file, the plurality of n-gram vectors for at least one file including a zero-skip n-gram vector indicating occurrences of groups of adjacent entropy indicators, and including at least one skip n-gram vector indicating occurrences of groups of non-adjacent entropy indicators.
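
The feature-extraction steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not specify a chunk size, how entropy values are mapped to indicators, the value of n, or the skip distance, so the 256-byte chunks, four equal-width entropy bins, bigrams (n = 2), and skip of one position below are all assumptions, and every function name is hypothetical. Training is left out: the concatenated vector, paired with each file's malware label, could be passed to any supervised learner.

```python
import math
from collections import Counter
from itertools import product


def chunk_entropies(data, chunk_size=256):
    """Shannon entropy, in bits per byte, of each chunk of the file.

    chunk_size is an assumed value; the claim only says "a chunk".
    """
    entropies = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        total = len(chunk)
        counts = Counter(chunk)
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies


def entropy_indicators(entropies, bins=4, max_entropy=8.0):
    """Quantize each entropy value into one of `bins` equal-width bins.

    Equal-width binning is an assumption made for illustration.
    """
    width = max_entropy / bins
    return [min(int(h / width), bins - 1) for h in entropies]


def skip_ngram_vector(indicators, n=2, skip=0, bins=4):
    """Count groups of n indicators whose members are skip + 1 positions apart.

    skip=0 counts adjacent indicators (a zero-skip n-gram vector);
    skip>=1 counts groups of non-adjacent indicators (a skip n-gram vector).
    The result has one slot per possible group, in lexicographic order.
    """
    step = skip + 1
    span = (n - 1) * step
    counts = Counter(
        tuple(indicators[i + k * step] for k in range(n))
        for i in range(len(indicators) - span)
    )
    return [counts.get(group, 0) for group in product(range(bins), repeat=n)]


def file_features(data, chunk_size=256, bins=4):
    """Concatenate a zero-skip and a skip-1 bigram vector for one file."""
    indicators = entropy_indicators(chunk_entropies(data, chunk_size), bins)
    return (skip_ngram_vector(indicators, 2, 0, bins)
            + skip_ngram_vector(indicators, 2, 1, bins))
```

For example, a file made of 256 zero bytes, then the 256 distinct byte values, then 256 more zero bytes produces the indicator sequence `[0, 3, 0]`. Its zero-skip vector counts the adjacent pairs (0, 3) and (3, 0), and its skip-1 vector counts the single non-adjacent pair (0, 0).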
