System and method for clustering files and assigning a maliciousness property based on clustering

  • US 10,546,143 B1
  • Filed: 11/14/2017
  • Issued: 01/28/2020
  • Est. Priority Date: 08/10/2017
  • Status: Active Grant
First Claim

1. A system, comprising:

  • an interface configured to receive a file;

    a processor configured to:

    transform file contents using a space-filling curve;

    down-sample the transformed file contents to generate a sample locus;

    perform a hashing operation on the sample locus and assign a cluster identifier to the file based at least in part on a result of the hashing operation;

    in response to a determination that the cluster identifier is not present in a data store, determine a set of candidate nearest neighbors for the cluster identifier;

    for each candidate nearest neighbor included in the set of candidate nearest neighbors, determine a set of existing cluster identifiers present in the data store;

    for each existing cluster identifier included in the set of existing cluster identifiers, determine a set of member loci;

    determine an edit distance between the sample locus and each of the respective loci in the set of member loci; and

    in response to a determination that at least a first locus included in the set of member loci is within a threshold edit distance of the sample locus, assign one or more properties to the file based at least in part on properties associated with the first locus, wherein at least one property assigned to the file is an indicator of maliciousness; and

    a memory coupled to the processor and configured to provide the processor with instructions.
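
The processor steps recited in claim 1 can be sketched roughly as follows. This is an illustrative toy, not the patented implementation: the choice of a Hilbert curve, the 16x16 grid, block averaging as the down-sampling step, the mean-threshold bit signature standing in for the hashing operation, single-bit-flip neighbors, Levenshtein distance as the edit distance, the dictionary data store, and every function name are assumptions made for the example.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order square grid."""
    x = y = 0
    t, s, n = d, 1, 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:              # rotate/flip the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def sample_locus(data, order=4, factor=4):
    """Transform bytes with the curve, then down-sample by block averaging."""
    n = 1 << order
    grid = [[0] * n for _ in range(n)]
    for d in range(n * n):
        x, y = hilbert_d2xy(order, d)
        # Assumption: zero-pad short inputs, ignore bytes beyond the grid.
        grid[y][x] = data[d] if d < len(data) else 0
    m = n // factor
    locus = []
    for by in range(m):
        for bx in range(m):
            block = [grid[by * factor + j][bx * factor + i]
                     for j in range(factor) for i in range(factor)]
            locus.append(sum(block) // len(block))
    return tuple(locus)


def cluster_id(locus):
    """Assumed hashing step: one bit per cell, set when the cell exceeds the mean."""
    avg = sum(locus) / len(locus)
    return sum(1 << i for i, v in enumerate(locus) if v > avg)


def candidate_neighbors(cid, bits):
    """Assumed neighbor rule: identifiers that differ from cid in one bit."""
    return [cid ^ (1 << i) for i in range(bits)]


def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def classify(data, store, threshold=2):
    """Return (cluster id, inherited properties) for a file's bytes.

    store maps cluster id -> list of (member locus, properties) pairs.
    Only the branch in the claim (identifier absent from the store) is handled.
    """
    locus = sample_locus(data)
    cid = cluster_id(locus)
    if cid not in store:
        for neighbor in candidate_neighbors(cid, len(locus)):
            for member_locus, props in store.get(neighbor, []):
                if edit_distance(locus, member_locus) <= threshold:
                    return cid, dict(props)
    return cid, {}
```

In this sketch, a file whose locus differs from a known malicious member in a single cell lands in a neighboring cluster identifier and inherits that member's `malicious` property, while a file with no nearby member receives no properties.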
