Internal malware data item clustering and analysis

  • US 9,344,447 B2
  • Filed: 09/15/2014
  • Issued: 05/17/2016
  • Est. Priority Date: 07/03/2014
  • Status: Active Grant
First Claim

1. A computer system for protecting a computer network from malware by providing for the efficient analysis of large amounts of malware-related data, the computer system comprising:

  • one or more computer readable storage devices configured to store:

    a plurality of computer executable instructions;

    one or more software modules including computer executable instructions, the one or more software modules including a cluster engine module, a user interface engine module and a workflow engine module;

    a plurality of data clustering strategies based on rules generated by the cluster engine module; and

    a plurality of data cluster types, each data cluster type of the plurality of data cluster types associated with a data clustering strategy;

    one or more cluster data sources configured to store:

    a plurality of data items including at least:

    file data items, each file data item associated with at least one suspected malware file; and

    malware-related data items associated with captured communications between an internal network and an external network, the malware-related data items including at least one of:

    external Internet Protocol addresses, external domains, external computing devices, internal Internet Protocol addresses, internal computing devices, users of particular computing devices, or organizational positions associated with users of particular computing devices; and

    one or more hardware computer processors in communication with the one or more computer readable storage devices and the one or more cluster data sources, and configured to execute the one or more software modules in order to cause the computer system to:

    designate, by the cluster engine module, one or more seeds by:

    accessing, from the one or more cluster data sources, the file data items;

    calculating, for each file data item of the file data items, at least one of a hash of the file data item or a hash of an executed file data item, wherein the executed file data item was generated by an execution of the file data item in a sandboxed environment; and

    identifying one or more file data items based at least in part on comparing the at least one hash of the file data item or the executed file data item with a malware threat list of hashes, and designating each of the identified one or more file data items as a seed;

    for each of the file data items designated as a seed:

    select, by the cluster engine module, a particular data clustering strategy;

    identify, by the cluster engine module, one or more malware-related data items determined to be associated with the designated file data item seed based at least on the particular data clustering strategy, wherein the particular data clustering strategy performs at least one of querying the one or more cluster data sources or scanning network traffic to determine at least one of:

    external Internet Protocol addresses associated with the designated file data item seed, external domains associated with the designated file data item seed, external computing devices associated with the designated file data item seed, internal Internet Protocol addresses associated with the designated file data item seed, internal computing devices associated with the designated file data item seed, users of particular computing devices associated with the designated file data item seed, or organizational positions associated with the determined users of particular computing devices;

    generate, by the cluster engine module, a data item cluster based at least on the designated file data item seed, wherein generating the data item cluster comprises:

    adding the designated file data item seed to the data item cluster;

    identifying one or more of the network indicators that are associated with the seed;

    identifying one or more of the network-related data items associated with at least one of the identified one or more of the network indicators;

    adding, to the data item cluster, the identified one or more malware-related data items;

    identifying an additional one or more data items, including file data items and/or malware-related data items, associated with any data items of the data item cluster;

    adding, to the data item cluster, the additional one or more data items; and

    storing the one or more data item clusters, generating by the user interface engine module at least one human-readable conclusion associated with at least one generated data item cluster, wherein generating the at least one human-readable conclusion comprises:

    determining a particular data cluster type from the plurality of cluster types based at least on the particular data clustering strategy;

    identifying one or more human-readable templates comprising pre-generated text, wherein the human-readable templates are based at least on predefined associations between respective human-readable templates and data cluster types;

    automatically analyzing the data item cluster to generate summary data according to rules, scoring algorithms, or other criteria; and

    populating the identified one or more human-readable templates with data from the at least one generated data item cluster or summary data of the at least one generated data item cluster;

    cause presentation, by the user interface engine module, of the at least one generated data item cluster and the at least one human-readable conclusion including the populated pre-generated text data, in a user interface of a client computing device; and

    generate, by the workflow engine and the user interface engine, an interactive workflow process to allow the user to perform at least one of:

    select new seeds, operate on existing seeds, generate new data clusters, or regenerate existing clusters.
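
For illustration only, the three main steps recited in the claim (designating seeds by comparing file hashes with a threat list, growing a data item cluster from each seed, and populating a pre-generated text template) might be sketched in Python as follows. This is not the patented implementation. Every name, data value, and template below (THREAT_HASHES, FILES, LINKS, TEMPLATES, the "internal_malware" cluster type) is a hypothetical stand-in, SHA-256 is an assumed choice of hash, the sandboxed-execution hash is omitted, and the cluster is simply expanded to everything reachable from the seed rather than according to a selected clustering strategy.

```python
import hashlib
from collections import deque
from string import Template

# Hypothetical threat list: SHA-256 digests of known-bad files.
THREAT_HASHES = {hashlib.sha256(b"evil payload").hexdigest()}

# Hypothetical file data items: file name -> raw bytes.
FILES = {
    "invoice.exe": b"evil payload",
    "notes.txt": b"harmless text",
}

# Hypothetical associations between data items, standing in for the results
# of querying cluster data sources or scanning network traffic.
LINKS = {
    "invoice.exe": ["203.0.113.7", "host-42"],
    "203.0.113.7": ["bad.example"],
    "host-42": ["user:alice"],
    "notes.txt": ["host-17"],
}

# Hypothetical pre-generated text, keyed by data cluster type.
TEMPLATES = {
    "internal_malware": Template(
        "Seed $seed is linked to $count related data items, including $sample."
    ),
}


def designate_seeds(files, threat_hashes):
    """Return names of files whose SHA-256 hash appears on the threat list."""
    return sorted(
        name
        for name, data in files.items()
        if hashlib.sha256(data).hexdigest() in threat_hashes
    )


def build_cluster(seed, links):
    """Add the seed, then breadth-first add every item associated with it."""
    cluster = [seed]
    seen = {seed}
    queue = deque([seed])
    while queue:
        item = queue.popleft()
        for neighbor in links.get(item, []):
            if neighbor not in seen:
                seen.add(neighbor)
                cluster.append(neighbor)
                queue.append(neighbor)
    return cluster


def conclude(cluster, cluster_type, templates):
    """Populate the template for this cluster type with simple summary data."""
    seed, related = cluster[0], cluster[1:]
    return templates[cluster_type].substitute(
        seed=seed,
        count=len(related),
        sample=related[0] if related else "none",
    )


if __name__ == "__main__":
    for seed in designate_seeds(FILES, THREAT_HASHES):
        cluster = build_cluster(seed, LINKS)
        print(conclude(cluster, "internal_malware", TEMPLATES))
```

In this sketch only invoice.exe matches the threat list, so only one cluster is built; notes.txt and its associated host are never pulled in because nothing links them to a seed.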
