External malware data item clustering and analysis
First Claim
1. A computer system comprising:
- one or more computer readable storage devices configured to store:
a plurality of computer executable instructions;
a data clustering strategy; and
a plurality of data items including at least:
external domain data items; and
network-related data items associated with captured communications between an internal network and an external network, the network-related data items including at least one of:
external Internet Protocol addresses, external domains, identifiers corresponding to external computerized devices, internal Internet Protocol addresses, identifiers corresponding to internal computerized devices, identifiers corresponding to users of particular computerized devices, or organizational positions associated with users of particular computerized devices; and
one or more hardware computer processors in communication with the one or more computer readable storage devices and configured to execute the plurality of computer executable instructions in order to cause the computer system to:
scan one or more threat lists stored external to the internal network, each of the threat lists including information related to previously identified malware threats and information related to those previously identified malware threats including external domain data items;
identify one or more external domain data items included in the one or more threat lists, each of the one or more external domain data items being associated with a malicious domain;
designate each of the identified one or more external domain data items as a seed;
for each of the designated seeds, generate a data item cluster based on the data clustering strategy by at least:
adding the seed to the data item cluster;
identifying one or more of the network-related data items associated with the seed;
adding, to the data item cluster, the one or more identified network-related data items;
identifying an additional one or more data items, including external domain data items and/or network-related data items, associated with any data items of the data item cluster; and
adding, to the data item cluster, the additional one or more data items;
determine to regenerate a particular data item cluster;
regenerate the particular data item cluster by at least:
identifying new one or more data items, including external domain data items and/or network-related data items, associated with any data items of the particular data item cluster, wherein the new one or more data items were not present in the particular data item cluster as initially generated; and
adding, to the particular data item cluster, the new one or more data items;
access a plurality of data item clusters including at least one of the data item cluster or the particular data item cluster, wherein the plurality of data item clusters include data items associated with malware threats;
generate alert scores for at least some of the plurality of data item clusters according to one or more scoring strategies, wherein the alert scores indicate criticalities of the malware threats represented by the plurality of data item clusters; and
cause presentation, in a user interface, of at least a visualization including alerts for at least one of the plurality of data item clusters based on the alert scores, wherein the alerts visually indicate the criticalities of the malware threats represented by the plurality of data item clusters.
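The steps recited in claim 1 (seed designation, cluster generation, regeneration, and alert scoring) can be illustrated with a minimal Python sketch. It is not the patented implementation. The association table, the breadth-first expansion used as the "data clustering strategy", the `10.` prefix used to recognize internal addresses, and the count-based score are all assumptions chosen for illustration, as are the function names.

```python
from collections import deque

# Hypothetical association table: each data item maps to the items linked to it.
# All names and values here are illustrative, not taken from the patent.
ASSOCIATIONS = {
    "bad.example": {"203.0.113.7", "10.0.0.5"},
    "203.0.113.7": {"bad.example"},
    "10.0.0.5": {"bad.example", "user:alice"},
    "user:alice": {"10.0.0.5"},
}


def find_seeds(threat_lists):
    """Designate every external domain found on any threat list as a seed."""
    return sorted({domain for entries in threat_lists for domain in entries})


def _expand(cluster, frontier, associations):
    """Add items associated with any cluster member (assumed breadth-first strategy)."""
    queue = deque(frontier)
    while queue:
        item = queue.popleft()
        for neighbor in associations.get(item, ()):
            if neighbor not in cluster:
                cluster.add(neighbor)
                queue.append(neighbor)
    return cluster


def generate_cluster(seed, associations):
    """Start a cluster from the seed and grow it through associated items."""
    return _expand({seed}, [seed], associations)


def regenerate_cluster(cluster, associations):
    """Grow an existing cluster in place; return only the newly added items."""
    before = set(cluster)
    _expand(cluster, list(before), associations)
    return cluster - before


def alert_score(cluster, internal_prefix="10."):
    """Assumed scoring strategy: count the internal addresses in the cluster."""
    return sum(1 for item in cluster if item.startswith(internal_prefix))


def ranked_alerts(clusters):
    """Order (seed, score) pairs from most to least critical for display."""
    scored = ((seed, alert_score(items)) for seed, items in clusters.items())
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In this sketch, regeneration reuses the same expansion routine against an updated association table, so only items absent from the initially generated cluster are reported as new.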
Abstract
Embodiments of the present disclosure relate to a data analysis system that may automatically generate memory-efficient clustered data structures, automatically analyze those clustered data structures, and provide results of the automated analysis in an optimized way to an analyst. The automated analysis of the clustered data structures (also referred to herein as data clusters) may include an automated application of various criteria or rules so as to generate a compact, human-readable analysis of the data clusters. The human-readable analyses (also referred to herein as “summaries” or “conclusions”) of the data clusters may be organized into an interactive user interface so as to enable an analyst to quickly navigate among information associated with various data clusters and efficiently evaluate those data clusters in the context of, for example, a fraud investigation. Embodiments of the present disclosure also relate to automated scoring of the clustered data structures.
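The automated application of rules to produce human-readable "conclusions" might look like the short Python sketch below. The rule format (a label paired with a predicate), the item prefixes, and the wording of the generated sentences are assumptions made for illustration. The disclosure does not specify them here.

```python
# Hypothetical rules: each pairs a human-readable label with a predicate on a data item.
RULES = [
    ("internal host(s) contacted a listed domain", lambda item: item.startswith("10.")),
    ("user account(s) involved", lambda item: item.startswith("user:")),
]


def summarize(cluster, rules):
    """Apply each rule to the cluster and emit one conclusion per rule that matches."""
    conclusions = []
    for label, predicate in rules:
        count = sum(1 for item in cluster if predicate(item))
        if count:
            conclusions.append(f"{count} {label}")
    return conclusions
```

Rules that match nothing produce no line, which keeps the resulting summary compact.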
22 Claims
Specification