Malware data item analysis
First Claim
1. A computer system comprising:
one or more computer readable storage devices configured to store:
a plurality of computer executable instructions;
a plurality of file data items and submission data items, each submission data item associated with at least one file data item, each file data item comprising a suspected malware data item, each submission data item further including indications of at least:
a filename of an associated file data item that was submitted, a date the associated file data item was submitted, and an identifier of the person who submitted the associated file data item; and
a graph comprising nodes and edges, each of the nodes representing at least one of a file data item, a submission data item, an analysis data item, or another type of data item, each of the edges indicating an association between two of the nodes; and
one or more hardware computer processors in communication with the one or more computer readable storage devices and configured to execute the plurality of computer executable instructions in order to cause the computer system to automatically:
in response to receiving a first file data item:
determine whether the received first file data item was previously received by comparing the received first file data item to the plurality of file data items; and
generate a first submission data item;
in response to determining that the first file data item was not previously received:
initiate an analysis of the first file data item, wherein the analysis of the first file data item generates analysis information items, wherein initiating the analysis of the first file data item comprises:
initiating an internal analysis of the first file data item including at least calculation of a hash of the file data item; and
initiating an external analysis of the first file data item by one or more third party analysis systems;
associate the analysis information items with the first file data item; and
associate the first submission data item with the first file data item;
in response to receiving a second file data item:
determine whether the received second file data item was previously received by comparing the received second file data item to the plurality of file data items; and
generate a second submission data item;
in response to determining that the second file data item matches the first data item that was previously received, associate the second submission data item with the first file data item that was previously received; and
generate a user interface including one or more user selectable portions presenting various of the analysis information items associated with the first file data item, the user interface useable by an analyst to determine one or more characteristics of the first file data item, the one or more user selectable portions including a first selectable element, the first selectable element configured to cause, in response to an analyst input selecting the first selectable element, a generation of a graphical visualization including at least:
a first graphical representation of a first node representing the first file data item,
a second graphical representation of a second node representing the first submission data item,
a third graphical representation of an edge connecting the first and second graphical representations and representing the association between the first file data item and the first submission data item,
a fourth graphical representation of a third node representing the second submission data item, and
a fifth graphical representation of a second edge connecting the first and fourth graphical representations and representing the association between the first file data item and the second submission data item.
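The receive, compare, analyze, and associate steps recited in this claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Store` class, the `receive` and `external_analysis` functions, the sample filenames, dates, and submitter names are all hypothetical, and the sketch assumes that two files "match" when their SHA-256 digests are equal, which the claim does not specify.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Store:
    """Hypothetical in-memory stand-in for the claimed storage devices."""
    files: dict = field(default_factory=dict)        # file id -> analysis information items
    submissions: dict = field(default_factory=dict)  # submission id -> submission details
    edges: list = field(default_factory=list)        # (file id, submission id) associations


def external_analysis(data: bytes) -> dict:
    """Placeholder for a third party analysis system; returns no findings."""
    return {}


def receive(store: Store, data: bytes, filename: str, submitted: date, submitter: str) -> str:
    """Record one submission and analyze the file only if it is new."""
    # Assumption: files are compared by SHA-256 digest.
    file_id = hashlib.sha256(data).hexdigest()
    previously_received = file_id in store.files

    # A submission data item is generated for every received file.
    submission_id = f"submission-{len(store.submissions) + 1}"
    store.submissions[submission_id] = {
        "filename": filename,
        "date": submitted,
        "submitter": submitter,
    }

    if not previously_received:
        # Internal analysis (a hash) plus external analysis, then association.
        info = {"sha256": file_id}
        info.update(external_analysis(data))
        store.files[file_id] = info

    # New or repeated, the submission is linked to the single file node.
    store.edges.append((file_id, submission_id))
    return file_id


store = Store()
first = receive(store, b"suspicious bytes", "invoice.exe", date(2014, 7, 3), "analyst-a")
second = receive(store, b"suspicious bytes", "renamed.bin", date(2014, 7, 4), "analyst-b")
```

In this sketch the second call finds the file already stored, so no new analysis runs and both submissions end up connected to the same file identifier.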
Abstract
Embodiments of the present disclosure relate to a data analysis system that may automatically analyze a suspected malware file, or group of files. Automatic analysis of the suspected malware file(s) may include one or more automatic analysis techniques. Automatic analysis may include production and gathering of various items of information related to the suspected malware file(s) including, for example, calculated hashes, file properties, academic analysis information, file execution information, third-party analysis information, and/or the like. The analysis information may be automatically associated with the suspected malware file(s), and a user interface may be generated in which the various analysis information items are presented to a human analyst such that the analyst may quickly and efficiently evaluate the suspected malware file(s). For example, the analyst may quickly determine one or more characteristics of the suspected malware file(s), whether or not the file(s) is malware, and/or a threat level of the file(s).
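As a rough illustration of the "calculated hashes" and "file properties" mentioned above, the following Python sketch gathers a few such items for one file. The function name `internal_analysis` and the choice of SHA-1, SHA-256, and byte size are assumptions made for the example; the disclosure does not state which hashes or properties are collected.

```python
import hashlib


def internal_analysis(data: bytes) -> dict:
    """Collect a few analysis information items for one file (hypothetical selection)."""
    return {
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
    }


items = internal_analysis(b"abc")
```

A dictionary like `items` could then be associated with the file and shown to an analyst alongside information returned by other analysis sources.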
17 Claims
8. A computer-implemented method comprising:
storing on one or more computer readable storage devices:
a plurality of computer executable instructions;
a plurality of file data items and submission data items, each submission data item associated with at least one file data item, each file data item comprising a suspected malware data item, each submission data item further including indications of at least:
a filename of an associated file data item that was submitted, a date the associated file data item was submitted, and an identifier of the person who submitted the associated file data item; and
a graph comprising nodes and edges, each of the nodes representing at least one of a file data item, a submission data item, an analysis data item, or another type of data item, each of the edges indicating an association between two of the nodes;
in response to receiving a first file data item:
determining, by one or more hardware computer devices configured with specific computer executable instructions, whether the received first file data item was previously received by comparing the received first file data item to the plurality of file data items; and
generating, by the one or more hardware computer devices, a first submission data item; and
in response to determining that the first file data item was not previously received:
initiating, by the one or more hardware computer devices, an analysis of the first file data item, wherein the analysis of the first file data item generates analysis information items, wherein the initiating analysis of the first file data item comprises initiating an internal analysis of the first file data item including at least calculation of a hash of the file data item;
associating, by the one or more hardware computer devices, the analysis information items with the first file data item; and
associating, by the one or more hardware computer devices, the first submission data item with the first file data item;
in response to receiving a second file data item:
determining, by the one or more hardware computer devices, whether the received second file data item was previously received by comparing the received second file data item to the plurality of file data items; and
generating, by the one or more hardware computer devices, a second submission data item;
in response to determining that the second file data item matches the first data item that was previously received, associating, by the one or more hardware computer devices, the second submission data item with the first file data item that was previously received; and
generating, by the one or more hardware computer devices, a user interface including one or more user selectable portions presenting various of the analysis information items associated with the first file data item, the user interface usable by an analyst to determine one or more characteristics of the first file data item, the one or more user selectable portions including a first selectable element, the first selectable element configured to cause, in response to an analyst input selecting the first selectable element, a generation of a graphical visualization including at least:
a first graphical representation of a first node representing the first file data item,
a second graphical representation of a second node representing the first submission data item,
a third graphical representation of an edge connecting the first and second graphical representations and representing the association between the first file data item and the first submission data item,
a fourth graphical representation of a third node representing the second submission data item, and
a fifth graphical representation of a second edge connecting the first and fourth graphical representations and representing the association between the first file data item and the second submission data item.
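The graphical visualization recited at the end of this claim (one file node, two submission nodes, and two edges) could be rendered in many ways. The sketch below assumes Graphviz DOT text as the output format, which the claim does not mention; the function `to_dot` and the node names are hypothetical.

```python
def to_dot(file_id: str, submission_ids: list[str]) -> str:
    """Build DOT text with one file node and one edge per submission node."""
    lines = ["graph submissions {"]
    lines.append(f'  "{file_id}" [shape=box];')
    for submission_id in submission_ids:
        lines.append(f'  "{submission_id}" [shape=ellipse];')
        # Each undirected edge represents a file-to-submission association.
        lines.append(f'  "{file_id}" -- "{submission_id}";')
    lines.append("}")
    return "\n".join(lines)


dot = to_dot("file-1", ["submission-1", "submission-2"])
```

Passing `dot` to a Graphviz renderer would draw the file as a box connected to each of its submissions.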
11. A non-transitory computer-readable storage medium storing software instructions that, in response to execution by a computer system having one or more hardware processors, configure the computer system to perform operations comprising:
storing on one or more computer readable storage devices:
a plurality of computer executable instructions;
a plurality of file data items and submission data items, each submission data item associated with at least one file data item, each file data item comprising a suspected malware data item, each submission data item further including indications of at least:
a filename of an associated file data item that was submitted, a date the associated file data item was submitted, and an identifier of the person who submitted the associated file data item; and
a graph comprising nodes and edges, each of the nodes representing at least one of a file data item, a submission data item, an analysis data item, or another type of data item, each of the edges indicating an association between two of the nodes;
in response to receiving a first file data item:
determining whether the received first file data item was previously received by comparing the received first file data item to the plurality of file data items; and
generating a first submission data item; and
in response to determining that the first file data item was not previously received:
initiating an analysis of the first file data item, wherein the analysis of the first file data item generates analysis information items, wherein the initiating analysis of the first file data item comprises initiating an internal analysis of the first file data item including at least calculation of a hash of the file data item;
associating the analysis information items with the first file data item; and
associating the first submission data item with the first file data item;
in response to receiving a second file data item:
determining whether the received second file data item was previously received by comparing the received second file data item to the plurality of file data items; and
generating a second submission data item;
in response to determining that the second file data item matches the first data item that was previously received, associating the second submission data item with the first file data item that was previously received; and
generating a user interface including one or more user selectable portions presenting various of the analysis information items associated with the first file data item, the user interface usable by an analyst to determine one or more characteristics of the first file data item, the one or more user selectable portions including a first selectable element, the first selectable element configured to cause, in response to an analyst input selecting the first selectable element, a generation of a graphical visualization including at least:
a first graphical representation of a first node representing the first file data item,
a second graphical representation of a second node representing the first submission data item,
a third graphical representation of an edge connecting the first and second graphical representations and representing the association between the first file data item and the first submission data item,
a fourth graphical representation of a third node representing the second submission data item, and
a fifth graphical representation of a second edge connecting the first and fourth graphical representations and representing the association between the first file data item and the second submission data item.
Specification