Automatically determining whether malware samples are similar
Abstract
Techniques for automatically determining whether malware samples are similar are disclosed. In some embodiments, a system, process, and/or computer program product for automatically determining whether malware samples are similar includes receiving a plurality of samples for performing automated malware analysis to generate log files based on the automated malware analysis; comparing the log files based on the automated malware analysis; determining whether any of the plurality of samples are similar based on the comparison of the log files based on the automated malware analysis; and performing an action based on determining that at least two samples are similar.
20 Claims
1. A computer-implemented method, comprising:
receiving a plurality of samples for performing automated malware analysis to generate log files based on the automated malware analysis;
processing the log files to determine artifacts associated with malware, wherein a raw log file generated for each of the plurality of samples comprises one or more lines based on results of the automated malware analysis for each of the plurality of samples, and wherein processing the log files to determine artifacts associated with malware further comprises:
processing the raw log files for each of the plurality of samples to generate processed log files, wherein each of the processed log files provides a human readable format of the automated malware analysis; and
identifying distinct lines in each of the processed log files; and
comparing the processed log files based on the automated malware analysis based on a threshold comparison of a textual representation of one or more artifacts;
determining whether any of the plurality of samples are similar based on comparing the processed log files based on the automated malware analysis; and
performing an action based on determining that at least two samples are similar.
Dependent claims: 2, 3, 4, 5, 6, 7
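The log-processing steps recited in claim 1 (turning raw log lines into a human readable form, then identifying the distinct lines) could be sketched roughly as follows. This is only an illustration under stated assumptions: the claim does not define a raw log format, so the pipe-delimited layout, the `ACTION_PHRASES` table, and the function names `process_raw_log` and `distinct_lines` are all hypothetical.

```python
# Hypothetical raw log format: "<timestamp>|<category>|<action>|<target>".
# The claim does not specify any format; this one is assumed for illustration.
ACTION_PHRASES = {
    ("file", "create"): "created file",
    ("registry", "set"): "set registry key",
    ("network", "connect"): "connected to",
}


def process_raw_log(raw_lines):
    """Convert raw analysis lines into human-readable lines.

    Timestamps are dropped so that the same behavior observed at
    different times yields the same processed line.
    """
    processed = []
    for line in raw_lines:
        parts = line.strip().split("|")
        if len(parts) != 4:
            continue  # skip lines that do not match the assumed format
        _timestamp, category, action, target = parts
        # Fall back to "<category> <action>" for pairs not in the table.
        phrase = ACTION_PHRASES.get((category, action), f"{category} {action}")
        processed.append(f"{phrase} {target}")
    return processed


def distinct_lines(processed_lines):
    """Return the distinct lines, preserving first-seen order."""
    return list(dict.fromkeys(processed_lines))


raw = [
    "1001|file|create|dropper.tmp",
    "1002|registry|set|Run/updater",
    "1003|file|create|dropper.tmp",
    "not a valid line",
]
processed = process_raw_log(raw)
unique = distinct_lines(processed)
print(unique)
```

Dropping the timestamp before deduplication is a design choice of this sketch, not something the claim requires; it simply makes repeated behavior collapse to a single distinct line.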
8. A system, comprising:
a processor configured to:
receive a plurality of samples for performing automated malware analysis to generate log files based on the automated malware analysis;
process the log files to determine artifacts associated with malware, wherein a raw log file generated for each of the plurality of samples comprises one or more lines based on results of the automated malware analysis for each of the plurality of samples, and wherein process the log files to determine artifacts associated with malware further comprises:
process the raw log files for each of the plurality of samples to generate processed log files, wherein each of the processed log files provides a human readable format of the automated malware analysis; and
identify distinct lines in each of the processed log files; and
compare the processed log files based on the automated malware analysis;
determine whether any of the plurality of samples are similar based on comparing the processed log files based on the automated malware analysis based on a threshold comparison of a textual representation of one or more artifacts; and
perform an action based on determining that at least two samples are similar; and
a memory coupled to the processor and configured to provide the processor with instructions.
Dependent claims: 9, 10, 11, 12, 13, 14
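The "threshold comparison of a textual representation of one or more artifacts" in claim 8 can be illustrated with a small sketch. The claim does not name a similarity measure, so the Jaccard index over sets of distinct processed lines is an assumption made here, as are the function names and the 0.5 threshold.

```python
def jaccard(lines_a, lines_b):
    """Jaccard index of two collections of processed log lines.

    Assumed metric: |A intersect B| / |A union B| over distinct lines.
    """
    set_a, set_b = set(lines_a), set(lines_b)
    if not set_a and not set_b:
        return 0.0  # two empty logs carry no evidence of similarity
    return len(set_a & set_b) / len(set_a | set_b)


def are_similar(lines_a, lines_b, threshold=0.5):
    """Threshold comparison: similar when the score meets the threshold."""
    return jaccard(lines_a, lines_b) >= threshold


# Hypothetical processed log lines for three samples.
sample_a = [
    "created file dropper.tmp",
    "set registry key Run/updater",
    "connected to example.test:443",
]
sample_b = [
    "created file dropper.tmp",
    "set registry key Run/updater",
    "connected to other.test:80",
]
sample_c = ["created file notes.txt"]

print(jaccard(sample_a, sample_b))      # 2 shared lines out of 4 distinct
print(are_similar(sample_a, sample_b))
print(are_similar(sample_a, sample_c))
```

Comparing whole lines as text keeps the sketch close to the claim language; a real system might weight artifact types differently or compare only selected artifacts.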
15. A computer program product, the computer program product being embodied in a non-transitory tangible computer readable storage medium and comprising computer instructions for:
receiving a plurality of samples for performing automated malware analysis to generate log files based on the automated malware analysis;
processing the log files to determine artifacts associated with malware, wherein a raw log file generated for each of the plurality of samples comprises one or more lines based on results of the automated malware analysis for each of the plurality of samples, and wherein processing the log files to determine artifacts associated with malware further comprises:
processing the raw log files for each of the plurality of samples to generate processed log files, wherein each of the processed log files provides a human readable format of the automated malware analysis; and
identifying distinct lines in each of the processed log files; and
comparing the processed log files based on the automated malware analysis based on a threshold comparison of a textual representation of one or more artifacts;
determining whether any of the plurality of samples are similar based on comparing the processed log files based on the automated malware analysis; and
performing an action based on determining that at least two samples are similar.
Dependent claims: 16, 17, 18, 19, 20
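The final two steps of the independent claims (determining whether any samples are similar, then performing an action when at least two are) might look like the sketch below. Everything specific here is an assumption: the overlap score, the 0.5 threshold, the transitive grouping of similar pairs, and the tagging action are hypothetical choices, since the claims leave both the grouping strategy and the action open.

```python
from itertools import combinations


def overlap_score(lines_a, lines_b):
    """Assumed score: shared distinct lines divided by all distinct lines."""
    set_a, set_b = set(lines_a), set(lines_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def group_similar(samples, threshold=0.5):
    """Group sample ids whose processed logs meet the threshold.

    `samples` maps a sample id to its processed log lines. Similar pairs
    are merged transitively (union-find), which is a choice of this sketch.
    """
    parent = {sid: sid for sid in samples}

    def find(sid):
        while parent[sid] != sid:
            parent[sid] = parent[parent[sid]]  # path halving
            sid = parent[sid]
        return sid

    for a, b in combinations(samples, 2):
        if overlap_score(samples[a], samples[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for sid in samples:
        groups.setdefault(find(sid), []).append(sid)
    # Only groups with at least two samples count as "similar".
    return [sorted(members) for members in groups.values() if len(members) >= 2]


def perform_action(groups):
    """Hypothetical action: tag each group with a shared family label."""
    tags = {}
    for index, group in enumerate(groups, start=1):
        for sid in group:
            tags[sid] = f"family-{index}"
    return tags


samples = {
    "s1": ["created file dropper.tmp", "set registry key Run/updater"],
    "s2": ["created file dropper.tmp", "set registry key Run/updater",
           "connected to example.test:443"],
    "s3": ["created file notes.txt"],
}
groups = group_similar(samples)
tags = perform_action(groups)
print(groups, tags)
```

Pairwise comparison is quadratic in the number of samples, which is acceptable for a sketch but would likely need indexing or bucketing at scale.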
Specification