Systems and methods for detecting malware using file clustering
First Claim
1. A computer-implemented method for detecting malware using file clustering, at least a portion of the method being performed by a computing device comprising at least one processor, the method comprising:
identifying an unknown file with an unknown reputation;
identifying at least one known file with a known reputation that co-occurs with the unknown file;
identifying a classification assigned to the known file;
determining a probability that the unknown file is of the same classification as the known file;
assigning, based on the probability that the unknown file is of the same classification as the known file, the classification of the known file to the unknown file, wherein identifying the unknown file comprises:
obtaining, from at least one client device, information that identifies the unknown file;
querying, using the information that identifies the unknown file, a file reputation database that associates file information with file reputations;
receiving, in response to querying the file reputation database, an indication that the unknown file's reputation is unknown.
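The "identifying the unknown file" sub-steps (obtain identifying information from a client, query a reputation database, receive an "unknown" indication) can be illustrated with a minimal sketch. The claim does not say what the identifying information is or how the database is stored; the SHA-256 digest, the in-memory dictionary, and the names `REPUTATION_DB`, `file_id`, `query_reputation`, and `find_unknown_files` are all assumptions made for illustration.

```python
import hashlib

# Hypothetical stand-in for a file reputation database: maps identifying
# information (assumed here to be a SHA-256 digest) to a reputation label.
REPUTATION_DB = {
    hashlib.sha256(b"known good installer").hexdigest(): "trusted",
    hashlib.sha256(b"known dropper").hexdigest(): "malicious",
}

UNKNOWN = "unknown"


def file_id(content: bytes) -> str:
    """Identifying information for a file (assumption: its SHA-256 digest)."""
    return hashlib.sha256(content).hexdigest()


def query_reputation(identifier: str, db: dict = REPUTATION_DB) -> str:
    """Query the database; a missing entry is reported as an unknown reputation."""
    return db.get(identifier, UNKNOWN)


def find_unknown_files(client_reports, db: dict = REPUTATION_DB) -> set:
    """client_reports: iterable of (client_id, file_identifier) pairs.

    Returns the identifiers whose reputation the database reports as unknown.
    """
    return {fid for _, fid in client_reports if query_reputation(fid, db) == UNKNOWN}
```

A real deployment would presumably query a remote service rather than a dictionary; the sketch only shows the control flow of the three sub-steps.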
Abstract
The disclosed computer-implemented method for detecting malware using file clustering may include (1) identifying a file with an unknown reputation, (2) identifying at least one file with a known reputation that co-occurs with the unknown file, (3) identifying a malware classification assigned to the known file, (4) determining a probability that the unknown file is of the same classification as the known file, and (5) assigning, based on the probability that the unknown file is of the same classification as the known file, the classification of the known file to the unknown file. Various other methods, systems, and computer-readable media are also disclosed.
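The five steps of the abstract can be sketched end to end. The abstract does not specify how co-occurrence is measured or how the probability is computed, so this sketch assumes that two files co-occur when they appear on at least one common client device, and it uses the Jaccard similarity of their device sets as a stand-in for the probability. The function names, the data layout, and the 0.5 threshold are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify_unknown(
    unknown: str,
    devices_by_file: Dict[str, Set[str]],
    classification_by_file: Dict[str, str],
    threshold: float = 0.5,  # assumed cut-off, not taken from the source
) -> Optional[Tuple[str, float]]:
    """Return (classification, probability) for the unknown file, or None.

    devices_by_file maps each file to the client devices it was seen on;
    classification_by_file maps each known file to its classification.
    """
    unknown_devices = devices_by_file[unknown]
    best: Optional[Tuple[str, float]] = None
    for known, label in classification_by_file.items():
        known_devices = devices_by_file.get(known, set())
        if not (unknown_devices & known_devices):
            continue  # the known file does not co-occur with the unknown file
        p = jaccard(unknown_devices, known_devices)
        if best is None or p > best[1]:
            best = (label, p)
    # Assign the known file's classification only if the estimate is high enough.
    if best is not None and best[1] >= threshold:
        return best
    return None
```

For example, an unknown file seen on devices d1, d2, and d3 and a known trojan seen on d1 through d4 share three of four devices, so the sketch assigns the trojan classification with an estimated probability of 0.75.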
17 Claims
1. A computer-implemented method for detecting malware using file clustering, at least a portion of the method being performed by a computing device comprising at least one processor, the method comprising:
identifying an unknown file with an unknown reputation;
identifying at least one known file with a known reputation that co-occurs with the unknown file;
identifying a classification assigned to the known file;
determining a probability that the unknown file is of the same classification as the known file;
assigning, based on the probability that the unknown file is of the same classification as the known file, the classification of the known file to the unknown file, wherein identifying the unknown file comprises:
obtaining, from at least one client device, information that identifies the unknown file;
querying, using the information that identifies the unknown file, a file reputation database that associates file information with file reputations;
receiving, in response to querying the file reputation database, an indication that the unknown file's reputation is unknown.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system for detecting malware using file clustering, the system comprising:
an identification module, stored in memory, that:
identifies an unknown file with an unknown reputation;
identifies at least one known file with a known reputation that co-occurs with the unknown file;
a reputation module, stored in memory, that identifies a classification assigned to the known file;
an evaluation module, stored in memory, that determines a probability that the unknown file is of the same classification as the known file;
a classification module, stored in memory, that assigns, based on the probability that the unknown file is of the same classification as the known file, the classification of the known file to the unknown file;
at least one physical processor configured to execute the identification module, the reputation module, the evaluation module, and the classification module;
wherein the identification module identifies the unknown file by:
obtaining, from at least one additional client device, information that identifies the unknown file;
querying, using the information that identifies the unknown file, a file reputation database that associates file information with file reputations;
receiving, in response to querying the file reputation database, an indication that the unknown file's reputation is unknown.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A non-transitory computer-readable medium comprising one or more computer-readable instructions that, when executed by at least one processor of a computing device, cause the computing device to:
identify an unknown file with an unknown reputation;
identify at least one known file with a known reputation that co-occurs with the unknown file;
identify a classification assigned to the known file;
determine a probability that the unknown file is of the same classification as the known file;
assign, based on the probability that the unknown file is of the same classification as the known file, the classification of the known file to the unknown file,
wherein the one or more computer-readable instructions cause the computing device to determine the probability that the unknown file is of the same classification as the known file by clustering a set of client devices on which the known file occurs and a set of client devices on which the unknown file occurs using at least one hashing function that assigns sets of client devices to clusters according to a client device selected from the set of client devices on which the known file or the unknown file occur.
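A hashing function that assigns a set of client devices to a cluster according to one device selected from that set resembles min-hashing, although the claim does not name a specific technique. The sketch below assumes min-hash: each seeded hash function selects the device with the smallest hash value, two device sets fall into the same cluster under a given hash function when they select the same device, and the fraction of hash functions on which they agree estimates the Jaccard similarity of the two sets. The function names, the use of SHA-256, and the choice of 64 hash functions are hypothetical.

```python
import hashlib
from typing import Set


def _hash(seed: int, device_id: str) -> int:
    """Seeded 64-bit hash of a device identifier (assumption: SHA-256 based)."""
    digest = hashlib.sha256(f"{seed}:{device_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def min_hash_device(devices: Set[str], seed: int) -> str:
    """Select the device with the smallest seeded hash; it acts as the cluster key.

    Assumes a non-empty set. Ties are broken by device identifier so the
    result is deterministic.
    """
    return min(devices, key=lambda d: (_hash(seed, d), d))


def estimate_similarity(a: Set[str], b: Set[str], num_hashes: int = 64) -> float:
    """Fraction of hash functions under which both sets land in the same cluster.

    Under min-hash this fraction approximates the Jaccard similarity of a and b.
    """
    matches = sum(
        min_hash_device(a, seed) == min_hash_device(b, seed)
        for seed in range(num_hashes)
    )
    return matches / num_hashes
```

Identical device sets agree on every hash function and disjoint sets agree on none; partially overlapping sets fall in between, with the estimate tightening as `num_hashes` grows.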
Specification