Systems and methods for detecting malware
First Claim
1. A computer-implemented method for detecting malware, at least a portion of the method being performed by a computing device comprising at least one processor, the method comprising:
identifying a behavioral trace of a program, the behavioral trace comprising a sequence of runtime behaviors exhibited by the program;
dividing the behavioral trace to identify a plurality of n-grams within the behavioral trace, each runtime behavior within the sequence of runtime behaviors corresponding to an n-gram token;
analyzing the plurality of n-grams to generate a feature vector of the behavioral trace comprising:
applying, for each given n-gram in the plurality of n-grams, a feature function to the behavioral trace that describes an occurrence characteristic of the given n-gram within the behavioral trace; and
including a result of the feature function in the feature vector; and
classifying the program based at least in part on the feature vector of the behavioral trace to determine whether the program is malicious;
wherein:
the feature vector comprises a plurality of dimensions, each n-gram within the plurality of n-grams corresponding to a dimension within the plurality of dimensions;
the plurality of n-grams map to the plurality of dimensions according to a non-injective surjection; and
including the result of the feature function in the feature vector comprises aggregating a subset of outputs of the feature function derived from a subset of the plurality of n-grams into a value and assigning the value to a dimension within the plurality of dimensions according to the non-injective surjection.
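The many-to-one mapping and aggregation recited in the last three elements can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the use of an occurrence count as the feature function, the CRC32-modulo mapping, and summation as the aggregation are all assumptions chosen for illustration. A hash taken modulo the number of dimensions is non-injective whenever distinct n-grams outnumber dimensions; it reaches every dimension only when the n-gram set is large enough, so the surjective property is assumed rather than guaranteed here.

```python
import zlib
from collections import Counter


def ngrams(trace, n):
    """Slide a window of n tokens over the trace; each runtime behavior is one token."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def bucket(ngram, num_dims):
    """Hypothetical many-to-one mapping from an n-gram to a dimension index."""
    key = "\x1f".join(ngram).encode("utf-8")
    return zlib.crc32(key) % num_dims


def feature_vector(trace, n=2, num_dims=8):
    """Build a fixed-width vector; n-grams sharing a dimension are summed together."""
    counts = Counter(ngrams(trace, n))  # assumed feature function: occurrence count
    vec = [0] * num_dims
    for gram, count in counts.items():
        vec[bucket(gram, num_dims)] += count  # aggregate colliding n-grams into one value
    return vec


# Hypothetical behavior names, for illustration only.
trace = ["open_file", "write_file", "create_process", "open_file", "write_file"]
print(feature_vector(trace, n=2, num_dims=4))
```

Because several n-grams may land in the same dimension, the vector width stays fixed no matter how many distinct n-grams a trace produces.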
Abstract
A computer-implemented method for detecting malware may include (1) identifying a behavioral trace of a program, the behavioral trace including a sequence of runtime behaviors exhibited by the program, (2) dividing the behavioral trace to identify a plurality of n-grams within the behavioral trace, each runtime behavior within the sequence of runtime behaviors corresponding to an n-gram token, (3) analyzing the plurality of n-grams to generate a feature vector of the behavioral trace, and (4) classifying the program based at least in part on the feature vector of the behavioral trace to determine whether the program is malicious. Various other methods, systems, and computer-readable media are also disclosed.
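The four steps of the abstract can be sketched end to end in Python. This is a sketch under stated assumptions, not the disclosed implementation: the behavior names, the choice of count and first position as occurrence characteristics, the per-n-gram weights, and the threshold are hypothetical, and the weighted sum merely stands in for whatever classifier is actually used.

```python
def divide_into_ngrams(trace, n):
    """Step 2: split the behavioral trace into overlapping n-grams of behavior tokens."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def occurrence_features(trace, n):
    """Step 3: record, per distinct n-gram, its count and first position in the trace."""
    features = {}
    for pos, gram in enumerate(divide_into_ngrams(trace, n)):
        entry = features.setdefault(gram, {"count": 0, "first_pos": pos})
        entry["count"] += 1
    return features


def classify(features, weights, threshold):
    """Step 4: toy linear score over n-gram counts (assumed stand-in for a classifier)."""
    score = sum(weights.get(gram, 0.0) * f["count"] for gram, f in features.items())
    return score >= threshold


# Step 1: a behavioral trace, hard-coded here with hypothetical behavior names.
trace = ["open_file", "write_file", "open_file", "write_file", "create_process"]
features = occurrence_features(trace, n=2)

# Hypothetical weights; in practice these would come from a trained model.
weights = {("write_file", "create_process"): 1.5}
print(classify(features, weights, threshold=1.0))
```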
20 Claims
1. A computer-implemented method for detecting malware, at least a portion of the method being performed by a computing device comprising at least one processor, the method comprising:
identifying a behavioral trace of a program, the behavioral trace comprising a sequence of runtime behaviors exhibited by the program;
dividing the behavioral trace to identify a plurality of n-grams within the behavioral trace, each runtime behavior within the sequence of runtime behaviors corresponding to an n-gram token;
analyzing the plurality of n-grams to generate a feature vector of the behavioral trace comprising:
applying, for each given n-gram in the plurality of n-grams, a feature function to the behavioral trace that describes an occurrence characteristic of the given n-gram within the behavioral trace; and
including a result of the feature function in the feature vector; and
classifying the program based at least in part on the feature vector of the behavioral trace to determine whether the program is malicious;
wherein:
the feature vector comprises a plurality of dimensions, each n-gram within the plurality of n-grams corresponding to a dimension within the plurality of dimensions;
the plurality of n-grams map to the plurality of dimensions according to a non-injective surjection; and
including the result of the feature function in the feature vector comprises aggregating a subset of outputs of the feature function derived from a subset of the plurality of n-grams into a value and assigning the value to a dimension within the plurality of dimensions according to the non-injective surjection.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A system for detecting malware, the system comprising:
an identification module, stored in memory, that identifies a behavioral trace of a program, the behavioral trace comprising a sequence of runtime behaviors exhibited by the program;
a division module, stored in memory, that divides the behavioral trace to identify a plurality of n-grams within the behavioral trace, each runtime behavior within the sequence of runtime behaviors corresponding to an n-gram token;
an analysis module, stored in memory, that analyzes the plurality of n-grams to generate a feature vector of the behavioral trace comprising:
applying, for each given n-gram in the plurality of n-grams, a feature function to the behavioral trace that describes an occurrence characteristic of the given n-gram within the behavioral trace; and
including a result of the feature function in the feature vector;
wherein:
the feature vector comprises a plurality of dimensions, each n-gram within the plurality of n-grams corresponding to a dimension within the plurality of dimensions;
the plurality of n-grams map to the plurality of dimensions according to a non-injective surjection; and
including the result of the feature function in the feature vector comprises aggregating a subset of outputs of the feature function derived from a subset of the plurality of n-grams into a value and assigning the value to a dimension within the plurality of dimensions according to the non-injective surjection;
a classification module, stored in memory, that classifies the program based at least in part on the feature vector of the behavioral trace to determine whether the program is malicious; and
at least one physical processor configured to execute the identification module, the division module, the analysis module, and the classification module.
Dependent claims: 13, 14, 15, 16, 17
18. A non-transitory computer-readable medium comprising one or more computer-readable instructions that, when executed by at least one processor of a computing device, cause the computing device to:
identify a behavioral trace of a program, the behavioral trace comprising a sequence of runtime behaviors exhibited by the program;
divide the behavioral trace to identify a plurality of n-grams within the behavioral trace, each runtime behavior within the sequence of runtime behaviors corresponding to an n-gram token;
analyze the plurality of n-grams to generate a feature vector of the behavioral trace comprising:
applying, for each given n-gram in the plurality of n-grams, a feature function to the behavioral trace that describes an occurrence characteristic of the given n-gram within the behavioral trace;
including a result of the feature function in the feature vector; and
classifying the program based at least in part on the feature vector of the behavioral trace to determine whether the program is malicious;
wherein:
the feature vector comprises a plurality of dimensions, each n-gram within the plurality of n-grams corresponding to a dimension within the plurality of dimensions;
the plurality of n-grams map to the plurality of dimensions according to a non-injective surjection; and
including the result of the feature function in the feature vector comprises aggregating a subset of outputs of the feature function derived from a subset of the plurality of n-grams into a value and assigning the value to a dimension within the plurality of dimensions according to the non-injective surjection.
Dependent claims: 19, 20
Specification