Decision forest compilation
First Claim
1. A computer program product tangibly embodied on non-transient computer readable media, the computer program product comprising instructions operable when executed to:
receive a file from a network location;
extract, by an extraction module implemented at least partially in hardware, a plurality of features of a file;
categorize, by a categorization module implemented at least partially in hardware, each of the plurality of features to define a plurality of categories of features, wherein features unrelated to one another are categorized into a same category to define a category of unrelated features;
build, by a tree generator module implemented at least partially in hardware, a first decision tree based on a first category from the plurality of categories, the first category comprising a set of related features of the file;
build, by the tree generator module, a second decision tree based on a second category from the plurality of categories, the second category comprising a set of unrelated features of the file;
execute, by an execution module implemented at least partially in hardware, the first decision tree to generate a first decision result;
execute, by the execution module, the second decision tree to generate a second decision result; and
determine, by a classification module implemented at least partially in hardware, whether the file has malware based on the first decision result and the second decision result.
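As an illustration of the categorize step only, the following Python sketch groups features by a naming prefix and pools every feature that shares its prefix with no other feature into one catch-all category of unrelated features. The feature names, the prefix rule, the `_unrelated` label, and the `categorize_features` function are all assumptions made for this example; the claim does not say how relatedness is determined.

```python
from collections import defaultdict

UNRELATED = "_unrelated"  # hypothetical label for the catch-all category


def categorize_features(features):
    """Group features by the text before the first '.' in their name.

    The prefix rule is an assumed stand-in for "relatedness". Features
    whose prefix is shared with no other feature are pooled into a
    single category of mutually unrelated features.
    """
    by_prefix = defaultdict(dict)
    for name, value in features.items():
        prefix = name.split(".", 1)[0]
        by_prefix[prefix][name] = value

    categories = {}
    for prefix, group in by_prefix.items():
        if len(group) > 1:
            categories[prefix] = group  # a category of related features
        else:
            categories.setdefault(UNRELATED, {}).update(group)
    return categories


# Made-up feature values for one file, for illustration only.
features = {
    "header.section_count": 7,
    "header.entry_point": 4096,
    "import.count": 112,
    "import.has_network_api": 1,
    "file_size": 48213,
    "entropy": 7.4,
}
categories = categorize_features(features)
```

Under these assumptions, a tree generator could build one tree from a related category such as `categories["header"]` and another from `categories["_unrelated"]`.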
Abstract
Embodiments of the present disclosure include methods, devices, and computer program products for detecting malware in a file. Embodiments include identifying a plurality of features of the file, categorizing each of the plurality of features to define a plurality of categories of features, building a first decision tree based on a first category from the plurality of categories, the first category comprising a first set of features of the file, and building a second decision tree based on a second category from the plurality of categories, the second decision tree comprising a second set of features of the file, the second set different from the first set. Some embodiments include comparing results from each decision tree to determine the presence or absence of malware.
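The flow described in the abstract can be sketched in Python as follows: two small decision trees are built from two different feature sets, each is executed on a file's features, and the two results are compared. This is a toy sketch and not the patented implementation. The Gini-based tree builder, the feature names, the training rows, and the rule that a file is flagged when either tree reports malware are all assumptions made for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(rows, labels, feature_names, depth=2):
    """Build a small binary tree that splits only on feature_names."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == 0 or len(set(labels)) == 1:
        return majority  # leaf node: a class label
    best = None
    for f in feature_names:
        for t in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            # Weighted impurity of the two sides; lower is better.
            score = sum(len(idx) * gini([labels[i] for i in idx])
                        for idx in (left, right))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return majority
    _, f, t, left, right = best
    low, high = (
        build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                   feature_names, depth - 1)
        for idx in (left, right)
    )
    return (f, t, low, high)


def execute_tree(tree, row):
    """Walk the tree for one file's features and return the leaf label."""
    while isinstance(tree, tuple):
        f, t, low, high = tree
        tree = low if row[f] <= t else high
    return tree


# Made-up training data, for illustration only.
rows = [
    {"section_count": 3, "entry_point": 4096, "file_size": 20000, "entropy": 5.1},
    {"section_count": 4, "entry_point": 4096, "file_size": 350000, "entropy": 5.6},
    {"section_count": 9, "entry_point": 512, "file_size": 48000, "entropy": 7.6},
    {"section_count": 11, "entry_point": 256, "file_size": 61000, "entropy": 7.9},
]
labels = ["clean", "clean", "malware", "malware"]

first_tree = build_tree(rows, labels, ["section_count", "entry_point"])
second_tree = build_tree(rows, labels, ["file_size", "entropy"])

new_file = {"section_count": 10, "entry_point": 300, "file_size": 52000, "entropy": 7.7}
first_result = execute_tree(first_tree, new_file)
second_result = execute_tree(second_tree, new_file)

# Assumed combination rule: flag the file if either tree reports malware.
has_malware = "malware" in (first_result, second_result)
```

Other combination rules, such as requiring both trees to agree or weighting each tree's vote, would fit the same structure; the abstract says only that the results are compared.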
20 Claims
1. A computer program product tangibly embodied on non-transient computer readable media, the computer program product comprising instructions operable when executed to:
receive a file from a network location;
extract, by an extraction module implemented at least partially in hardware, a plurality of features of a file;
categorize, by a categorization module implemented at least partially in hardware, each of the plurality of features to define a plurality of categories of features, wherein features unrelated to one another are categorized into a same category to define a category of unrelated features;
build, by a tree generator module implemented at least partially in hardware, a first decision tree based on a first category from the plurality of categories, the first category comprising a set of related features of the file;
build, by the tree generator module, a second decision tree based on a second category from the plurality of categories, the second category comprising a set of unrelated features of the file;
execute, by an execution module implemented at least partially in hardware, the first decision tree to generate a first decision result;
execute, by the execution module, the second decision tree to generate a second decision result; and
determine, by a classification module implemented at least partially in hardware, whether the file has malware based on the first decision result and the second decision result.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer implemented method for assessing a file for malware, the method comprising:
receiving the file from a network location;
extracting, by extraction logic implemented at least partially in hardware, a plurality of features of the file;
categorizing, by categorization logic implemented at least partially in hardware, each of the plurality of features to define a plurality of categories of features, wherein features unrelated to one another are categorized into a same category to define a category of unrelated features;
building, by tree generator logic implemented at least partially in hardware, a first decision tree based on a first category from the plurality of categories, the first category comprising a set of related features of the file;
building, by the tree generator logic implemented at least partially in hardware, a second decision tree based on a second category from the plurality of categories, the second category comprising a set of unrelated features of the file;
executing, by execution logic implemented at least partially in hardware, the first decision tree to generate a first decision result;
executing, by the execution logic implemented at least partially in hardware, the second decision tree to generate a second decision result; and
determining, by classification logic implemented at least partially in hardware, whether the file has malware based on the first decision result and the second decision result.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computing device comprising:
extraction logic implemented at least partially in hardware to extract one or more features from a file;
categorization logic implemented at least partially in hardware to categorize each of the plurality of features to define a plurality of categories of features, wherein features unrelated to one another are categorized into a same category to define a category of unrelated features; and
tree generator logic implemented at least partially in hardware to:
generate a first decision tree based on a first category from the plurality of categories, the first category comprising a set of related features of the file, and
generate a second decision tree based on a second category from the plurality of categories, the second category comprising a set of unrelated features of the file;
execution logic implemented at least partially in hardware to:
execute the first decision tree to generate a first decision result, and
execute the second decision tree to generate a second decision result; and
classification logic implemented at least partially in hardware to determine whether the file has malware based on the first decision result and the second decision result.
Dependent claims: 16, 17, 18, 19, 20
Specification