Determining malware based on signal tokens
Abstract
Example embodiments disclosed herein relate to determining malware. A set of tokens is generated from an application under test. A set of signal tokens is generated from the set of tokens. A likelihood of malware is determined for the application under test based on the signal tokens and a signal token database.
17 Claims
1. A computing device comprising:
a memory and at least one hardware processor to execute a plurality of modules including:
a static code analysis module to generate a set of tokens from an application under test according to obfuscation tolerant rules, wherein each token of the set of tokens is generated upon a hit to one of the obfuscation tolerant rules;
a signal generation module to generate a plurality of signal tokens from the set of tokens using a set of grouping rules, wherein each signal token is generated from a grouping of multiple tokens based on a grouping rule; and
a classification module to perform a Bayes classification to compare the plurality of signal tokens with a signal token database to determine a likelihood of whether malware is included in the application under test.
Dependent claims: 2, 3, 4, 5, 6, 7
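The first two modules of claim 1 can be illustrated with a minimal sketch. It assumes that an "obfuscation tolerant rule" is a regular expression over decompiled source that matches platform API usage (which identifier renaming cannot hide), and that a "grouping rule" is a set of tokens that must all be present. The rule names, patterns, and function names below are hypothetical and are not taken from the patent.

```python
import re
from typing import Dict, FrozenSet, Iterable, List, Set

# Hypothetical obfuscation tolerant rules: token name -> pattern matching
# platform API usage that survives renaming of app-defined identifiers.
TOKEN_RULES: Dict[str, "re.Pattern[str]"] = {
    "SEND_SMS": re.compile(r"SmsManager\s*\.\s*sendTextMessage"),
    "READ_CONTACTS": re.compile(r"ContactsContract"),
    "HTTP_POST": re.compile(r"HttpURLConnection|HttpPost"),
    "DEVICE_ID": re.compile(r"getDeviceId"),
}

# Hypothetical grouping rules: signal token name -> tokens that must co-occur.
GROUPING_RULES: Dict[str, FrozenSet[str]] = {
    "EXFILTRATE_CONTACTS": frozenset({"READ_CONTACTS", "HTTP_POST"}),
    "SMS_WITH_DEVICE_ID": frozenset({"SEND_SMS", "DEVICE_ID"}),
}


def generate_tokens(source: str) -> List[str]:
    """Emit one token for every hit of a rule in the source text."""
    tokens: List[str] = []
    for name, pattern in TOKEN_RULES.items():
        tokens.extend(name for _ in pattern.finditer(source))
    return tokens


def generate_signal_tokens(tokens: Iterable[str]) -> Set[str]:
    """Emit a signal token for each grouping rule whose tokens are all present."""
    present = set(tokens)
    return {
        signal
        for signal, required in GROUPING_RULES.items()
        if required <= present
    }


# Usage: class and method names are obfuscated, but the API references remain.
sample = "class a { void b() { q(ContactsContract.Contacts.CONTENT_URI); HttpURLConnection d = e(); } }"
print(generate_signal_tokens(generate_tokens(sample)))
```

In this sketch the grouping step ignores how many times a token was hit. A rule that depends on hit counts or on token proximity would need a richer token record than a bare name.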
8. A non-transitory machine-readable storage medium storing instructions that, if executed by at least one hardware processor of a device, cause the device to:
generate a set of tokens from an application under test according to obfuscation tolerant rules, wherein each token of the set of tokens is generated upon a hit to one of the obfuscation tolerant rules;
generate a plurality of signal tokens from the set of tokens using a set of grouping rules, wherein each signal token is generated from a grouping of multiple tokens based on a grouping rule;
use a Bayes classification technique to analyze the plurality of signal tokens with a signal token database including a second plurality of signal tokens to determine a likelihood that the application under test is malware, wherein each of the second plurality of signal tokens is preprocessed to have a likeliness of malware based on a training set; and
determine that the application under test is malware if the likelihood is above a threshold level.
Dependent claims: 9, 10, 16, 17
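The classification and threshold steps of claim 8 might look like the following sketch. It assumes the signal token database maps each signal token to a precomputed likeliness of malware, that those values were computed under an equal prior, and that they are combined under the naive Bayes independence assumption. The database contents, the function names, and the 0.9 threshold are illustrative assumptions, not values from the patent.

```python
import math
from typing import Dict, Iterable

# Hypothetical signal token database: signal token -> likeliness of malware.
SIGNAL_TOKEN_DB: Dict[str, float] = {
    "EXFILTRATE_CONTACTS": 0.9,
    "SMS_WITH_DEVICE_ID": 0.8,
    "USES_NETWORK": 0.5,
}


def malware_likelihood(
    signal_tokens: Iterable[str],
    database: Dict[str, float],
    prior: float = 0.5,
) -> float:
    """Combine per-token likeliness values into one posterior probability.

    Works in log-odds space to avoid underflow when many tokens are present.
    Signal tokens missing from the database are ignored.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for token in set(signal_tokens):
        p = database.get(token)
        if p is None:
            continue
        p = min(max(p, 0.01), 0.99)  # clamp so a single token cannot force 0 or 1
        log_odds += math.log(p / (1.0 - p))
    return 1.0 / (1.0 + math.exp(-log_odds))


def is_malware(
    signal_tokens: Iterable[str],
    database: Dict[str, float],
    threshold: float = 0.9,
) -> bool:
    """Flag the application when the likelihood is above the threshold level."""
    return malware_likelihood(signal_tokens, database) > threshold


# Usage with two signal tokens found in an application under test.
found = {"EXFILTRATE_CONTACTS", "SMS_WITH_DEVICE_ID"}
print(malware_likelihood(found, SIGNAL_TOKEN_DB), is_malware(found, SIGNAL_TOKEN_DB))
```

With an equal prior, likeliness values of 0.9 and 0.8 give combined odds of 9 × 4 = 36, so the posterior is 36/37, which is above the assumed threshold.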
11. A method comprising:
generating, by a hardware processor, a set of tokens from an application under test according to obfuscation tolerant rules, wherein each token of the set of tokens is generated upon a hit to one of the obfuscation tolerant rules;
generating, by the hardware processor, a plurality of signal tokens from the set of tokens using a set of grouping rules, wherein each signal token is generated from a grouping of multiple tokens based on a grouping rule;
using, by the hardware processor, a Bayesian technique to analyze the plurality of signal tokens with a signal token database including a second plurality of signal tokens to determine a likelihood that the application under test includes malware, wherein each of the second plurality of signal tokens is preprocessed to have a likeliness of malware based on a training set; and
determining, by the hardware processor, that the application under test is malware if the likelihood is above a threshold level.
Dependent claims: 12, 13, 14, 15
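The preprocessing mentioned in claims 8 and 11, in which each signal token in the database receives a likeliness of malware based on a training set, could be sketched as follows. The sketch assumes the training set is a collection of labeled applications, each already reduced to its set of signal tokens, and it uses add-one smoothing with an equal class prior. The claims do not specify these choices, and the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple


def build_signal_token_database(
    training_set: Iterable[Tuple[Set[str], bool]],
) -> Dict[str, float]:
    """Estimate a likeliness of malware for every signal token seen in training.

    Each training example is (signal_tokens, is_malware).
    """
    malware_hits: Counter = Counter()
    benign_hits: Counter = Counter()
    malware_total = 0
    benign_total = 0

    for signal_tokens, is_malware in training_set:
        if is_malware:
            malware_total += 1
            malware_hits.update(signal_tokens)  # counts each token once per app
        else:
            benign_total += 1
            benign_hits.update(signal_tokens)

    database: Dict[str, float] = {}
    for token in set(malware_hits) | set(benign_hits):
        # Smoothed estimates of P(token | malware) and P(token | benign).
        p_given_malware = (malware_hits[token] + 1) / (malware_total + 2)
        p_given_benign = (benign_hits[token] + 1) / (benign_total + 2)
        # Likeliness of malware given the token, assuming an equal class prior.
        database[token] = p_given_malware / (p_given_malware + p_given_benign)
    return database


# Usage with a tiny, made-up training set of two malware and two benign apps.
training = [
    ({"A", "B"}, True),
    ({"A"}, True),
    ({"B"}, False),
    ({"C"}, False),
]
print(build_signal_token_database(training))
```

Normalizing by class size keeps an unbalanced training set from skewing every token toward the larger class, and the smoothing keeps a token seen in only one class from receiving a likeliness of exactly 0 or 1.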
Specification