Intrusion detection using MDL compression
Abstract
An intrusion masquerade detection system and method that includes a grammar inference engine. A grammar-based Minimum Description Length (MDL) compression algorithm is used to determine a masquerade based on a distance from a threshold in a model of an estimated algorithmic minimum sufficient statistic.
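The term "algorithmic minimum sufficient statistic" comes from Kolmogorov complexity theory. The block below gives the standard textbook formulation, not text from the patent; the symbols x, S, and K are introduced here for illustration only. Because K is not computable, a practical system can only approximate it, for example with a compressor, which is presumably why the abstract speaks of an "estimated" statistic.

```latex
% x    : observed data string
% S    : finite set (the model) that contains x
% K(.) : Kolmogorov complexity
% Two-part description: first describe S, then index x within S.
K(x) \le K(S) + \log_2 |S| + O(1)
% S is a sufficient statistic for x when the two-part code is as short
% as the best one-part code:
K(S) + \log_2 |S| = K(x) + O(1)
% The minimum sufficient statistic is a sufficient S of least complexity:
S^{*} = \arg\min_{S \ni x} \{\, K(S) : K(S) + \log_2 |S| \le K(x) + O(1) \,\}
```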
20 Claims
1. An intrusion masquerade detection method, comprising:
a computer applying a compression algorithm to user data to build user grammars associated with a user;
forming at least one model by storing the user grammars in a database;
applying the compression algorithm to at least one target block to calculate an estimated algorithmic minimum sufficient statistic;
searching a string of data from the at least one target block for phrases matching user grammars contained in the at least one model;
sorting the user grammars so that longest phrases among the user grammars are applied first to an unclassified string;
converting each matching phrase to a variable-length code value by replacing each matching phrase with a corresponding variable-length code value;
attributing a cost for phrases that are not found in the at least one model by quantifying a cost of explicitly representing symbols associated with those phrases;
determining a degree of fit between the at least one target block and the at least one model based on the cost; and
detecting an intrusion masquerade based on the degree of fit.
Dependent claims: 2, 3, 4, 5, 6
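The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: `build_grammar` is a naive repeated-substring counter standing in for grammar inference, and the Shannon code lengths, the 8-bit literal cost, and the 4.0 bits-per-symbol threshold are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import Dict


def build_grammar(training: str, min_len: int = 2, max_len: int = 6,
                  min_count: int = 2) -> Dict[str, int]:
    """Hypothetical stand-in for grammar inference: keep repeated substrings."""
    counts = Counter(
        training[i:i + n]
        for n in range(min_len, max_len + 1)
        for i in range(len(training) - n + 1)
    )
    return {phrase: c for phrase, c in counts.items() if c >= min_count}


def code_lengths(grammar: Dict[str, int]) -> Dict[str, float]:
    """Variable-length codes (assumed Shannon lengths): frequent phrases cost fewer bits."""
    total = sum(grammar.values())
    return {phrase: -math.log2(c / total) for phrase, c in grammar.items()}


def description_cost(target: str, grammar: Dict[str, int],
                     literal_bits: float = 8.0) -> float:
    """Bits needed to describe `target` using the user's grammar."""
    lengths = code_lengths(grammar)
    # Sort so the longest phrases are applied first.
    phrases = sorted(grammar, key=len, reverse=True)
    segments = [target]  # parts of the target not yet covered by any phrase
    cost = 0.0
    for phrase in phrases:
        remaining = []
        for segment in segments:
            parts = segment.split(phrase)
            # Each occurrence is replaced by the phrase's code value.
            cost += (len(parts) - 1) * lengths[phrase]
            remaining.extend(p for p in parts if p)
        segments = remaining
    # Symbols not covered by the model are charged an explicit literal cost.
    cost += literal_bits * sum(len(s) for s in segments)
    return cost


def is_masquerade(target: str, grammar: Dict[str, int],
                  threshold_bits_per_symbol: float = 4.0) -> bool:
    """Degree of fit as bits per symbol; a poor fit is flagged as a masquerade."""
    bits_per_symbol = description_cost(target, grammar) / max(len(target), 1)
    return bits_per_symbol > threshold_bits_per_symbol
```

A target block made mostly of phrases from the user's grammar compresses to few bits per symbol, while a block from a different user falls back to the literal cost and exceeds the threshold.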
7. A machine-implemented grammar inference engine for intrusion detection, comprising:
a pre-processor apparatus that receives input data and outputs filtered data;
a grammar generator apparatus coupled to the pre-processor apparatus and configured to generate grammars associated with a user by applying a compression algorithm to the filtered data, to form at least one model by storing the user grammars in a database, and to apply the compression algorithm to at least one target block to calculate an estimated algorithmic minimum sufficient statistic;
a grammar applicator apparatus that searches a string of data from the at least one target block for phrases matching user grammars contained in the at least one model, sorts the user grammars so that longest phrases among the user grammars are applied first to an unclassified string, replaces each matching phrase with a variable-length code value, and attributes a cost for phrases that are not found in the at least one model by quantifying a cost of explicitly representing symbols associated with those phrases; and
a classifier apparatus coupled to the grammar applicator apparatus and to a post-processor apparatus, the classifier apparatus receiving the cost from the grammar applicator apparatus and decision criteria from the post-processor apparatus, the classifier apparatus being configured to determine a degree of fit between the at least one target block and the at least one model based on the cost and the decision criteria, to detect an intrusion masquerade based on the degree of fit, and to output an indication of the detected intrusion masquerade, wherein the post-processor apparatus assigns each portion of the input data to one of the models.
Dependent claims: 8, 9, 10
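The way the components of claim 7 are coupled can be sketched as a small Python pipeline. Every name here (`preprocess`, `generate`, `apply_grammar`, `Engine`) is hypothetical, and the component bodies are deliberately trivial stand-ins (token sets and an uncovered-token fraction) rather than the compression-based grammars the claim describes. The point is only the data flow: pre-processor to grammar generator to grammar applicator to classifier, with a decision criterion and a best-fit model assignment.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Tuple

Grammar = FrozenSet[str]


def preprocess(raw: str) -> str:
    """Pre-processor stand-in: lowercase the input and collapse whitespace."""
    return " ".join(raw.lower().split())


def generate(filtered: str) -> Grammar:
    """Grammar-generator stand-in: the set of tokens seen for a user."""
    return frozenset(filtered.split())


def apply_grammar(block: str, grammar: Grammar) -> float:
    """Grammar-applicator stand-in: fraction of tokens the model cannot cover."""
    tokens = block.split()
    if not tokens:
        return 0.0
    return sum(t not in grammar for t in tokens) / len(tokens)


@dataclass
class Engine:
    threshold: float  # assumed decision criterion supplied to the classifier
    models: Dict[str, Grammar] = field(default_factory=dict)

    def train(self, user: str, raw: str) -> None:
        """Build and store one model per user."""
        self.models[user] = generate(preprocess(raw))

    def classify(self, claimed_user: str, raw_block: str) -> Tuple[str, bool]:
        """Return (best-fitting model, masquerade flag) for a target block."""
        block = preprocess(raw_block)
        costs = {u: apply_grammar(block, g) for u, g in self.models.items()}
        best_fit = min(costs, key=costs.get)  # assign the block to one model
        return best_fit, costs[claimed_user] > self.threshold
```

Usage, with made-up command histories:

```python
engine = Engine(threshold=0.5)
engine.train("alice", "ls cd vi make")
engine.train("bob", "ssh scp nmap nc")
print(engine.classify("alice", "nmap nc ssh ls"))  # ('bob', True)
```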
11. A non-transitory machine-readable medium upon which is embodied and stored a sequence of programmable instructions which, when executed by a processor, cause the processor to perform intrusion masquerade detection operations, comprising:
applying a compression algorithm to user data to build user grammars associated with a user;
forming at least one model by storing the user grammars in a database;
applying the compression algorithm to at least one target block to calculate an estimated algorithmic minimum sufficient statistic;
searching a string of data from the at least one target block for phrases matching user grammars contained in the at least one model;
attributing a cost for phrases that are not found in the at least one model by quantifying a cost of explicitly representing symbols associated with those phrases;
determining a degree of fit between the at least one target block and the at least one model based on the cost;
detecting an intrusion masquerade based on the degree of fit; and
outputting an indication of the detected intrusion masquerade.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
Specification