Intrusion Detection Using MDL Compression

US 20100107255A1
Filed: 03/05/2009
Published: 04/29/2010
Est. Priority Date: 10/29/2008
Status: Active Grant

6 Assignments

0 Petitions

Abstract

An intrusion masquerade detection system and method that includes a grammar inference engine. A grammar-based Minimum Description Length (MDL) compression algorithm is used to determine a masquerade based on a distance from a threshold in a model of an estimated algorithmic minimum sufficient statistic.

20 Claims

1. An intrusion masquerade detection method comprising:
- a computer applying a compression algorithm to user data to build user grammars associated with a user;
  
  forming at least one model by storing said user grammars using a database;
  
  applying said compression algorithm to at least one target block to calculate an estimated algorithmic minimum sufficient statistic;
  
  searching a string of data from said target block for phrases matching user grammars contained in said at least one model;
  
  sorting the user grammars so that longest phrases among said user grammars are applied first to an unclassified string;
  
  converting each matching phrase to a variable-length code value by replacing each said matching phrase with a corresponding variable-length code value;
  
  attributing a cost for phrases that are not found in the at least one model by quantifying the cost of explicitly representing symbols associated with those phrases;
  
  determining a degree of fit between said target block and said at least one model based on said cost; and
  
  detecting an intrusion masquerade based on said degree of fit.
- View Dependent Claims (2, 3, 4, 5, 6)
  - 2. The intrusion detection method of claim 1, wherein said searching is performed in real-time.
  - 3. The intrusion detection method of claim 1, wherein said variable-length code is a Huffman code, and wherein said compression algorithm is a grammar-based compression algorithm that estimates Kolmogorov complexity and which forms compressive grammar based on Minimum Description Length (MDL) principles.
  - 4. The intrusion detection method of claim 1, wherein said forming at least one model is performed using a steepest descent method; and wherein said at least one model comprises a healthy session model.
  - 5. The intrusion detection method of claim 1, further comprising:
    - outputting an indication of an intrusion masquerade.
  - 6. The intrusion detection method of claim 1, wherein said detecting an intrusion masquerade includes calculating an inverse compression ratio over a time period for user data, and comparing said calculated inverse compression ratio to at least one inverse compression ratio associated with a compressed data set of at least one said model, and wherein said detecting an intrusion masquerade event is based on a difference between said calculated inverse compression ratio and said inverse compression ratio associated with a compressed data set, said difference exceeding a threshold.
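
The grammar-application steps recited in claims 1 and 3 (longest phrases applied first, matches replaced by variable-length codes, a cost charged for unmatched symbols) can be illustrated with a minimal Python sketch. It is not the patented implementation: the phrase counts, the `literal_bits=8` cost per unmatched symbol, the split-based way of applying phrases, and the normalization used for the degree of fit are all assumptions chosen for illustration, and the grammar-building (MDL compression) step is replaced by a hand-written phrase table.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {phrase: code length in bits} for a Huffman code over phrase counts."""
    if not freqs:
        raise ValueError("freqs must not be empty")
    if len(freqs) == 1:
        return {next(iter(freqs)): 1}
    tie = count()  # unique tie-breaker so the heap never compares the dicts
    heap = [(f, next(tie), {p: 0}) for p, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every phrase in them one level deeper.
        merged = {p: depth + 1 for p, depth in {**a, **b}.items()}
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def encoding_cost(target, code_bits, literal_bits=8):
    """Bits needed to describe target using the model's phrases.

    Phrases are applied longest first; each match costs its code length,
    and every symbol left unmatched costs literal_bits (an assumed value).
    """
    segments = [target]
    cost = 0
    for phrase in sorted(code_bits, key=len, reverse=True):
        remaining = []
        for seg in segments:
            parts = seg.split(phrase)
            cost += (len(parts) - 1) * code_bits[phrase]
            remaining.extend(p for p in parts if p)
        segments = remaining
    cost += literal_bits * sum(len(s) for s in segments)
    return cost


def degree_of_fit(target, code_bits, literal_bits=8):
    """Encoded size divided by raw size; lower means a better fit to the model."""
    return encoding_cost(target, code_bits, literal_bits) / (literal_bits * len(target))


# Hypothetical phrase counts standing in for a stored user grammar.
model = huffman_code_lengths({"ls -l": 5, "cd ": 3, "make": 2})
familiar = degree_of_fit("cd make ls -l", model)
unfamiliar = degree_of_fit("rm -rf /tmp", model)
```

In this sketch a block made of the user's known phrases scores close to 0, while a block with no known phrases scores 1.0; a detector would flag blocks whose score lies beyond some chosen threshold.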

7. A machine-implemented grammar inference engine for intrusion detection, comprising:
- a pre-processor apparatus that receives input data and is configured to output filtered data;
  
  a grammar generator apparatus coupled to the pre-processor apparatus and configured to generate grammars associated with a user by applying a compression algorithm to the filtered data, to form at least one model by storing said user grammars using a database, and to apply said compression algorithm to at least one target block to calculate an estimated algorithmic minimum sufficient statistic;
  
  a grammar applicator apparatus that searches a string of data from said at least one target block for phrases matching user grammars contained in said at least one model, sorts the user grammars so that longest phrases among said user grammars are applied first to an unclassified string, replaces each matching phrase with a variable-length code value, and attributes a cost for phrases that are not found in the at least one model by quantifying the cost of explicitly representing symbols associated with those phrases; and
  
  a classifier apparatus coupled to the grammar applicator apparatus and to a post-processor apparatus, wherein the classifier apparatus receives said cost from said grammar applicator apparatus and decision criteria from said post-processor apparatus, wherein the classifier apparatus is configured to determine a degree of fit between said at least one target block and said at least one model based on said cost and said decision criteria, to detect an intrusion masquerade based on said degree of fit, and to output an indication of an intrusion masquerade, wherein said post-processor apparatus assigns each portion of the input data to one of said models.
- View Dependent Claims (8, 9, 10)
  - 8. The grammar inference engine of claim 7, further comprising:
    - a grammar database coupled to the grammar applicator apparatus and to the grammar generator apparatus; and
      
      an input database coupled to an output of the pre-processor apparatus, wherein the grammar applicator apparatus is configured to receive filtered data processed by the pre-processor apparatus from the input database.
  - 9. The grammar inference engine of claim 7, wherein said compression algorithm is a grammar-based compression algorithm that estimates Kolmogorov complexity and which forms compressive grammar based on Minimum Description Length (MDL) principles.
  - 10. The grammar inference engine of claim 7, wherein the pre-processor is further configured to apply a sliding window protocol to segment portions of said input data.
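
The segmentation step of claim 10 can be pictured as cutting the filtered input into overlapping blocks. The Python sketch below is one possible reading, not the claimed pre-processor: the function name `sliding_windows` and the `size` and `step` parameters are hypothetical, and the claims do not say how window length or overlap is chosen.

```python
def sliding_windows(events, size, step):
    """Split a sequence into windows of `size` items, advancing `step` items each time.

    A sequence shorter than `size` yields one short window; an empty
    sequence yields no windows. Trailing items that do not fill a whole
    window are dropped in this sketch.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    if not events:
        return []
    last_start = max(len(events) - size, 0)
    return [events[i:i + size] for i in range(0, last_start + 1, step)]


# Hypothetical command log for one user session.
commands = ["ls", "cd", "make", "vi", "make", "git"]
blocks = sliding_windows(commands, size=4, step=2)
```

Each resulting block could then be handed to the grammar applicator as a target block; overlapping windows trade extra computation for finer time resolution.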

11. A machine-readable medium upon which is embodied and stored a sequence of programmable instructions which, when executed by a processor, cause the processor to perform intrusion masquerade detection operations comprising:
- applying a compression algorithm to user data to build user grammars associated with a user;
  
  forming at least one model by storing said user grammars using a database;
  
  applying said compression algorithm to at least one target block to calculate an estimated algorithmic minimum sufficient statistic;
  
  searching a string of data from said target block for phrases matching user grammars contained in said at least one model;
  
  attributing a cost for phrases that are not found in the at least one model by quantifying the cost of explicitly representing symbols associated with those phrases;
  
  determining a degree of fit between said target block and said at least one model based on said cost; and
  
  detecting an intrusion masquerade based on said degree of fit; and
  
  outputting an indication of an intrusion masquerade.
- View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)
  - 12. The machine-readable medium of claim 11, wherein the operations further comprise:
    - sorting the user grammars so that longest phrases among said user grammars are applied first to an unclassified string.
  - 13. The machine-readable medium of claim 12, wherein the operations further comprise:
    - replacing each matching phrase with a variable-length code value.
  - 14. The machine-readable medium of claim 13, wherein said variable-length code is a Huffman code.
  - 15. The machine-readable medium of claim 11, wherein said searching is performed in real-time.
  - 16. The machine-readable medium of claim 11, wherein said forming at least one model is performed using a steepest descent method.
  - 17. The machine-readable medium of claim 11, wherein said at least one model comprises a healthy session model.
  - 18. The machine-readable medium of claim 11, wherein said detecting an intrusion masquerade includes calculating an inverse compression ratio over a time period for user data, and comparing said calculated inverse compression ratio to at least one inverse compression ratio associated with a compressed data set of at least one said model, and wherein said detecting an intrusion masquerade event is based on a difference between said calculated inverse compression ratio and said inverse compression ratio associated with a compressed data set, said difference exceeding a threshold.
  - 19. The machine-readable medium of claim 11, wherein said compression algorithm is a grammar-based compression algorithm that estimates Kolmogorov complexity and which forms compressive grammar based on Minimum Description Length (MDL) principles.
  - 20. The machine-readable medium of claim 11, wherein said target block comprises a plurality of information packets of an information system.
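
The inverse compression ratio test of claims 6 and 18 can be sketched in Python as follows. This is an illustration under stated assumptions: `zlib` is used as a stand-in compressor rather than the grammar-based MDL compressor the claims describe, the ratio is taken as compressed size divided by original size, the difference is taken as an absolute value, and the threshold of 0.3 is arbitrary.

```python
import zlib


def inverse_compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller means more compressible)."""
    if not data:
        raise ValueError("data must not be empty")
    return len(zlib.compress(data, 9)) / len(data)


def exceeds_threshold(observed_icr: float, model_icr: float, threshold: float) -> bool:
    """Flag a block when its ratio differs from the model's by more than threshold."""
    return abs(observed_icr - model_icr) > threshold


# Hypothetical data: a repetitive session for the model, then two observed blocks.
model_icr = inverse_compression_ratio(b"ls -l\ncd src\nmake\n" * 50)
similar_icr = inverse_compression_ratio(b"ls -l\ncd src\nmake\n" * 40)
unusual_icr = inverse_compression_ratio(bytes(range(256)))

similar_flagged = exceeds_threshold(similar_icr, model_icr, threshold=0.3)
unusual_flagged = exceeds_threshold(unusual_icr, model_icr, threshold=0.3)
```

With this stand-in compressor, the block that resembles the model session stays within the threshold, while the block of non-repeating bytes does not.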

Current Assignee: Leidos Innovations Technology, Inc.
Original Assignee: Edward E. Eiland, Jeremy D. Impson, Scott C. Evans, Thomas S. Markham
Inventors: Markham, Thomas S.; Eiland, Edward E.; Evans, Scott C.; Impson, Jeremy D.
Granted Patent: US 8,375,446 B2
US Class Current: 726/23
CPC Class Codes: G06F 21/55 Detecting local intrusion o...
