Methods and systems for multi-pattern searching
First Claim
1. A method for searching for search patterns in a text sequence using a state table of a state machine algorithm in an intrusion detection system, wherein the state table includes a state transition table and an array of per state pattern matching lists, comprising:
- reading a character from the text sequence as a search pattern;
reading an element of a current state vector of a current state of the state transition table that corresponds to the character, the state transition table and the array of per state pattern matching lists being separate data structures;
checking a pattern matching element of the current state vector for a matching pattern flag, when the element of the current state vector of the current state of the state transition table that corresponds to the character has a nonzero value, the state transition table and the array of per state pattern matching lists being separate data structures;
processing, in an intrusion detection processor device, the patterns in the text sequence that match pattern matching lists of the current state to detect an intrusion, when the matching pattern flag is set;
selecting the current state vector corresponding to the nonzero value of the element;
reading a next character from the text sequence, wherein the state transition table has a list of pointers to state vectors, each state vector representing valid transitions for the state, each state vector further includes an indicator that identifies a storage format of the element of the state transition table as being in a full vector form where a state vector with a predetermined full number of elements, or in a banded-row form where a state vector has a variable number of elements reduced from the predetermined full number;
checking the indicator of the current state vector that identifies the storage format;
determining if the character is in a band from an element of the current state vector identifying a number of elements in the band, an element of the current state vector identifying an index of a first element of the band, and elements of the band, when the storage format is banded-row form;
then,selecting the current state vector corresponding to the nonzero value of the element, when the character is in the band and the storage format is banded-row form;
returning the current state vector to an initial state, when the character is not in the band and the storage format is banded-row form; and
handling the character in accordance with a non-banded row format, when the storage format is not banded row form.
2 Assignments
0 Petitions
Accused Products
Abstract
Embodiments of the present invention relate to systems and methods for optimizing and reducing the memory requirements of state machine algorithms in pattern matching applications. Memory requirements of an Aho-Corasick algorithm are reduced in an intrusion detection system by representing the state table as three separate data structures. Memory requirements of an Aho-Corasick algorithm are also reduced by applying a banded-row sparse matrix technique to the state transition table of the state table. The pattern matching performance of the intrusion detection system is improved by performing a case insensitive search, where the characters of the test sequence are converted to uppercase as the characters are read. Testing reveals that state transition tables with sixteen bit elements outperform state transition tables with thirty-two bit elements and do not reduce the functionality of intrusion detection system using the Aho-Corasick algorithm.
-
Citations
11 Claims
-
1. A method for searching for search patterns in a text sequence using a state table of a state machine algorithm in an intrusion detection system, wherein the state table includes a state transition table and an array of per state pattern matching lists, comprising:
-
reading a character from the text sequence as a search pattern; reading an element of a current state vector of a current state of the state transition table that corresponds to the character, the state transition table and the array of per state pattern matching lists being separate data structures; checking a pattern matching element of the current state vector for a matching pattern flag, when the element of the current state vector of the current state of the state transition table that corresponds to the character has a nonzero value, the state transition table and the array of per state pattern matching lists being separate data structures; processing, in an intrusion detection processor device, the patterns in the text sequence that match pattern matching lists of the current state to detect an intrusion, when the matching pattern flag is set; selecting the current state vector corresponding to the nonzero value of the element; reading a next character from the text sequence, wherein the state transition table has a list of pointers to state vectors, each state vector representing valid transitions for the state, each state vector further includes an indicator that identifies a storage format of the element of the state transition table as being in a full vector form where a state vector with a predetermined full number of elements, or in a banded-row form where a state vector has a variable number of elements reduced from the predetermined full number; checking the indicator of the current state vector that identifies the storage format; determining if the character is in a band from an element of the current state vector identifying a number of elements in the band, an element of the current state vector identifying an index of a first element of the band, and elements of the band, when the storage format is banded-row form;
then,selecting the current state vector corresponding to the nonzero value of the element, when the character is in the band and the storage format is banded-row form; returning the current state vector to an initial state, when the character is not in the band and the storage format is banded-row form; and handling the character in accordance with a non-banded row format, when the storage format is not banded row form. - View Dependent Claims (2, 3, 4, 5, 6, 7)
-
-
8. A non-transitory computer-readable medium comprising instructions being executed by a computer, the instructions including a computer-implemented method for searching for search patterns in a text sequence using a state table of a state machine algorithm in an intrusion detection system, wherein the state table includes a state transition table and an array of per state pattern matching lists, the instructions for implementing the steps of:
-
reading a character from the text sequence as a search pattern; reading an element of a current state vector of a current state of the state transition table that corresponds to the character, the state transition table and the array of per state pattern matching lists being separate data structures; checking a pattern matching element of the current state vector for a matching pattern flag, when the element has a nonzero value, the state transition table and the array of per state pattern matching lists being separate data structures; processing the patterns in the text sequence that match pattern matching lists of the current state, when the matching pattern flag is set, to detect an intrusion; selecting the current state vector corresponding to the nonzero value of the element; reading a next character from the text sequence, wherein the state transition table has a list of pointers to state vectors, each state vector representing valid transitions for the state, each state vector further includes an indicator that identifies a storage format of the element of the state transition table as being in a full vector form where a state vector with a predetermined full number of elements, or in a banded-row form where a state vector has a variable number of elements reduced from the predetermined full number; checking the indicator of the current state vector that identifies the storage format; determining whether the character is in a band from an element of the current state vector identifying umber of elements in the band, an element of the current state vector identifying an index of a first element of the band, and elements of the band, when the storage format is banded row form; and
thereafterselecting the current state vector corresponding to the nonzero value of the element, when the character in the band; returning the current state vector to an initial state when the storage format is banded row form and the character is not in the band; and handling the character in accordance with non-banded row format when the storage format is not banded row form. - View Dependent Claims (9, 10, 11)
-
Specification