High speed data stream pattern recognition
First Claim
1. A system for recognizing a particular pattern formed by a plurality of bytes in a stream of bytes, said system operating in a series of states, and said system including:
- a plurality of lookup tables, one for each of said plurality of bytes, each lookup table having a section associated with each state;
a first index table which has index data associated with each state, the index data associated with each state comprising a plurality of index bits; and
a next state table which has a data word associated with each state, each data word having data that specifies the next state and/or a special flag, said system performing the following steps in each particular state of a series of states;
a) interrogating a particular section of each lookup table utilizing the value of the associated bytes to determine the value of a series of bits, said particular section of each lookup table being determined by the particular state;
b) generating a pointer from a combination of the value of the series of bits determined by said lookup tables and the bits in current state location in said first index table, said pointer specifying the address of a particular data word in said next state table;
c) reading the particular data word specified by the pointer generated in step “
b” and
based on the contents of said data word proceeding to the next state specified by said data word and/or performing special operations in response to said flag; and
d) repeating steps “
a”
, “
b” and
“
c”
until an end of operation is indicated by said flag.
13 Assignments
0 Petitions
Accused Products
Abstract
A system and method in accordance with the present invention determines in real-time the portions of a set of characters from a data or character stream which satisfies one or more predetermined regular expressions. A Real-time Deterministic Finite state Automaton (RDFA) ensures that the set of characters is processed at high speeds with relatively small memory requirements. An optimized state machine models the regular expression(s) and state related alphabet lookup and next state tables are generated. Characters from the data stream are processed in parallel using the alphabet lookup and next state tables, to determine whether to transition to a next state or a terminal state, until the regular expression is satisfied or processing is terminated. Additional means may be implemented to determine a next action from satisfaction of the regular expression.
137 Citations
31 Claims
-
1. A system for recognizing a particular pattern formed by a plurality of bytes in a stream of bytes, said system operating in a series of states, and said system including:
-
a plurality of lookup tables, one for each of said plurality of bytes, each lookup table having a section associated with each state;
a first index table which has index data associated with each state, the index data associated with each state comprising a plurality of index bits; and
a next state table which has a data word associated with each state, each data word having data that specifies the next state and/or a special flag, said system performing the following steps in each particular state of a series of states; a) interrogating a particular section of each lookup table utilizing the value of the associated bytes to determine the value of a series of bits, said particular section of each lookup table being determined by the particular state;
b) generating a pointer from a combination of the value of the series of bits determined by said lookup tables and the bits in current state location in said first index table, said pointer specifying the address of a particular data word in said next state table;
c) reading the particular data word specified by the pointer generated in step “
b” and
based on the contents of said data word proceeding to the next state specified by said data word and/or performing special operations in response to said flag; and
d) repeating steps “
a”
, “
b” and
“
c”
until an end of operation is indicated by said flag.- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
-
-
18. The method of recognizing a particular pattern formed by a plurality of bytes in stream of bytes, said method operating in a series of states, and utilizing:
-
a plurality of lookup tables, one for each of said plurality of bytes, each lookup table having a section associated with each state;
a first index table for storing a plurality of index bits associated with each particular state; and
a next state table for storing data words, each of which indicates the next state and/or a special operation flag, said method including the following steps in each particular state in a series of states; a) interrogating a particular section of each lookup table utilizing the value of the associated byte to determine the value of a series of bits, said particular section of each lookup table being based on the particular state;
b) generating a pointer which specifies the address of a data word in said next state table from a combination of the selected values from the lookup tables and a value from the first index table associated with the particular state;
c) reading the particular data work specified in step “
b” and
based on the contents of said data word proceeding to the next state and/or performing operations indicated by a flag in said data work; and
d) repeating steps “
a”
, “
b” and
“
c”
until a flag in said next state table indicates an end of the operation.- View Dependent Claims (19, 20, 21, 22, 23, 24, 25, 26)
-
-
27. An integrated circuit for recognizing a plurality of characters in parallel, said integrated circuit operating in series of states, and said integrated circuit including:
-
a plurality of lookup tables, one for each of said characters, each lookup table having a section associated with each state;
an index table for storing a plurality of index bits for each state; and
a next state table for storing data words, each of which indicates the next state of said state machine, said integrated circuit performing the following steps in a plurality of particular states; a) interrogating the section of said lookup table associated with said particular state and generating a series of bits based on the value of the associated character;
b) generating a pointer which specifies the address of a data word in said next state table from a combination of the selected values from the lookup tables and a value from said first index table;
c) reading the particular data word specified in step “
b” and
proceeding to the state specified by said data word; and
d) repeating steps “
a”
, “
b” and
“
c”
until said operation ends.- View Dependent Claims (28, 29)
-
-
30. A method for evaluating in parallel a plurality of characters in a stream of bytes, said method operating in a series of states and utilizing a set of alphabet lookup tables, one lookup table for each of said plurality of characters, each lookup table having a section associated with each of said states, and a next state table, said method comprising:
-
a) selecting an alphabet lookup table from said set of alphabet tables, as a function of a current state and byte position within said set of bytes, and obtaining a class code from said selected alphabet table corresponding to the character represented by said byte;
b) selecting a value from said next state table as a function of the current state, and generating an indication of the next state from said class codes of said set of bytes and from said value from said next state table; and
c) repeating steps “
a” and
“
b”
until said state machine arrives at acceptance state or until entering a failure state.
-
-
31. A system for recognizing a particular pattern formed by a number of bytes in a stream of bytes, said number being at least one, said system operating in a series of states, and said system including:
-
a lookup table for each byte in said number of bytes, each lookup table having a section associated with each state;
a first index table which has index data associated with each state, the index data associated with each state comprising a plurality of index bits; and
a next state table which has a data word associated with each state, each data word having data that specifies the next state and/or a special flag;
said system performing the following steps in each particular state of a series of states; a) interrogating a particular section of each lookup table utilizing the value of each of said bytes to determine the value of a series of bits, said particular section of each lookup table being determined by the particular state;
b) generating a pointer from a combination of the series of bits generated by the lookup tables and bits in the current state location in said first index table, said pointer specifying the address of a particular data word in said next state table;
c) reading the particular data word specified by the pointer generated in step “
b” and
based on the contents of said data word proceeding to the next state specified by said data word and/or performing special operations in response to said flag; and
d) repeating steps “
a”
, “
b” and
“
c”
until an end of operation is indicated by said flag.
-
Specification