Method and apparatus for processing finite automata
First Claim
Patent Images
1. A security appliance operatively coupled to a network, the security appliance comprising:
- at least one memory;
at least one processor operatively coupled to the at least one memory, the at least one processor configured to;
walk characters of a payload in an input stream through a unified deterministic finite automata (DFA) stored in the at least one memory, by traversing nodes of the unified DFA with characters from the payload, the unified DFA generated from subpatterns selected from each pattern in a set of one or more regular expression patterns based on at least one heuristic; and
walk characters of the payload through at least one non-deterministic finite automata (NFA) stored in the at least one memory, by traversing nodes of the at least one NFA with characters from the payload, the at least one NFA generated for at least one pattern in the set, a portion of the at least one pattern used for generating the at least one NFA, and at least one walk direction for walking characters through the at least one NFA, being based on whether a length of a subpattern selected from the at least one pattern is fixed or variable and a location of the subpattern selected within the at least one pattern to optimize performance of run time processing of the at least one processor for identifying an existence of the at least one pattern in the input stream.
6 Assignments
0 Petitions
Accused Products
Abstract
A method and corresponding apparatus for run time processing use a Deterministic Finite Automata (DFA) and Non-Deterministic Finite Automata (NFA) to find the existence of a pattern in a payload. A subpattern may be selected from each pattern in a set of one or more regular expression patterns based on at least one heuristic. The DFA may be generated from selected subpatterns from all patterns in the set, and at least one NFA may be generated for at least one pattern in the set, optimizing run time performance of the run time processing.
-
Citations
67 Claims
-
1. A security appliance operatively coupled to a network, the security appliance comprising:
-
at least one memory; at least one processor operatively coupled to the at least one memory, the at least one processor configured to; walk characters of a payload in an input stream through a unified deterministic finite automata (DFA) stored in the at least one memory, by traversing nodes of the unified DFA with characters from the payload, the unified DFA generated from subpatterns selected from each pattern in a set of one or more regular expression patterns based on at least one heuristic; and walk characters of the payload through at least one non-deterministic finite automata (NFA) stored in the at least one memory, by traversing nodes of the at least one NFA with characters from the payload, the at least one NFA generated for at least one pattern in the set, a portion of the at least one pattern used for generating the at least one NFA, and at least one walk direction for walking characters through the at least one NFA, being based on whether a length of a subpattern selected from the at least one pattern is fixed or variable and a location of the subpattern selected within the at least one pattern to optimize performance of run time processing of the at least one processor for identifying an existence of the at least one pattern in the input stream. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33)
-
2. The security appliance of claim 1, wherein the at least one processor is further configured to report a match of the at least one pattern in the payload based on traversing an NFA node, of the at least one NFA, associated with metadata indicating a final match of the at least one pattern.
-
3. The security appliance of claim 1, wherein the at least one processor is further configured to:
-
associate a transaction identifier for a given walk of the DFA and the at least one NFA for matching the at least one pattern in the payload; and report a match of the at least one pattern in the payload based on; traversing a DFA node of the unified DFA having metadata indicating a DFA partial match of the at least one pattern; subsequently traversing at least one NFA node of the at least one NFA having metadata indicating an NFA partial match of the at least one pattern; and correlating the traversing and the subsequent traversing with the transaction identifier.
-
-
4. The security appliance of claim 1, wherein the at least one processor is further configured to report an offset, of a character in the payload matching a first element of the at least one pattern, as a start offset for the at least one pattern in the payload, based on:
-
metadata associated with an NFA node of the at least one NFA and indicating a final match for the at least one pattern in the payload; and metadata associated with a DFA node of the unified DFA and indicating (i) a length, of the subpattern selected for the at least one pattern, and (ii) a subpattern end offset, of a subpattern character in the payload matching a last element of the subpattern selected for the at least one pattern, at the DFA node, the start offset being determined by the at least one processor based on subtracting the length from the subpattern end offset.
-
-
5. The security appliance of claim 1, wherein the at least one processor is further configured to report an offset, of a character in the payload matching a first element of the at least one pattern, at an NFA node of the at least one NFA, as a start offset for the at least one pattern in the payload, based on correlating partial match results indicated in metadata associated with nodes of the unified DFA and the at least one NFA for the at least one pattern.
-
6. The security appliance of claim 1, wherein the at least one processor is further configured to report an offset, of a character in the payload matching a first element of the at least one pattern, at an NFA node of the at least one NFA, as a start offset for the at least one pattern in the payload, based on metadata associated with the NFA node and a final match determined for the at least one pattern in the payload at the NFA node.
-
7. The security appliance of claim 1, wherein the at least one heuristic includes maximizing a number of unique subpatterns selected and length of each subpattern selected, the length of each subpattern selected having at least a minimum threshold length.
-
8. The security appliance of claim 1, wherein if a first element of the subpattern selected is a first element of the at least one pattern and the length of the subpattern selected is fixed, the location of the subpattern selected is a beginning-location of the at least one pattern, the portion of the at least one pattern used for generating the at least one NFA is the at least one pattern excluding the subpattern selected, the at least one NFA is a single NFA, and the at least one walk direction of the at least one NFA is a forward walk direction.
-
9. The security appliance of claim 8, wherein the unified DFA includes:
a DFA node associated with the last element of the subpattern selected and metadata indicating to the at least one processor a pointer to a starting node of the at least one NFA, an instruction to transition to walk the at least one NFA in a forward walk direction, the starting node of the at least one NFA associated with a first element of the portion of the at least one pattern used for generating the at least one NFA, a payload starting offset of the at least one NFA associated with an offset of a byte subsequent to another byte at the end offset of the subpattern selected, and to report a match of the subpattern selected, a lead offset within the payload, of a lead character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, and a length of the subpattern selected.
-
10. The security appliance of claim 8, wherein the at least one NFA includes:
an NFA node associated with metadata indicating to the at least one processor an instruction to terminate the walk, the NFA node associated with a last element of the at least one pattern, and to report a lag offset within the payload, of a lag character matching at the NFA node, as an end offset of the at least one pattern and a final match of the at least one pattern.
-
11. The security appliance of claim 1, wherein if a first element of the subpattern selected is not a first element of the at least one pattern and a last element of the subpattern selected is not a last element of the at least one pattern, the location of the subpattern selected is a mid-location of the at least one pattern, and if the length of the subpattern selected is fixed:
-
the portion of the at least one pattern for generating the at least one NFA includes a lag portion and a lead portion of the at least one pattern, the lag portion of the at least one pattern is the at least one pattern excluding the subpattern selected and the lead portion of the at least one pattern, the lead portion of the at least one pattern excludes the subpattern selected and the lag portion of the at least one pattern; and the at least one NFA includes a lag NFA and a lead NFA, the at least one walk direction includes a forward walk direction and a reverse walk direction, the lag NFA having the forward walk direction, the lead NFA having the reverse walk direction, the lag portion of the at least one pattern used for generating the lag NFA and the lead portion of the at least one pattern used for generating the lead NFA.
-
-
12. The security appliance of claim 11, wherein the unified DFA includes:
a DFA node associated with the last element of the subpattern selected associated with metadata indicating to the at least one processor, a pointer to a starting node of the lag NFA, an instruction to transition to walk the lag NFA in the forward walk direction, the starting node of the lag NFA associated with a first element of the lag portion, a pointer to a starting node of the lead NFA, an instruction to transition to walk the lead NFA in the reverse walk direction, the starting node of the lead NFA associated with a last element of the lead portion, and to report an offset within the payload, of a character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, a match of the subpattern selected, and a length of the subpattern selected.
-
13. The security appliance of claim 11, wherein the at least one NFA includes:
-
a lag node of the lag NFA, associated with the last element of the at least one pattern, associated with metadata indicating to the at least one processor, an instruction to terminate walking the lag NFA, and to report a lag offset within the payload, of a lag character of the payload matching the last element at the lag node, and a match of the lag portion of the at least one pattern; and a lead node of the lead NFA, associated with the first element of the at least one pattern, associated with metadata indicating to the at least one processor, an instruction to terminate walking the lead NFA and to report a match of the lead portion of the at least one pattern and a lead offset within the payload, of a lead character of the payload matching the first element at the lead node, as a start offset of the at least one pattern, if required by a qualifier associated with the at least one pattern.
-
-
14. The security appliance of claim 1, wherein if a first element of the subpattern selected is not a first element of the at least one pattern, and a last element of the subpattern selected is not a last element of the at least one pattern, the location of the subpattern selected is a mid-location of the at least one pattern, and if the first element of the subpattern selected is the first element of the at least one pattern, the location of the subpattern selected is the beginning-location of the at least one pattern, and if the length of the subpattern is fixed or variable:
-
the portion of the at least one pattern for generating the at least one NFA includes a lag portion and an entire portion of the at least one pattern, the lag portion of the at least one pattern is the at least one pattern excluding a lead portion of the at least one pattern, the lead portion including the first element of the at least one pattern, the last element of the subpattern selected, and all elements in the at least one pattern therebetween, the entire portion of the at least one pattern is the at least one pattern, the lead portion being the subpattern selected if the location of the subpattern selected is a beginning-location; and the at least one NFA includes a lag NFA and an umbrella NFA, the at least one walk direction includes a forward walk direction and a reverse walk direction, the lag NFA having the forward walk direction, the umbrella NFA having the reverse walk direction, the lag portion of the at least one pattern used for generating the lag NFA and the entire portion of the at least one pattern used for generating the umbrella NFA.
-
-
15. The security appliance of claim 14, wherein the unified DFA includes:
a DFA node associated with the last element of the subpattern selected, associated with metadata indicating to the at least one processor, a pointer to a starting node of the lag NFA, an instruction to transition to walk the lag NFA in the forward walk direction, the starting node of the lag NFA associated with a first element of the lag portion, and to report a match of the subpattern selected and an offset within the payload, of a character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, and a length of the subpattern selected if the length is fixed.
-
16. The security appliance of claim 14, wherein the at least one NFA includes:
-
a lag node, associated with the last element of the at least one pattern, associated with metadata indicating to the at least one processor, a pointer to a starting node of the umbrella NFA, an instruction to transition to walk the umbrella NFA in the reverse walk direction, the starting node of the umbrella NFA associated with the last element of the at least one pattern, and to optionally report an offset within the payload, of a character matching the last element of the at least one pattern at the lag node, and to optionally report a match of the lag portion of the at least one pattern; and an umbrella node of the umbrella NFA, associated with the first element of the at least one pattern, associated with metadata indicating to the at least one processor, an instruction to terminate the walk and to report a final match of the at least one pattern and a start offset within the payload, of a start character matching the first element of the at least one pattern at the umbrella node, as a start offset of the at least one pattern, if required by a qualifier associated with the at least one pattern.
-
-
17. The security appliance of claim 1, wherein if a first element of the subpattern selected is not a first element of the at least one pattern, and a last element of the subpattern selected is not a last element of the at least one pattern, the location of the subpattern selected is a mid-location of the at least one pattern, and if the first element of the subpattern selected is the first element of the at least one pattern, the location of the subpattern selected is a beginning-location of the at least one pattern, and if the length of the subpattern is fixed or variable:
-
the portion of the at least one pattern for generating the at least one NFA includes a lag portion and a lead portion of the at least one pattern, the lag portion of the at least one pattern is the at least one pattern excluding the lead portion of the at least one pattern, the lead portion including the first element of the at least one pattern, the last element of the subpattern selected, and all elements in the at least one pattern therebetween, the lag portion being the subpattern selected if the location of the subpattern selected is the beginning-location; and the at least one NFA includes a lag NFA and a lead NFA, the at least one walk direction includes a forward walk direction and a reverse walk direction, the lag NFA having the forward walk direction, the lead NFA having the reverse walk direction, the lag portion of the at least one pattern used for generating the lag NFA and the lead portion of the at least one pattern used for generating the lead NFA.
-
-
18. The security appliance of claim 17, wherein the unified DFA includes:
a DFA node associated with the last element of the subpattern selected, associated with metadata indicating to the at least one processor, a pointer to a starting node of the lag NFA, an instruction to transition to walk the lag NFA in the forward walk direction, the starting node of the lag NFA associated with a first element of the lag portion, a pointer to a starting node of the lead NFA, an instruction to transition to walk the lead NFA in the reverse walk direction, the starting node of the lead NFA associated with a last element of the subpattern selected, and to report a match of the subpattern selected and an offset within the payload, of a character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, and a length of the subpattern selected if the length is fixed.
-
19. The security appliance of claim 17, wherein the at least one NFA includes:
-
a lag node associated with the last element of the at least one pattern, associated with metadata indicating to the at least one processor, an instruction to terminate walking the lag NFA, and to report a lag offset within the payload, of a lag character matching the last element of the at least one pattern at the lag node, and to report a match of the lag portion of the at least one pattern; and a lead node associated with the first element of the at least one pattern, associated with metadata indicating to the at least one processor, an instruction to terminate walking the lead NFA and to report a match of the lead portion and a lead offset within the payload, of a lead character matching the first element of the at least one pattern at the lead node.
-
-
20. The security appliance of claim 1, wherein if a first element of the subpattern selected is not a first element of the at least one pattern, and a last element of the subpattern selected is not a last element of the at least one pattern, the location of the subpattern selected is a mid-location of the at least one pattern, and if the length of the subpattern selected is fixed or variable:
the at least one NFA is a single NFA, and the at least one walk direction includes a forward walk direction, for run time processing nodes of the single NFA associated with elements of a lag portion of the at least one pattern, and a reverse walk direction, for run time processing nodes of the single NFA associated with all elements of the at least one pattern, the lag portion of the at least one pattern being the at least one pattern excluding a lead portion of the at least one pattern, the lead portion including the first element of the at least one pattern, the last element of the subpattern selected, and all elements in the at least one pattern therebetween.
-
21. The security appliance of claim 20, wherein the unified DFA includes:
a DFA node associated with the last element of the subpattern selected, associated with metadata indicating to the at least one processor, a pointer to a starting node of the single NFA, an instruction to transition to walk the single NFA in the forward walk direction, the starting node associated with a next element in the at least one pattern immediately following the last element of the subpattern selected, and to report a match of the subpattern selected, an offset within the payload, of a character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, and a length of the subpattern selected if the length is fixed.
-
22. The security appliance of claim 20, wherein the at least one NFA includes:
-
a lag node associated with a last element of the at least one pattern, associated with metadata indicating to the at least one processor, an instruction to transition to walk the single NFA in the reverse walk direction using a payload starting offset associated with the end offset of the subpattern selected; and a lead node associated with the first element of the at least one pattern, associated with metadata indicating to the at least one processor, an instruction to terminate the walk, and to report an offset within the payload, of a character matching the first element of the at least one pattern at the lead node, as a start offset of the at least one pattern, if required by a qualifier associated with the at least one pattern, and a final match of the at least one pattern.
-
-
23. The security appliance of claim 1, wherein if a first element of the subpattern selected is not a first element of the at least one pattern, and a last element of the subpattern selected is not a last element of the at least one pattern, the location of the subpattern selected is a mid-location of the at least one pattern, and if the length of the subpattern selected is fixed:
the at least one NFA is a single NFA, and the at least one walk direction includes a reverse walk direction, for run time processing nodes of the single NFA associated with a lead portion of the at least one pattern, and a forward walk direction, for run time processing nodes of the single NFA associated with all elements of the at least one pattern, the lead portion being the at least one pattern excluding a lag portion of the at least one pattern, the lag portion including the first element of the subpattern selected, the last element of the at least one pattern, and all elements in the at least one pattern therebetween.
-
24. The security appliance of claim 23, wherein the unified DFA includes:
a DFA node associated with the last element of the subpattern selected, associated with metadata indicating to the at least one processor, a pointer to a starting node of the single NFA, an instruction to transition to walk the single NFA in the reverse walk direction, the starting node associated with a last element of the lead portion, a payload starting offset being determined by subtracting a length of the subpattern selected from the end offset of the subpattern selected, and to report a match of the subpattern selected, an offset within the payload, of a character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, and the length of the subpattern selected.
-
25. The security appliance of claim 23, wherein the at least one NFA includes:
-
a lead node associated with a first element of the at least one pattern, associated with metadata indicating to the at least one processor, an instruction to transition to walk the single NFA in the forward walk direction; and a lag node associated with the last element of the at least one pattern, associated with metadata indicating to the at least one processor, an instruction to terminate the walk, and to report an offset within the payload, of a character matching the last element of the at least one pattern at the lag node, and a final match of the at least one pattern.
-
-
26. The security appliance of claim 1, wherein if a last element of the subpattern selected is a last element of the at least one pattern, the location of the subpattern selected is an end-location of the at least one pattern, and if the length of the subpattern selected is fixed, the portion of the at least one pattern for generating the at least one NFA is the at least one pattern excluding the subpattern selected, and the at least one walk direction is a reverse walk direction.
-
27. The security appliance of claim 26, wherein the unified DFA includes:
a DFA node corresponding to the last element of the subpattern selected, associated with metadata indicating to the at least one processor, a pointer to a starting node of the at least one NFA, an instruction to transition to walk the at least one NFA in a reverse walk direction, the starting node of the at least one NFA associated with a last element of the portion, and to report a match of the subpattern selected and an offset within the payload, of a character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, a payload starting offset of the at least one NFA determined by subtracting a length of the subpattern selected from the end offset of the subpattern selected, if the length is fixed.
-
28. The security appliance of claim 26, wherein the at least one NFA includes:
an NFA node associated with a first element of the portion, associated with metadata indicating to the at least one processor, to terminate the walk and to report a final match of the at least one pattern and an offset within the payload, of a character matching the first element of the portion at the NFA node, as a start offset of the at least one pattern, if required by a qualifier associated with the at least one pattern.
-
29. The security appliance of claim 1, wherein if a last element of the subpattern selected is a last element of the at least one pattern, the location of the subpattern selected is an end-location of the at least one pattern, and if the length of the subpattern selected is variable or fixed, the portion of the at least one pattern for generating the at least one NFA is the at least one pattern, and the at least one walk direction is a reverse walk direction.
-
30. The security appliance of claim 29, wherein the unified DFA includes:
a DFA node corresponding to the last element of the subpattern selected, associated with metadata indicating to the at least one processor, a pointer to a starting node of the at least one NFA, an instruction to transition to walk the at least one NFA in a reverse walk direction, the starting node of the at least one NFA associated with a last element of the subpattern selected, and to report a match of the subpattern selected and an offset within the payload, of a character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, and a length of the subpattern selected if the length is fixed, a payload starting offset of the at least one NFA being associated with the end offset of the subpattern selected.
-
31. The security appliance of claim 29, wherein the at least one NFA includes:
an NFA node associated with a first element of the portion, associated with metadata indicating to the at least one processor, to terminate the walk and to report a final match of the at least one pattern and an offset within the payload, of a character matching the first element of the portion at the NFA node, as a start offset of the at least one pattern, if required by a qualifier associated with the at least one pattern.
-
32. The security appliance of claim 1, wherein the unified DFA and the at least one NFA are stored as a binary image including the unified DFA and the at least one NFA.
-
33. The security appliance of claim 1, wherein the at least one processor includes a DFA co-processor and an NFA co-processor configured as an acceleration unit to offload DFA and NFA run time processing, respectively.
-
2. The security appliance of claim 1, wherein the at least one processor is further configured to report a match of the at least one pattern in the payload based on traversing an NFA node, of the at least one NFA, associated with metadata indicating a final match of the at least one pattern.
-
-
34. A method comprising:
in at least one processor operatively coupled to at least one memory in a security appliance operatively coupled to a network; walking characters of a payload in an input stream through a unified deterministic finite automata (DFA) stored in the at least one memory, by traversing nodes of the unified DFA with characters from the payload, the unified DFA generated from subpatterns selected from each pattern in a set of one or more regular expression patterns based on at least one heuristic; and walking characters of the payload through at least one non-deterministic finite automata (NFA) stored in the at least one memory, by traversing nodes of the at least one NFA with characters from the payload, the at least one NFA generated for at least one pattern in the set, a portion of the at least one pattern used for generating the at least one NFA, and at least one walk direction for walking characters through the at least one NFA, being based on whether a length of a subpattern selected from the at least one pattern is fixed or variable and a location of the subpattern selected within the at least one pattern to optimize performance of run time processing of the at least one processor for identifying an existence of the at least one pattern in the input stream. - View Dependent Claims (35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66)
-
35. The method of claim 34, further including reporting a match of the at least one pattern in the payload based on traversing an NFA node, of the at least one NFA, associated with metadata indicating a final match of the at least one pattern.
-
36. The method of claim 34, further including:
-
associating a transaction identifier for a given walk of the DFA and the at least one NFA for matching the at least one pattern in the payload; and reporting a match of the at least one pattern in the payload based on; traversing a DFA node of the unified DFA having metadata indicating a DFA partial match of the at least one pattern; subsequently traversing at least one NFA node of the at least one NFA having metadata indicating an NFA partial match of the at least one pattern; and correlating the traversing and the subsequent traversing with the transaction identifier.
-
-
37. The method of claim 34, further including reporting an offset, of a character in the payload matching a first element of the at least one pattern, as a start offset for the at least one pattern in the payload, based on:
-
metadata associated with an NFA node of the at least one NFA and indicating a final match for the at least one pattern in the payload; and metadata associated with a DFA node of the unified DFA and indicating (i) a length, of the subpattern selected for the at least one pattern, and (ii) a subpattern end offset, of a subpattern character in the payload matching a last element of the subpattern selected for the at least one pattern, at the DFA node, the start offset being determined by the at least one processor based on subtracting the length from the subpattern end offset.
-
-
38. The method of claim 34, further including reporting an offset, of a character in the payload matching a first element of the at least one pattern, at an NFA node of the at least one NFA, as a start offset for the at least one pattern in the payload, based on correlating partial match results indicated in metadata associated with nodes of the unified DFA and the at least one NFA for the at least one pattern.
-
39. The method of claim 34, further including reporting an offset, of a character in the payload matching a first element of the at least one pattern, at an NFA node of the at least one NFA, as a start offset for the at least one pattern in the payload, based on metadata associated with the NFA node and a final match determined for the at least one pattern in the payload at the NFA node.
-
40. The method of claim 34, wherein the at least one heuristic includes maximizing a number of unique subpatterns selected and length of each subpattern selected, the length of each subpattern selected having at least a minimum threshold length.
-
41. The method of claim 34, wherein if a first element of the subpattern selected is a first element of the at least one pattern and the length of the subpattern selected is fixed, the location of the subpattern selected is a beginning-location of the at least one pattern, the portion of the at least one pattern used for generating the at least one NFA is the at least one pattern excluding the subpattern selected, the at least one NFA is a single NFA, and the at least one walk direction of the at least one NFA is a forward walk direction.
-
42. The method of claim 41, further including, at a DFA node of the unified DFA, associated with the last element of the subpattern selected and metadata indicating to the at least one processor a pointer to a starting node of the at least one NFA:
-
transitioning to walk the at least one NFA in a forward walk direction, the starting node of the at least one NFA associated with a first element of the portion of the at least one pattern used for generating the at least one NFA, a payload starting offset of the at least one NFA associated with an offset of a byte subsequent to another byte at the end offset of the subpattern selected; and reporting a match of the subpattern selected, a lead offset within the payload, of a lead character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, and a length of the subpattern selected.
-
-
43. The method of claim 41, further including, at an NFA node of the at least one NFA, associated with metadata:
-
terminating the walk, the NFA node associated with a last element of the at least one pattern; and reporting a lag offset within the payload, of a lag character matching at the NFA node, as an end offset of the at least one pattern and a final match of the at least one pattern.
-
-
44. The method of claim 34, wherein if a first element of the subpattern selected is not a first element of the at least one pattern and a last element of the subpattern selected is not a last element of the at least one pattern, the location of the subpattern selected is a mid-location of the at least one pattern, and if the length of the subpattern selected is fixed:
-
the portion of the at least one pattern for generating the at least one NFA includes a lag portion and a lead portion of the at least one pattern, the lag portion of the at least one pattern is the at least one pattern excluding the subpattern selected and the lead portion of the at least one pattern, the lead portion of the at least one pattern excludes the subpattern selected and the lag portion of the at least one pattern; and the at least one NFA includes a lag NFA and a lead NFA, the at least one walk direction includes a forward walk direction and a reverse walk direction, the lag NFA having the forward walk direction, the lead NFA having the reverse walk direction, the lag portion of the at least one pattern used for generating the lag NFA and the lead portion of the at least one pattern used for generating the lead NFA.
-
-
45. The method of claim 34, further including, at a DFA node of the unified DFA, associated with the last element of the subpattern selected and metadata indicating to the at least one processor, a pointer to a starting node of the lag NFA and a pointer to a starting node of the lead NFA:
-
transitioning walking of the unified DFA to walking the lag NFA in the forward walk direction, the starting node of the lag NFA associated with a first element of the lag portion; transitioning walking the lag NFA to walking the lead NFA in the reverse walk direction, the starting node of the lead NFA associated with a last element of the lead portion; and reporting an offset within the payload, of a character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, a match of the subpattern selected, and a length of the subpattern selected.
-
-
46. The method of claim 34, wherein the method further includes:
-
at a lag node of the lag NFA, associated with the last element of the at least one pattern, associated with metadata; terminating walking the lag NFA; and reporting a lag offset within the payload, of a lag character of the payload matching the last element at the lag node, and a match of the lag portion of the at least one pattern; and at a lead node of the lead NFA, associated with the first element of the at least one pattern, associated with metadata; terminating walking the lead NFA; and reporting a match of the lead portion of the at least one pattern and a lead offset within the payload, of a lead character of the payload matching the first element at the lead node, as a start offset of the at least one pattern, if required by a qualifier associated with the at least one pattern.
-
-
47. The method of claim 34, wherein if a first element of the subpattern selected is not a first element of the at least one pattern, and a last element of the subpattern selected is not a last element of the at least one pattern, the location of the subpattern selected is a mid-location of the at least one pattern, and if the first element of the subpattern selected is the first element of the at least one pattern, the location of the subpattern selected is the beginning-location of the at least one pattern, and if the length of the subpattern is fixed or variable:
-
the portion of the at least one pattern for generating the at least one NFA includes a lag portion and an entire portion of the at least one pattern, the lag portion of the at least one pattern is the at least one pattern excluding a lead portion of the at least one pattern, the lead portion including the first element of the at least one pattern, the last element of the subpattern selected, and all elements in the at least one pattern therebetween, the entire portion of the at least one pattern is the at least one pattern, the lead portion being the subpattern selected if the location of the subpattern selected is a beginning-location; and the at least one NFA includes a lag NFA and an umbrella NFA, the at least one walk direction includes a forward walk direction and a reverse walk direction, the lag NFA having the forward walk direction, the umbrella NFA having the reverse walk direction, the lag portion of the at least one pattern used for generating the lag NFA and the entire portion of the at least one pattern used for generating the umbrella NFA.
-
-
48. The method of claim 47, wherein the method further includes, at a DFA node of the unified DFA, associated with the last element of the subpattern selected, associated with metadata indicating to the at least one processor, a pointer to a starting node of the lag NFA:
-
transitioning walking of the unified DFA to walking the lag NFA in the forward walk direction, the starting node of the lag NFA associated with a first element of the lag portion; and reporting a match of the subpattern selected and an offset within the payload, of a character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, and a length of the subpattern selected if the length is fixed.
-
-
49. The method of claim 47, wherein the method further includes:
-
at a lag node of the at least one NFA, associated with the last element of the at least one pattern, associated with metadata indicating to the at least one processor, a pointer to a starting node of the umbrella NFA; transitioning walking of the lag NFA to walking the umbrella NFA in the reverse walk direction, the starting node of the umbrella NFA associated with the last element of the at least one pattern; and optionally reporting an offset within the payload, of a character matching the last element of the at least one pattern at the lag node; and optionally reporting a match of the lag portion of the at least one pattern; and at an umbrella node of the umbrella NFA, associated with the first element of the at least one pattern, associated with metadata; terminating the walk; and reporting a final match of the at least one pattern and a start offset within the payload, of a start character matching the first element of the at least one pattern at the umbrella node, as a start offset of the at least one pattern, if required by a qualifier associated with the at least one pattern.
-
-
50. The method of claim 34, wherein if a first element of the subpattern selected is not a first element of the at least one pattern, and a last element of the subpattern selected is not a last element of the at least one pattern, the location of the subpattern selected is a mid-location of the at least one pattern, and if the first element of the subpattern selected is the first element of the at least one pattern, the location of the subpattern selected is a beginning-location of the at least one pattern, and if the length of the subpattern is fixed or variable:
-
the portion of the at least one pattern for generating the at least one NFA includes a lag portion and a lead portion of the at least one pattern, the lag portion of the at least one pattern is the at least one pattern excluding the lead portion of the at least one pattern, the lead portion including the first element of the at least one pattern, the last element of the subpattern selected, and all elements in the at least one pattern therebetween, the lag portion being the subpattern selected if the location of the subpattern selected is the beginning-location; and the at least one NFA includes a lag NFA and a lead NFA, the at least one walk direction includes a forward walk direction and a reverse walk direction, the lag NFA having the forward walk direction, the lead NFA having the reverse walk direction, the lag portion of the at least one pattern used for generating the lag NFA and the lead portion of the at least one pattern used for generating the lead NFA.
-
-
51. The method of claim 34, further including, at a DFA node of the unified DFA, associated with the last element of the subpattern selected, associated with metadata indicating to the at least one processor, a pointer to a starting node of the lag NFA and a pointer to a starting node of the lead NFA:
-
transitioning walking of the unified DFA to walking the lag NFA in the forward walk direction, the starting node of the lag NFA associated with a first element of the lag portion; and transitioning walking of the unified DFA to walking the lead NFA in the reverse walk direction, the starting node of the lead NFA associated with a last element of the subpattern selected; and reporting a match of the subpattern selected and an offset within the payload, of a character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, and a length of the subpattern selected if the length is fixed.
-
-
52. The method of claim 34, further including:
-
at a lag node of the at least one NFA, associated with the last element of the at least one pattern, associated with metadata; terminating walking the lag NFA, and the method further includes reporting a lag offset within the payload, of a lag character matching the last element of the at least one pattern at the lag node, and reporting a match of the lag portion of the at least one pattern; and at a lead node of the at least one NFA, associated with the first element of the at least one pattern, associated with metadata; terminating walking the lead NFA; and reporting a match of the lead portion and a lead offset within the payload, of a lead character matching the first element of the at least one pattern at the lead node.
-
-
53. The method of claim 34, wherein if a first element of the subpattern selected is not a first element of the at least one pattern, and a last element of the subpattern selected is not a last element of the at least one pattern, the location of the subpattern selected is a mid-location of the at least one pattern, and if the length of the subpattern selected is fixed or variable:
the at least one NFA is a single NFA, and the at least one walk direction includes a forward walk direction, for run time processing nodes of the single NFA associated with elements of a lag portion of the at least one pattern, and a reverse walk direction, for run time processing nodes of the single NFA associated with all elements of the at least one pattern, the lag portion of the at least one pattern being the at least one pattern excluding a lead portion of the at least one pattern, the lead portion including the first element of the at least one pattern, the last element of the subpattern selected, and all elements in the at least one pattern therebetween.
-
54. The method of claim 53, further including, at an DFA node of the unified DFA, associated with the last element of the subpattern selected, associated with metadata indicating to the at least one processor, a pointer to a starting node of the single NFA:
-
transitioning walking the unified DFA to walking the single NFA in the forward walk direction, the starting node associated with a next element in the at least one pattern immediately following the last element of the subpattern selected; and reporting a match of the subpattern selected, an offset within the payload, of a character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, and a length of the subpattern selected if the length is fixed.
-
-
55. The method of claim 53, further including:
-
at a lag node of the at least one NFA, associated with a last element of the at least one pattern, associated with metadata; transitioning from walking the unified DFA to walking the single NFA in the reverse walk direction using a payload starting offset associated with the end offset of the subpattern selected; and at a lead node of the at least one NFA, associated with the first element of the at least one pattern, associated with metadata; terminating the walk; and reporting an offset within the payload, of a character matching the first element of the at least one pattern at the lead node, as a start offset of the at least one pattern, if required by a qualifier associated with the at least one pattern, and a final match of the at least one pattern.
-
-
56. The method of claim 34, wherein if a first element of the subpattern selected is not a first element of the at least one pattern, and a last element of the subpattern selected is not a last element of the at least one pattern, the location of the subpattern selected is a mid-location of the at least one pattern, and if the length of the subpattern selected is fixed:
the at least one NFA is a single NFA, and the at least one walk direction includes a reverse walk direction, for run time processing nodes of the single NFA associated with a lead portion of the at least one pattern, and a forward walk direction, for run time processing nodes of the single NFA associated with all elements of the at least one pattern, the lead portion being the at least one pattern excluding a lag portion of the at least one pattern, the lag portion including the first element of the subpattern selected, the last element of the at least one pattern, and all elements in the at least one pattern therebetween.
-
57. The method of claim 56, the method further including:
-
at a DFA node of the unified DFA, associated with the last element of the subpattern selected, associated with metadata indicating to the at least one processor, a pointer to a starting node of the single NFA; transitioning walking of the unified DFA to walking the single NFA in the reverse walk direction, the starting node associated with a last element of the lead portion, a payload starting offset being determined by subtracting a length of the subpattern selected from the end offset of the subpattern selected; and reporting a match of the subpattern selected, an offset within the payload, of a character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, and the length of the subpattern selected.
-
-
58. The method of claim 56, the method further including:
-
at a lead node of the single NFA, associated with a first element of the at least one pattern, associated with metadata; walking the single NFA in the forward walk direction; and at a lag node of the single NFA, associated with the last element of the at least one pattern, associated with metadata; terminating the walk; and reporting an offset within the payload, of a character matching the last element of the at least one pattern at the lag node, and a final match of the at least one pattern.
-
-
59. The method of claim 34, wherein if a last element of the subpattern selected is a last element of the at least one pattern, the location of the subpattern selected is an end-location of the at least one pattern, and if the length of the subpattern selected is fixed, the portion of the at least one pattern for generating the at least one NFA is the at least one pattern excluding the subpattern selected, and the at least one walk direction is a reverse walk direction.
-
60. The method of claim 59, further including:
-
at a DFA node of the unified DFA, corresponding to the last element of the subpattern selected, associated with metadata indicating to the at least one processor, a pointer to a starting node of the at least one NFA; transitioning walking of the unified DFA to walking the at least one NFA in a reverse walk direction, the starting node of the at least one NFA associated with a last element of the portion; and reporting a match of the subpattern selected and an offset within the payload, of a character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, a payload starting offset of the at least one NFA determined by subtracting a length of the subpattern selected from the end offset of the subpattern selected, if the length is fixed.
-
-
61. The method of claim 59, further including:
at an NFA node of the at least one NFA, associated with a first element of the portion, associated with metadata; terminating the walk; and reporting a final match of the at least one pattern and an offset within the payload, of a character matching the first element of the portion at the NFA node, as a start offset of the at least one pattern, if required by a qualifier associated with the at least one pattern.
-
62. The method of claim 34, wherein if a last element of the subpattern selected is a last element of the at least one pattern, the location of the subpattern selected is an end-location of the at least one pattern, and if the length of the subpattern selected is variable or fixed, the portion of the at least one pattern for generating the at least one NFA is the at least one pattern, and the at least one walk direction is a reverse walk direction.
-
63. The method of claim 62, further including:
at a DFA node of the unified DFA, corresponding to the last element of the subpattern selected, associated with metadata indicating to the at least one processor, a pointer to a starting node of the at least one NFA; transitioning walking of the unified DFA to walking the at least one NFA in a reverse walk direction, the starting node of the at least one NFA associated with a last element of the subpattern selected; and reporting a match of the subpattern selected and an offset within the payload, of a character matching the last element of the subpattern selected at the DFA node, as an end offset of the subpattern selected, and a length of the subpattern selected if the length is fixed, a payload starting offset of the at least one NFA being associated with the end offset of the subpattern selected.
-
64. The method of claim 62, further including:
at an NFA node of the at least one NFA, associated with a first element of the portion, associated with metadata; terminating the walk; and reporting a final match of the at least one pattern and an offset within the payload, of a character matching the first element of the portion at the NFA node, as a start offset of the at least one pattern, if required by a qualifier associated with the at least one pattern.
-
65. The method of claim 34, wherein the unified DFA and the at least one NFA are stored as a binary image including the unified DFA and the at least one NFA.
-
66. The method of claim 34, wherein the at least one processor includes a DFA co-processor and an NFA co-processor configured as an acceleration unit to offload DFA and NFA run time processing, respectively.
-
35. The method of claim 34, further including reporting a match of the at least one pattern in the payload based on traversing an NFA node, of the at least one NFA, associated with metadata indicating a final match of the at least one pattern.
-
67. A non-transitory computer-readable medium having stored thereon a sequence of instructions which, when loaded and executed by a processor, causes the processor to:
-
walk characters of a payload in an input stream through a unified deterministic finite automata (DFA) stored in the at least one memory, by traversing nodes of the unified DFA with characters from the payload, the unified DFA generated from subpatterns selected from each pattern in a set of one or more regular expression patterns based on at least one heuristic; and walk characters of the payload through at least one non-deterministic finite automata (NFA) stored in the at least one memory, by traversing nodes of the at least one NFA with characters from the payload, the at least one NFA generated for at least one pattern in the set, a portion of the at least one pattern used for generating the at least one NFA, and at least one walk direction for walking characters through the at least one NFA, being based on whether a length of a subpattern selected from the at least one pattern is fixed or variable and a location of the subpattern selected within the at least one pattern to optimize performance of run time processing of the processor for identifying an existence of the at least one pattern in the input stream.
-
Specification
- Resources
Thank you for your request. You will receive a custom alert email when the Litigation Campaign Assessment is available.
×
-
Current AssigneeMarvell Asia Pte Limited (Marvell Technology Group Limited)
-
Original AssigneeCavium, LLC (Marvell Technology Group Limited)
-
InventorsBilla, Satyanarayana Lakshmipathi, Goyal, Rajan
-
Primary Examiner(s)DADA, BEEMNET W
-
Application NumberUS14/015,929Publication NumberTime in Patent Office1,089 DaysField of SearchUS Class Current1/1CPC Class CodesH04L 63/0245 Filtering by information in...H04L 63/1408 by monitoring network traff...