Deterministic finite automata (DFA) graph compression
First Claim
1. A computer implemented method for generating a deterministic finite automata-based graph comprising:
- storing a graph of expressions in memory, the graph including a plurality of nodes, each node having arcs, each arc being for a character and connecting a node to a next node;
during compilation of the graph;
identifying at least one node of the plurality of nodes in the graph as a designated node;
for each node of the plurality of nodes, except the designated node, comparing each arc of a subject node having a plurality of arcs with each arc of the designated node to find an arc of the subject node that is for an identical character and that connects the subject node to an identical next node in the graph as an arc of the designated node, the arc found being called a non-unique arc; and
removing the non-unique arc from the subject node without reducing the number of nodes in the plurality of nodes.
7 Assignments
0 Petitions
Accused Products
Abstract
An apparatus, and corresponding method, for generating a graph used in performing a search for a match of at least one expression in an input stream is presented. The graph includes a number of interconnected nodes connected solely by valid arcs. A valid arc of a current node represents a character match in an expression of a character associated with the current node. Arcs which are not valid may be pruned. Non-valid arcs may include arcs which point back to a designated node(s), or arcs that point to the same next node as the designated node(s) for the same character. Typically, the majority of arcs associated with a node are non-valid. Therefore, pruning the non-valid arcs may greatly reduce graph storage requirements.
121 Citations
17 Claims
-
1. A computer implemented method for generating a deterministic finite automata-based graph comprising:
-
storing a graph of expressions in memory, the graph including a plurality of nodes, each node having arcs, each arc being for a character and connecting a node to a next node; during compilation of the graph; identifying at least one node of the plurality of nodes in the graph as a designated node; for each node of the plurality of nodes, except the designated node, comparing each arc of a subject node having a plurality of arcs with each arc of the designated node to find an arc of the subject node that is for an identical character and that connects the subject node to an identical next node in the graph as an arc of the designated node, the arc found being called a non-unique arc; and removing the non-unique arc from the subject node without reducing the number of nodes in the plurality of nodes. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
-
-
9. An apparatus comprising:
-
a memory unit configured to store a graph of expressions including a plurality of nodes, each node having arcs, each arc being for a character and connecting a node to a next node; and a processor configured to; during compilation of the graph; identify at least one node of the plurality of nodes in the graph as a designated node; for each node of the plurality of nodes, except the designated node, compare each arc of a subject node having a plurality of arcs with each arc of the designated node to find an arc of the subject node that is for an identical character and that connects the subject node to an identical next node in the graph as an arc of the designated node, the arc found being called a non-unique arc; and remove the non-unique arc from the subject node without reducing the number of nodes in the plurality of nodes. - View Dependent Claims (10, 11, 12, 13, 14, 15, 16)
-
-
17. A non-transitory computer usable medium storing instructions for generating a deterministic finite automata-based graph that, when executed by a computer, cause the computer to:
-
store a graph of expressions in memory, the graph including a plurality of nodes, each node having arcs, each arc being for a character and connecting a node to a next node; during compilation of the graph; identify at least one node of the plurality of nodes in the graph as a designated node; for each node of the plurality of nodes, except the designated node, compare each arc of a subject node having a plurality of arcs with each arc of the designated node to find an arc among the plurality of arcs of the subject node that is for an identical character and that connects the subject node to an identical next node in the graph as an arc of the designated node, the arc found being called a non-unique arc; and remove the non-unique arc from the subject node without reducing the number of nodes in the plurality of nodes.
-
Specification