×

System and method for determining the start of a match of a regular expression

  • US 7,305,391 B2
  • Filed: 02/06/2004
  • Issued: 12/04/2007
  • Est. Priority Date: 02/07/2003
  • Status: Active Grant
First Claim
Patent Images

1. A method for determining the start states of each rule of a plurality of rules, each rule being utilized for pattern recognition of an input character string, and generating a multi-rule deterministic finite state automaton (DFA), which comprises the steps of:

  • prepending to each rule the metacharacters “

    .*” and

    transforming each rule prepended with the metacharacters to a non-deterministic finite state automaton (NFA) using a Thompson Construction;

    analyzing the NFA for each rule to determine NFA start states by the following substeps;

    a) producing an epsilon closure starting at the initial state of the NFA;

    b) producing a 1-closure of the initial epsilon closure;

    c) comparing the states in the initial epsilon closure with the states in the 1 closure; and

    d) determining as NFA start states all states in the 1-closure which are not in the initial epsilon closure;

    converting the NFA for each rule into a DFA using an NFA to DFA algorithm, thereby creating a DFA state from one or more NFA states;

    for each DFA state that contains an NFA start state, determining the distance of the DFA state from the global start of the DFA for each rule;

    comparing the distances of the DFA start states that contain an NFA start state from the global start state;

    choosing as a DFA start state the DFA state containing an NFA start state which is closest to the global start state;

    if more than one DFA state containing an NFA start state have the same closest distance to the global start state, accepting as DFA start states each of said closest DFA states having the same closest distance to the global start state;

    creating a new NFA start state and inserting an epsilon transition from the new NFA start state to each of the DFA'"'"'s for each rule, thereby creating a meta-NFA;

    converting the meta-NFA to a final multi-rule DFA storing said multi-rule DFA in a memory; and

    applying said final multi-rule DFA to conduct pattern matching on said input character string.

View all claims
  • 9 Assignments
Timeline View
Assignment View
    ×
    ×