Techniques for performing regular expression-based pattern matching in data streams
First Claim
Patent Images
1. A method of detecting a pattern in a data stream comprising events, the method comprising:
- receiving, at a processing system, including a processor, predicate information, the predicate information specifying a predicate associated with each symbol in a set of symbols, the predicate information including a first predicate associated with a first symbol in the set of symbols, and including a second predicate associated with a second symbol in the set of symbols;
receiving, at the processing system, a regular expression specifying a pattern, the regular expression comprising one or more symbols from the set of symbols;
constructing an automaton for the pattern;
upon receiving an event in the data stream;
determining, based upon the received event, states of the automaton and associated bindings, anddetermining if the pattern is matched due to the received event based upon the states and associated bindings;
in response to determining that the pattern is matched by a particular sequence of events in the data stream, continuing to determine whether subsequent events following the particular sequence of events in the data stream produce a match that is longer than a match produced by the particular sequence of events; and
outputting the match produced by the particular sequence of events only after determining that no match longer than the match produced by the particular sequence of events is possible based on the states and associated bindings.
1 Assignment
0 Petitions
Accused Products
Abstract
Techniques for detecting patterns in one or more data streams. A pattern to be detected may be specified using a regular expression. Events received in a data stream are processed during runtime to detect occurrences of the specified pattern in the data stream.
-
Citations
16 Claims
-
1. A method of detecting a pattern in a data stream comprising events, the method comprising:
-
receiving, at a processing system, including a processor, predicate information, the predicate information specifying a predicate associated with each symbol in a set of symbols, the predicate information including a first predicate associated with a first symbol in the set of symbols, and including a second predicate associated with a second symbol in the set of symbols; receiving, at the processing system, a regular expression specifying a pattern, the regular expression comprising one or more symbols from the set of symbols; constructing an automaton for the pattern; upon receiving an event in the data stream; determining, based upon the received event, states of the automaton and associated bindings, and determining if the pattern is matched due to the received event based upon the states and associated bindings; in response to determining that the pattern is matched by a particular sequence of events in the data stream, continuing to determine whether subsequent events following the particular sequence of events in the data stream produce a match that is longer than a match produced by the particular sequence of events; and outputting the match produced by the particular sequence of events only after determining that no match longer than the match produced by the particular sequence of events is possible based on the states and associated bindings. - View Dependent Claims (2, 3, 4, 5, 6)
-
-
7. A computer-readable non-transitory storage medium storing a plurality of instructions for controlling a processor to detect a pattern in a data stream comprising events, the plurality of instructions comprising:
-
instructions that cause the processor to receive predicate information, the predicate information specifying a predicate associated with each symbol in a set of symbols, the predicate information including a first predicate associated with a first symbol in the set of symbols, and including a second predicate associated with a second symbol in the set of symbols; instructions that cause the processor to receive a regular expression specifying a pattern, the regular expression comprising one or more symbols from the set of symbols; instructions that cause the processor to construct an automaton for the pattern; and instructions that cause the processor to, upon receiving an event in the data stream, determine, based upon the received event, states of the automaton and associated bindings, and determine if the pattern is matched due to the received event based upon the states and associated bindings; instructions that cause the processor to continue to determine, in response to determining that the pattern is matched by a particular sequence of events in the data stream, whether subsequent events following the particular sequence of events in the data stream produce a match that is longer than a match produced by the particular sequence of events; and instructions that cause the processor to output the match produced by the particular sequence of events only after determining that no match longer than the match produced by the particular sequence of events is possible based on the states and associated bindings. - View Dependent Claims (8, 9, 10, 11)
-
-
12. A system for detecting a pattern in a data stream comprising events, the system comprising:
-
a memory storing a plurality of instructions; and a processor coupled to the memory, the processor configured to execute the plurality of instructions to; receive predicate information, the predicate information specifying a predicate associated with each symbol in a set of symbols, the predicate information including a first predicate associated with a first symbol in the set of symbols, and including a second predicate associated with a second symbol in the set of symbols; receive a regular expression specifying a pattern, the regular expression comprising one or more symbols from the set of symbols; construct an automaton for the pattern; upon receiving an event in the data stream, determine, based upon the received event, states of the automaton and associated bindings, and determine if the pattern is matched due to the received event based upon the states and associated bindings; continue to determine, in response to determining that the pattern is matched by a particular sequence of events in the data stream, whether subsequent events following the particular sequence of events in the data stream produce a match that is longer than a match produced by the particular sequence of events; and output the match produced by the particular sequence of events only after determining that no match longer than the match produced by the particular sequence of events is possible based on the states and associated bindings. - View Dependent Claims (13, 14, 15, 16)
-
Specification