Distributed event processing
First Claim
1. A data processing system comprising:
- a complex event processing engine to distribute real-time data processing operations between server computing devices, the complex-event processing engine being configured to;
load configuration data defining processing stages in a processing pipeline, the processing pipeline being configured to process at least one data record stream;
instruct a distributed assignment of computing instances across the server computing devices, each server computing device loading a computing instance into memory for execution by at least one processor, the computing instances implementing each processing stage in parallel; and
instruct data distribution between the computing instances implementing each processing stage,wherein, for a stateful processing stage, data distribution is performed based on a composite key computed according to a data processing operation performed by the stateful processing stage such that input data records from a previous processing stage having a common composite key value are sent to a common computing instance for the stateful processing stage.
2 Assignments
0 Petitions
Accused Products
Abstract
Certain examples described herein provide a data processing system and method adapted for event processing. These examples provide for distribution of data processing operations between server computing devices. In one case, a plurality of processing stages are implemented using computing instances on the server computing devices. In this case, the computing instances are assigned to the server computing devices in order to perform at least one data processing operation in parallel. Certain examples described herein then provide for the distribution of data between computing instances such that parallelism is maintained for data processing operations. In certain cases, a composite key is used. In this case, a composite key value is computed for a set of data fields associated with a data item to be processed. This key value is computed based on a data processing operation to be performed. The key value is used to route the data item to an associated computing instance implementing the data processing operation.
-
Citations
15 Claims
-
1. A data processing system comprising:
a complex event processing engine to distribute real-time data processing operations between server computing devices, the complex-event processing engine being configured to; load configuration data defining processing stages in a processing pipeline, the processing pipeline being configured to process at least one data record stream; instruct a distributed assignment of computing instances across the server computing devices, each server computing device loading a computing instance into memory for execution by at least one processor, the computing instances implementing each processing stage in parallel; and instruct data distribution between the computing instances implementing each processing stage, wherein, for a stateful processing stage, data distribution is performed based on a composite key computed according to a data processing operation performed by the stateful processing stage such that input data records from a previous processing stage having a common composite key value are sent to a common computing instance for the stateful processing stage. - View Dependent Claims (2, 3, 4, 5, 6, 7)
-
8. A distributed data processing method comprising:
-
obtaining a data item output by a first computing instance, the first computing instance forming part of a plurality of computing instances configured to implement in parallel a first processing stage in a plurality of interconnected processing stages, the data item being associated with an event occurring at a given time; determining, from data defining the plurality of interconnected processing stages, a second processing stage in the plurality of interconnected processing stages configured to receive data from the first processing stage; determining whether the second processing stage is configured to process each data item independently of other data items; and responsive to a determination that the second processing stage is not configured to process each data item independently of other data items; computing a composite key value from fields associated with the obtained data item; determining a second computing instance corresponding to the computed composite key value from a plurality of computing instances implementing the second processing stage in parallel; and sending the data item output by the first computing instance to the determined second computing instance. - View Dependent Claims (9, 10, 11, 12, 13, 14)
-
-
15. A non-transitory machine readable medium comprising instructions which, when loaded into memory and executed by at least one processor of a complex event processing engine, cause the at least one processor to:
-
retrieve a configuration file defining an event processing pipeline, the event processing pipeline comprising a plurality of event processing operations and a plurality of connections coupling said operations indicating a logical sequence for the event processing pipeline; initialize a plurality of computing instances for each event processing operation across a cluster of server computing devices, the plurality of computing instances being configured to perform the event processing operation in parallel; obtain events from at least one real-time event stream, each event comprising at !east one data field; and distribute the obtained events between the plurality of computing instances, wherein each computing instance is configured to receive input data associated with an event and to output data following performance of the event processing operation, and wherein, for an event processing operation based on a relationship between events the instructions cause the at least one processor to; route events from computing instances of a previous event processing operation to computing instances of a subsequent event processing operation based on a compound index defined based on the subsequent event processing operation, such that events with the same compound index are sent to the same computing instance.
-
Specification