Providing strong ordering in multi-stage streaming processing
First Claim
1. A method of providing strong ordering in multi-stage processing of data streams, the method including:
- receiving, by a grid coordinator operating a computing grid that includes a plurality of physical threads which process data from one or more data streams in batches, current batch-stage information from a grid-scheduler comprising current-batch units and downstream batch-units that depend on completion of the current-batch units;
- determining, for a current batch-stage identified in the current batch-stage information, a batch-unit pending dispatch from the downstream batch-units;
- identifying one or more physical threads that processed batch-units for the current batch-stage on which the batch-unit pending dispatch depends and have registered pending tasks for the current batch-stage; and
- dispatching the batch-unit pending dispatch to the one or more identified physical threads subsequent to complete processing of the batch-units for the current batch-stage.
Abstract
The technology disclosed relates to providing strong ordering in multi-stage processing of near real-time (NRT) data streams. In particular, it relates to maintaining current batch-stage information for a batch at a grid-scheduler in communication with a grid-coordinator that controls dispatch of batch-units to the physical threads for a batch-stage. This includes operating a computing grid, and queuing data from the NRT data streams as batches in pipelines for processing over multiple stages in the computing grid. Also included is determining, for a current batch-stage, batch-units pending dispatch, in response to receiving the current batch-stage information; identifying physical threads that processed batch-units for a previous batch-stage on which the current batch-stage depends and have registered pending tasks for the current batch-stage; and dispatching the batch-units for the current batch-stage to the identified physical threads subsequent to complete processing of the batch-units for the previous batch-stage.
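The dispatch rule described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `BatchUnit`, `GridCoordinator`, `register_pending`, `mark_complete`, and `dispatch` are hypothetical, threads are represented by plain string identifiers, and the grid-scheduler is reduced to the list of pending units passed to `dispatch`. The sketch assumes a unit is held back until every previous-stage unit it depends on has completed, and is then sent only to threads that both processed those upstream units and registered pending tasks for the current stage.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchUnit:
    """Hypothetical batch-unit: an id, its stage, and the upstream units it needs."""
    unit_id: str
    stage: int
    depends_on: frozenset = frozenset()


class GridCoordinator:
    """Illustrative coordinator that enforces ordering between batch-stages."""

    def __init__(self):
        self.completed = {}                    # unit_id -> thread that processed it
        self.pending_tasks = defaultdict(set)  # stage -> threads with registered tasks

    def register_pending(self, thread, stage):
        # A thread announces that it has pending tasks for the given stage.
        self.pending_tasks[stage].add(thread)

    def mark_complete(self, unit_id, thread):
        # Record which thread finished processing a batch-unit.
        self.completed[unit_id] = thread

    def dispatch(self, current_stage, pending_units):
        """Return (unit_id, threads) pairs for the units that may be dispatched now."""
        dispatched = []
        for unit in pending_units:
            # Strong ordering: wait until all upstream units have completed.
            if not all(dep in self.completed for dep in unit.depends_on):
                continue
            upstream_threads = {self.completed[dep] for dep in unit.depends_on}
            # Keep only threads that also registered tasks for the current stage.
            eligible = sorted(upstream_threads & self.pending_tasks[current_stage])
            if eligible:
                dispatched.append((unit.unit_id, eligible))
        return dispatched


coordinator = GridCoordinator()
coordinator.mark_complete("a", "t1")       # stage-1 unit "a" finished on thread t1
coordinator.register_pending("t1", 2)      # t1 has pending tasks for stage 2

unit_c = BatchUnit("c", 2, frozenset({"a"}))
unit_d = BatchUnit("d", 2, frozenset({"a", "b"}))

first_pass = coordinator.dispatch(2, [unit_c, unit_d])   # "d" is held back: "b" not done

coordinator.mark_complete("b", "t2")
coordinator.register_pending("t2", 2)
second_pass = coordinator.dispatch(2, [unit_d])          # "d" can now be dispatched
```

In this sketch, unit `d` is withheld on the first pass because one of its upstream units has not completed, which is the ordering guarantee the abstract describes. A real grid would also need thread-safe state and failure handling, which are omitted here.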
20 Claims
1. A method of providing strong ordering in multi-stage processing of data streams, the method including:
- receiving, by a grid coordinator operating a computing grid that includes a plurality of physical threads which process data from one or more data streams in batches, current batch-stage information from a grid-scheduler comprising current-batch units and downstream batch-units that depend on completion of the current-batch units;
- determining, for a current batch-stage identified in the current batch-stage information, a batch-unit pending dispatch from the downstream batch-units;
- identifying one or more physical threads that processed batch-units for the current batch-stage on which the batch-unit pending dispatch depends and have registered pending tasks for the current batch-stage; and
- dispatching the batch-unit pending dispatch to the one or more identified physical threads subsequent to complete processing of the batch-units for the current batch-stage.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system including one or more processors coupled to memory, the memory loaded with computer instructions to provide strong ordering in multi-stage processing of data streams, the instructions, when executed on the processors, implement actions comprising:
- receiving, by a grid coordinator operating a computing grid that includes a plurality of physical threads which process data from one or more data streams in batches, current batch-stage information from a grid-scheduler comprising current-batch units, and downstream batch-units that depend on completion of the current-batch units;
- determining, for a current batch-stage identified in the current batch-stage information, a batch-unit pending dispatch from the downstream batch-units;
- identifying one or more physical threads that processed batch-units for the current batch-stage on which the batch-unit pending dispatch depends and have registered pending tasks for the current batch-stage; and
- dispatching the batch-unit pending dispatch to the one or more identified physical threads subsequent to complete processing of the batch-units for the current batch-stage.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A non-transitory computer readable storage medium impressed with computer program instructions to provide strong ordering in multi-stage processing of data streams, the instructions, when executed on a processor, implement a method comprising:
- receiving, by a grid coordinator operating a computing grid that includes a plurality of physical threads which process data from one or more data streams in batches, current batch-stage information from a grid-scheduler comprising current-batch units and downstream batch-units that depend on completion of the current-batch units;
- determining, for a current batch-stage identified in the current batch-stage information, a batch-unit pending dispatch from the downstream batch-units;
- identifying one or more physical threads that processed batch-units for the current batch-stage on which the batch-unit pending dispatch depends and have registered pending tasks for the current batch-stage; and
- dispatching the batch-unit pending dispatch to the one or more identified physical threads subsequent to complete processing of the batch-units for the current batch-stage.
Dependent claims: 18, 19, 20
Specification