Maintaining throughput of a stream processing framework while increasing processing load
First Claim
1. A method of maintaining throughput of a stream processing framework while increasing processing load, the method including:
defining a container over at least one worker node that has a plurality of workers, with one worker utilizing a whole processor core within a worker node;
queuing data from one or more incoming data streams into multiple pipelines that run in the container and have connections to at least one common resource external to the container;
concurrently executing the pipelines at a number of workers as batches; and
limiting simultaneous connections between the common resource and the workers by providing a shared connection used to process a set of batches running on a same worker regardless of the pipelines to which the batches in the set belong.
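To make the connection-limiting step concrete, the following is a minimal sketch, not taken from the patent: it assumes a Python threading model in which each worker owns exactly one connection to the common external resource and reuses it for every batch it processes, whatever pipeline the batch came from. The names SharedConnection, Worker, and batch_queue are illustrative only.

```python
import threading

class SharedConnection:
    """Stand-in for one connection to the common external resource (e.g. a database)."""
    def __init__(self, worker_id):
        self.worker_id = worker_id
        self._lock = threading.Lock()

    def write(self, record):
        with self._lock:            # serialize use of the single per-worker connection
            pass                    # a real client call to the external resource goes here

class Worker(threading.Thread):
    """Occupies one whole processor core; drains batches from any pipeline over one connection."""
    def __init__(self, worker_id, batch_queue):
        super().__init__(daemon=True)
        self.connection = SharedConnection(worker_id)    # exactly one connection per worker
        self.batch_queue = batch_queue

    def run(self):
        while True:
            pipeline_id, batch = self.batch_queue.get()  # batches arrive from many pipelines
            for record in batch:
                self.connection.write(record)            # shared regardless of pipeline_id
            self.batch_queue.task_done()
```

Under this reading, a container that starts four such workers never holds more than four simultaneous connections to the resource, no matter how many pipelines are queuing batches.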
Abstract
The technology disclosed relates to maintaining throughput of a stream processing framework while increasing processing load. In particular, it relates to defining a container over at least one worker node that has a plurality of workers, with one worker utilizing a whole core within a worker node, and queuing data from one or more incoming near real-time (NRT) data streams in multiple pipelines that run in the container and have connections to at least one common resource external to the container. It further relates to concurrently executing the pipelines at a number of workers as batches, and limiting simultaneous connections to the common resource to the number of workers by providing a shared connection to a set of batches running on a same worker regardless of the pipelines to which the batches in the set belong.
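The queuing and batching half of the abstract can be sketched in the same hypothetical style, building on the Worker class from the sketch above. Pipeline, Container, BATCH_SIZE, and ingest are invented names used only for illustration, and the batch-size threshold is an assumption rather than anything specified by the patent.

```python
import queue

BATCH_SIZE = 100   # illustrative threshold, not specified by the patent

class Pipeline:
    """Buffers events from one incoming near real-time (NRT) data stream."""
    def __init__(self, pipeline_id):
        self.pipeline_id = pipeline_id
        self.buffer = []

    def enqueue(self, event):
        self.buffer.append(event)

    def drain_batch(self):
        """Return a batch once enough events have accumulated, else None."""
        if len(self.buffer) >= BATCH_SIZE:
            batch, self.buffer = self.buffer[:BATCH_SIZE], self.buffer[BATCH_SIZE:]
            return batch
        return None

class Container:
    """Defined over worker nodes; runs multiple pipelines on a fixed pool of workers."""
    def __init__(self, num_workers, pipeline_ids):
        self.batch_queue = queue.Queue()
        self.pipelines = {pid: Pipeline(pid) for pid in pipeline_ids}
        # Worker is the class from the sketch above: one core and one connection each.
        self.workers = [Worker(i, self.batch_queue) for i in range(num_workers)]
        for w in self.workers:
            w.start()

    def ingest(self, pipeline_id, event):
        """Queue one event from an incoming stream into its pipeline."""
        pipeline = self.pipelines[pipeline_id]
        pipeline.enqueue(event)
        batch = pipeline.drain_batch()
        if batch is not None:
            # Any worker may execute this batch concurrently with batches from other
            # pipelines; simultaneous connections stay capped at num_workers.
            self.batch_queue.put((pipeline_id, batch))
```

In this sketch a container created with num_workers=8 executes batches from any number of pipelines concurrently while never opening more than eight connections to the shared external resource.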
Claims (20)
1. A method of maintaining throughput of a stream processing framework while increasing processing load, the method including:
defining a container over at least one worker node that has a plurality of workers, with one worker utilizing a whole processor core within a worker node;
queuing data from one or more incoming data streams into multiple pipelines that run in the container and have connections to at least one common resource external to the container;
concurrently executing the pipelines at a number of workers as batches; and
limiting simultaneous connections between the common resource and the workers by providing a shared connection used to process a set of batches running on a same worker regardless of the pipelines to which the batches in the set belong.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
10. A system including one or more processors coupled to memory, the memory loaded with computer instructions to maintain throughput of a stream processing framework while increasing processing load, the instructions, when executed on the processors, implement actions comprising:
defining a container over at least one worker node that has a plurality of workers, with one worker utilizing a whole processor core within a worker node;
queuing data from one or more incoming data streams into multiple pipelines that run in the container and have connections to at least one common resource external to the container;
concurrently executing the pipelines at a number of workers as batches; and
limiting simultaneous connections between the common resource and the workers by providing a shared connection used to process a set of batches running on a same worker regardless of the pipelines to which the batches in the set belong.
Dependent claims: 11, 12, 13, 14, 15, 16.
17. A non-transitory computer readable storage medium impressed with computer program instructions to maintain throughput of a stream processing framework while increasing processing load, the instructions, when executed on a processor, implement a method comprising:
defining a container over at least one worker node that has a plurality of workers, with one worker utilizing a whole processor core within a worker node;
queuing data from one or more incoming data streams into multiple pipelines that run in the container and have connections to at least one common resource external to the container;
concurrently executing the pipelines at a number of workers as batches; and
limiting simultaneous connections between the common resource and the workers by providing a shared connection used to process a set of batches running on a same worker regardless of the pipelines to which the batches in the set belong.
Dependent claims: 18, 19, 20.
Specification