Handling multiple task sequences in a stream processing framework

  • US 10,198,298 B2
  • Filed: 12/31/2015
  • Issued: 02/05/2019
  • Est. Priority Date: 09/16/2015
  • Status: Active Grant
First Claim
1. A method of handling multiple task sequences, including long tail task sequences, the method comprising:

  • defining a container over worker nodes having up to one physical thread per processor core of a worker node;

    for multiple task sequences, queuing data from a plurality of incoming near real-time (NRT) data streams in at least one pipeline running in the container;

    determining, based on analysis of a given NRT data stream for processing by at least one task sequence of the multiple task sequences, that a data volume for the at least one task sequence is expected to decrease after a surge in the data volume;

    processing data from the plurality of NRT data streams in a plurality of batches via a container-coordinator configured to control batch dispatching;

    dispatching the plurality of batches to physical threads of the worker nodes for processing, wherein the dispatching comprises:

    during execution, comparing a count of physical threads available for batch processing against a set number of logically parallel threads available for batch dispatching;

    when a count of the physical threads available for batch processing equals or exceeds the number of logically parallel threads available for batch dispatching, concurrently processing the plurality of batches at the physical threads; and

    when the count of physical threads available for batch processing is less than the number of logically parallel threads available for batch dispatching, multiplexing the plurality of batches sequentially over at least one of the physical threads available for batch processing, including processing a batch of the plurality of batches in the at least one pipeline until completion or timeout before processing a next batch of the plurality of batches in the at least one pipeline; and

    assigning, via a scheduler, a priority level to at least a first pipeline and a second pipeline, wherein execution of a first number of batches of the plurality of batches in the first pipeline is performed before execution of a second number of batches of the plurality of batches in the second pipeline, according to the priority level.
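The dispatch logic in the claim — compare the count of available physical threads against the configured logical parallelism, then either process batches concurrently or multiplex them sequentially with a per-batch completion/timeout bound, with pipelines ordered by scheduler-assigned priority — can be sketched as follows. This is a minimal illustrative sketch, not the patented implementation; every name (`dispatch_batches`, `schedule_pipelines`, `priority`, `timeout_s`, etc.) is an assumption introduced here, not an identifier from the patent or any real framework.

```python
# Illustrative sketch of the claimed batch-dispatch behavior.
# All identifiers are hypothetical; the patent does not name an API.
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def dispatch_batches(batches, physical_thread_count, logical_parallelism,
                     process_batch, timeout_s=5.0):
    """Dispatch batches to physical threads.

    If enough physical threads are available to cover the logically
    parallel threads, process the batches concurrently; otherwise
    multiplex them sequentially over one available physical thread,
    running each batch to completion (or timeout) before the next.
    """
    if physical_thread_count >= logical_parallelism:
        # Concurrent path: one batch per physical thread.
        with ThreadPoolExecutor(max_workers=physical_thread_count) as pool:
            return list(pool.map(process_batch, batches))

    # Multiplexed path: a single worker processes batches one at a
    # time; a batch that exceeds timeout_s is abandoned (recorded as
    # None) before the next batch is started.
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        for batch in batches:
            future = pool.submit(process_batch, batch)
            try:
                results.append(future.result(timeout=timeout_s))
            except TimeoutError:
                results.append(None)  # batch timed out; move on
    return results


def schedule_pipelines(pipelines):
    """Order pipelines so that batches of a higher-priority pipeline
    (lower priority number, by assumption) execute before batches of a
    lower-priority pipeline, per the scheduler step of the claim."""
    return sorted(pipelines, key=lambda p: p["priority"])
```

Note that Python threads cannot be forcibly killed, so "timeout" here merely stops waiting for the batch rather than terminating it; a real framework would preempt or recycle the worker.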
