MANAGING RESOURCE ALLOCATION IN A STREAM PROCESSING FRAMEWORK
Abstract
The technology disclosed relates to managing resource allocation to task sequences in a stream processing framework. In particular, it relates to operating a computing grid that includes machine resources, with heterogeneous containers defined over whole machines and some containers including multiple machines. It also includes initially allocating multiple machines to a first container, initially allocating a first set of stateful task sequences to the first container, and running the first set of stateful task sequences as multiplexed units of work under control of a container-scheduler, where each unit of work for a first task sequence runs to completion on first machine resources in the first container, unless it overruns a time-out, before a next unit of work for a second task sequence runs multiplexed on the first machine resources. It further includes automatically modifying the number of machine resources and/or the number of task sequences assigned to a container.
40 Citations
20 Claims
1. A method of managing resource allocation to task sequences that have long tails, the method including:
operating a computing grid that includes machine resources, with heterogeneous containers defined over whole machines and some containers including multiple machines;
initially allocating multiple machines to a first container;
initially allocating a first set of stateful task sequences to the first container;
running the first set of stateful task sequences as multiplexed units of work in the first container under control of a container-scheduler, wherein each unit of work for a first task sequence runs to completion on first machine resources in the first container, unless it overruns a time-out, before a next unit of work for a second task sequence runs multiplexed on the first machine resources;
detecting that at least one long tail task sequence is consuming measurably fewer resources than initially allocated; and
responsive to the detecting, automatically allocating one or more additional stateful task sequences to the first container or deallocating one or more machines from the first container.
Dependent claims: 2, 3, 4, 18
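As an illustration of the long-tail handling recited above, the detect-and-rebalance step might look like the following Python sketch. This is an illustrative sketch only, not the claimed implementation: the `Container` shape, the `usage` metric (fraction of initially allocated resources a sequence actually consumes), and the 0.2 threshold are all hypothetical stand-ins.

```python
from dataclasses import dataclass

@dataclass
class Container:
    """A container defined over whole machines; may span several machines."""
    machines: list        # machine ids allocated to this container
    task_sequences: list  # stateful task sequences multiplexed in it

def rebalance_long_tail(container, usage, pending, threshold=0.2):
    """If any sequence is a long tail (consuming measurably fewer resources
    than initially allocated), either pack an additional pending sequence
    into the container or deallocate one of its machines."""
    long_tails = [s for s in container.task_sequences if usage[s] < threshold]
    if not long_tails:
        return "unchanged"
    if pending:  # more work is waiting: raise the container's utilization
        container.task_sequences.append(pending.pop(0))
        return "added_sequence"
    if len(container.machines) > 1:  # otherwise shrink the container
        container.machines.pop()
        return "removed_machine"
    return "unchanged"
```

Either branch restores utilization: adding a sequence soaks up the idle capacity, while releasing a machine returns it to the grid for other containers.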
5. A method of managing resource allocation to surging task sequences, the method including:
operating a computing grid that includes machine resources, with heterogeneous containers defined over whole machines and some containers including multiple machines;
initially allocating multiple machines to a first container;
initially allocating a first set of stateful task sequences to the first container;
running the first set of stateful task sequences as multiplexed units of work in the first container under control of a container-scheduler, wherein each unit of work for a first task sequence runs to completion on first machine resources in the first container, unless it overruns a time-out, before a next unit of work for a second task sequence runs multiplexed on the first machine resources;
detecting that at least one task sequence is requiring measurably more resources than initially allocated;
determining that the multiple machines allocated to the first container have not yet reached a predetermined maximum; and
automatically allocating more machines to the first container or reallocating some task sequences in the first set of task sequences from the first container to a second container.
Dependent claims: 6, 7, 8, 9, 19
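The surge handling recited above can be sketched the same way: scale the container up while it is below its predetermined machine maximum, otherwise shed non-surging sequences to a second container. Again a hypothetical sketch: `usage` (where a value above 1.0 means a sequence needs more than its initial allocation), `free_machines`, and the default cap of 4 machines are assumptions, not the patented mechanism.

```python
from dataclasses import dataclass

@dataclass
class Container:
    machines: list        # machine ids allocated to this container
    task_sequences: list  # stateful task sequences multiplexed in it

def handle_surge(first, second, usage, free_machines, max_machines=4):
    """Scale up a container whose sequences are surging, or shed load.

    usage[s] > 1.0 means sequence s requires measurably more resources
    than it was initially allocated (hypothetical metric)."""
    surging = [s for s in first.task_sequences if usage[s] > 1.0]
    if not surging:
        return "unchanged"
    # Below the predetermined maximum: allocate more machines.
    if len(first.machines) < max_machines and free_machines:
        first.machines.append(free_machines.pop())
        return "scaled_up"
    # At the cap: reallocate non-surging sequences to a second container.
    for s in [s for s in first.task_sequences if usage[s] <= 1.0]:
        first.task_sequences.remove(s)
        second.task_sequences.append(s)
    return "reallocated"
```

Shedding the non-surging sequences (rather than the surging one) leaves the hot sequence with the whole container to itself, which avoids migrating the state of the sequence that is busiest.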
10. A method of managing resource allocation to faulty task sequences, the method including:
initially allocating a first set of stateful task sequences to a first container;
receiving input from a replayable input source and triggering the first set of stateful task sequences to process the input;
running the first set of stateful task sequences as multiplexed units of work in the first container under control of a container-scheduler, where each unit of work for a first task sequence runs to completion on first machine resources in the first container, unless it overruns a time-out, before a next unit of work for a second task sequence runs multiplexed on the first machine resources;
during running, persisting state information of the first set of task sequences;
detecting runtime of a unit of work in a faulty task sequence exceeding a predetermined timeout threshold;
restarting the faulty task sequence by automatically reloading persisted state information of the faulty task sequence;
automatically rewinding a replayable input to the faulty task sequence to a point preceding the detecting and synchronized with the persisted state information for the faulty task sequence; and
rerunning the faulty task sequence to completion of the unit of work without exceeding the predetermined timeout threshold.
Dependent claims: 11, 20
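A minimal sketch of the fault-recovery loop recited above, assuming a rewindable, offset-based input (e.g. a message log) and an in-memory (state, offset) checkpoint kept synchronized with the input position. All names, interfaces, and the timeout value are hypothetical illustrations of the claim, not the disclosed implementation.

```python
import time

class ReplayableSource:
    """An input source that can be rewound to an earlier offset,
    e.g. a message log (hypothetical stand-in)."""
    def __init__(self, events):
        self.events = events
        self.offset = 0

    def read(self):
        event = self.events[self.offset]
        self.offset += 1
        return event

    def rewind(self, offset):
        self.offset = offset

def run_with_recovery(source, process, state, timeout=1.0):
    """Run units of work to completion; when a unit exceeds the timeout,
    reload the persisted state, rewind the replayable input to the
    synchronized offset, and rerun the unit."""
    checkpoint = (state, source.offset)  # persisted state + input offset
    while source.offset < len(source.events):
        start = time.monotonic()
        event = source.read()
        new_state = process(state, event)
        if time.monotonic() - start > timeout:
            # Faulty unit of work: restore persisted state and rewind the
            # input to the point synchronized with that state.
            state, offset = checkpoint
            source.rewind(offset)
            continue
        state = new_state
        checkpoint = (state, source.offset)  # persist after each unit
    return state
```

Checkpointing the state together with the input offset is what makes the rewind "synchronized": after a restart, the sequence reprocesses exactly the events that arrived after the last persisted state.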
12. A system including one or more processors coupled to memory, the memory loaded with computer instructions to manage resource allocation to task sequences that have long tails, the instructions, when executed on the processors, implement actions comprising:
operating a computing grid that includes machine resources, with heterogeneous containers defined over whole machines and some containers including multiple machines;
initially allocating multiple machines to a first container;
initially allocating a first set of stateful task sequences to the first container;
running the first set of stateful task sequences as multiplexed units of work in the first container under control of a container-scheduler, where each unit of work for a first task sequence runs to completion on first machine resources in the first container, unless it overruns a time-out, before a next unit of work for a second task sequence runs multiplexed on the first machine resources;
detecting that at least one long tail task sequence is consuming measurably fewer resources than initially allocated; and
responsive to the detecting, automatically allocating one or more additional stateful task sequences to the first container or deallocating one or more machines from the first container.
Dependent claims: 13, 14, 15, 16, 17
Specification