Smart tuple resource estimation
Abstract
A stream application receives a stream of tuples to be processed by a plurality of processing elements. The plurality of processing elements operate on one or more compute nodes. Each processing element has one or more stream operators. Segments of software code are embedded in a tuple of the stream of tuples. The tuple retrieves one or more compute node metrics. The compute node metrics describe one or more resources of a first compute node. The tuple obtains tuple information of one or more tuples of the stream of tuples to be processed by a first stream operator that operates on the one or more resources. The tuple determines a prospective resource disparity related to the first stream operator based on the obtained tuple information and the compute node metrics. The tuple transmits a resource request to the stream application based on the determined prospective resource disparity.
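The flow described in the abstract (retrieve node metrics, obtain tuple information, determine a prospective resource disparity, transmit a request) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `NodeMetrics`, `TupleInfo`, and `ResourceRequest` types, the per-tuple cost fields, and the choice to measure the disparity as demand minus free capacity. The sketch also assumes that only a shortfall triggers a request; the source does not say how a surplus is handled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class NodeMetrics:
    """Hypothetical snapshot of the free resources on one compute node."""
    free_memory_mb: float
    free_cpu_cycles: float


@dataclass
class TupleInfo:
    """Hypothetical summary of the tuples queued for one stream operator."""
    pending_tuples: int
    memory_per_tuple_mb: float
    cycles_per_tuple: float


@dataclass
class ResourceRequest:
    """Hypothetical request sent back to the stream application."""
    operator: str
    extra_memory_mb: float
    extra_cpu_cycles: float


def estimate_disparity(info: TupleInfo, metrics: NodeMetrics) -> Tuple[float, float]:
    """Return (memory, cycles) gaps; a positive gap means demand exceeds supply."""
    mem_gap = info.pending_tuples * info.memory_per_tuple_mb - metrics.free_memory_mb
    cpu_gap = info.pending_tuples * info.cycles_per_tuple - metrics.free_cpu_cycles
    return mem_gap, cpu_gap


def build_request(operator: str, info: TupleInfo,
                  metrics: NodeMetrics) -> Optional[ResourceRequest]:
    """Build a request only when at least one resource is expected to fall short."""
    mem_gap, cpu_gap = estimate_disparity(info, metrics)
    if mem_gap <= 0 and cpu_gap <= 0:
        return None  # assumption: no request is sent when capacity is sufficient
    return ResourceRequest(operator, max(mem_gap, 0.0), max(cpu_gap, 0.0))
```

In this sketch the request carries only the shortfall per resource, so a node that is short on memory but not on CPU cycles asks for memory alone.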
69 Citations
20 Claims
1. A method for processing a stream of tuples, the method comprising:
receiving, by a stream application, a stream of tuples to be processed by a plurality of processing elements operating on one or more compute nodes, each processing element having one or more stream operators;
assigning, by the stream application, one or more processing cycles to a plurality of segments of software code embedded in a tuple of the stream of tuples, the segments of software code embedded in the tuple configured to update the logic of the plurality of processing elements of the stream application, wherein the processing cycles embedded in the tuple change the logic of the stream application by bypassing one or more processing elements and/or stream operators; and
executing, by the software-embedded tuple and on one or more tuples of the stream tuples not updated by one or the one or more stream operators, the following operations;
retrieving, by the software-embedded tuple, one or more compute node metrics that describe one or more resources of a first compute node;
obtaining, by the software-embedded tuple, tuple information of one or more tuples of the stream of tuples to be processed by a first stream operator, the first stream operator operating on the one or more resources;
determining, by the software-embedded tuple and based on the obtained tuple information and based on the compute node metrics, a prospective resource disparity related to the first stream operator; and
transmitting, by the software-embedded tuple to the stream application and based on the determined prospective resource disparity, a resource request related to the one or more resources.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A system for processing a stream of tuples comprising:
a plurality of processing elements configured to receive a stream of tuples, each processing element having one or more stream operators;
two or more processors; and
a memory containing an application that, when executed, causes at least one of the two or more processors to perform a method comprising;
embedding, by a first processor, a tuple of the stream of tuples with a plurality of segments of software code, wherein the embedded plurality of segments of software code change the logic of the stream application by bypassing one or more processing elements and/or stream operators;
retrieving, by a second processor and based on the embedded plurality of segments of software code, one or more compute node metrics that describe one or more resources of a first compute node;
obtaining, by the second processor and based on the embedded plurality of segments of software code, tuple information of one or more tuples of the stream of tuples to be processed by a first stream operator, the first stream operator operating on the one or more resources;
determining, by the second processor and based on the obtained tuple information and based on the compute node metrics, a prospective resource disparity related to the first stream operator; and
transmitting, by the second processor to the stream application and based on the determined prospective resource disparity, a resource request related to the one or more resources.
Dependent claims: 14, 15, 16, 17
18. A computer program product for processing a stream of tuples, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a plurality of processing elements operating on one or more compute nodes, each processing element having one or more stream operators, the program instructions to perform a method comprising:
embedding, by a first compute node, a tuple of the stream of tuples with a plurality of segments of software code;
assigning an additional computing system to execute the plurality of segments of software code, wherein the additional computing system is separate from any compute node that executes the plurality of processing elements, and wherein the additional computing is separate from any compute node that executes the one or more stream operators;
retrieving, by the additional computing system and based on the plurality of segments of software code, one or more compute node metrics that describe one or more resources of a first compute node;
obtaining, by the additional computing system and based on the plurality of segments of software code, tuple information of one or more tuples of the stream of tuples to be processed by a first stream operator, the first stream operator operating on the one or more resources;
obtaining, by the software-embedded tuple, tuple information of the one or more tuples of the stream of tuples to be processed by a second stream operator, the second stream operator operating on the one or more tuples before the first stream operator;
determining, by the additional computing system and based on the plurality of segments of software code and based on the obtained tuple information and based on the compute node metrics, a prospective resource disparity related to the first stream operator, wherein the prospective resource disparity is based on the number of tuples that remain or are culled after being processed by the second stream operator; and
transmitting, by the additional computing system and based on the plurality of segments of software code and to the stream application and based on the determined prospective resource disparity, a resource request related to the one or more resources.
Dependent claims: 19, 20
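Claim 18 bases the prospective resource disparity on how many tuples remain, or are culled, after an upstream (second) stream operator. A minimal Python sketch of that idea follows. The function names, the per-tuple memory cost, and the use of memory as the only resource are illustrative assumptions, not details from the claims.

```python
def surviving_tuples(incoming: int, culled: int) -> int:
    """Count the tuples that remain after the upstream operator culls some."""
    if culled < 0 or culled > incoming:
        raise ValueError("culled must be between 0 and incoming")
    return incoming - culled


def prospective_memory_disparity(incoming: int, culled: int,
                                 memory_per_tuple_mb: float,
                                 free_memory_mb: float) -> float:
    """Estimate the downstream memory gap; positive means a shortfall.

    Assumption: each surviving tuple costs a fixed amount of memory at the
    downstream (first) stream operator.
    """
    remaining = surviving_tuples(incoming, culled)
    return remaining * memory_per_tuple_mb - free_memory_mb
```

Under these assumptions, an upstream operator that culls more tuples lowers the estimated demand on the downstream operator, and the estimated gap can turn from a shortfall into a surplus.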
Specification