Processing element management in a streaming data system
First Claim
1. A method, comprising:
receiving streaming data to be processed by a plurality of processing elements comprising one or more operators, the operators processing at least a portion of the received data by operation of one or more computer processors;
establishing an operator graph of the plurality of operators, the operator graph defining at least one execution path in which a first operator of the plurality of operators is configured to receive data tuples from at least one upstream operator and transmit data tuples to at least one downstream operator;
identifying, relative to predefined criteria, a first underutilized hardware resource in a computing system that executes the operators;
un-fusing a first operator from a first processing element of the plurality of processing elements, the first processing element comprising a plurality of operators, wherein, before un-fusing the first operator, the first operator processes data within the first processing element;
transferring the first operator to a second processing element of the plurality of processing elements; and
after transferring the first operator, processing at least a portion of the received streaming data using the first operator, wherein the first operator processes the portion of the received streaming data using the first underutilized hardware resource.
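The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the `ProcessingElement` class, the `find_underutilized_cpu` and `unfuse_and_transfer` helpers, the 0.25 load threshold, and the load figures are all hypothetical names and values chosen for illustration, and the operator graph itself is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessingElement:
    """Hypothetical container for fused operators that run on one CPU."""
    name: str
    cpu: int
    operators: list = field(default_factory=list)


def find_underutilized_cpu(cpu_load, threshold=0.25):
    """Return the least-loaded CPU whose load is below the threshold, else None.

    The threshold stands in for the claim's "predefined criteria" (assumed value).
    """
    candidates = {cpu: load for cpu, load in cpu_load.items() if load < threshold}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


def unfuse_and_transfer(operator, source, target):
    """Remove an operator from its fused processing element and add it to another."""
    source.operators.remove(operator)
    target.operators.append(operator)


# Hypothetical load figures: CPU 0 is busy, CPU 1 is mostly idle.
cpu_load = {0: 0.95, 1: 0.05}
pe1 = ProcessingElement("PE1", cpu=0, operators=["parse", "filter", "aggregate"])

idle_cpu = find_underutilized_cpu(cpu_load)
if idle_cpu is not None:
    # The second processing element is placed on the underutilized CPU.
    pe2 = ProcessingElement("PE2", cpu=idle_cpu)
    unfuse_and_transfer("filter", pe1, pe2)
```

After the transfer, `pe1` holds the remaining fused operators and `pe2` holds the un-fused operator on the idle CPU. A real stream runtime would also have to rewire the tuple flow between the two processing elements.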
Abstract
Stream applications may inefficiently use the hardware resources that execute the processing elements of the data stream. For example, a compute node may host four processing elements and execute each using a CPU. However, other CPUs on the compute node may sit idle. To take advantage of these available hardware resources, a stream programmer may identify one or more processing elements that may be cloned. The cloned processing elements may be used to generate a different execution path that is parallel to the execution path that includes the original processing elements. Because the cloned processing elements contain the same operators as the original processing elements, the data stream that was previously flowing through only the original processing element may be split and sent through both the original and cloned processing elements. In this manner, the parallel execution path may use underutilized hardware resources to increase the throughput of the data stream.
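The cloning idea in the abstract can be sketched as follows. This is an assumption-laden illustration rather than the method described in the specification: `make_processing_element` and `split_stream` are hypothetical helpers, round-robin is just one possible splitting policy, and the two paths run sequentially here, whereas a real system would execute them concurrently on separate CPUs.

```python
from itertools import cycle


def make_processing_element(operators):
    """Build a processing element as a function that applies operators in order."""
    def run(tup):
        for op in operators:
            tup = op(tup)
        return tup
    return run


def split_stream(tuples, paths):
    """Distribute tuples round-robin across parallel execution paths."""
    results = []
    for tup, path in zip(tuples, cycle(paths)):
        results.append(path(tup))
    return results


# Two toy operators: add one, then double.
operators = [lambda x: x + 1, lambda x: x * 2]

original = make_processing_element(operators)
# The clone contains the same operators as the original processing element.
clone = make_processing_element(list(operators))

out = split_stream([1, 2, 3, 4], [original, clone])
```

Because both paths apply identical operators, the combined output matches what the original processing element alone would have produced. Preserving tuple order across truly concurrent paths is a separate concern that this sketch does not address.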
49 Citations
20 Claims
8. A computer program product comprising:
a non-transitory computer-readable storage medium having computer-readable program code embodied therewith, the computer-readable program code comprising computer-readable program code configured to:
receive streaming data to be processed by a plurality of processing elements comprising one or more operators, the operators processing at least a portion of the received data by operation of one or more computer processors;
establish an operator graph of the plurality of operators, the operator graph defining at least one execution path in which a first operator of the plurality of operators is configured to receive data tuples from at least one upstream operator and transmit data tuples to at least one downstream operator;
identify, relative to predefined criteria, a first underutilized hardware resource in a computing system that executes the operators;
un-fuse a first operator from a first processing element of the plurality of processing elements, the first processing element comprising a plurality of operators, wherein, before un-fusing the first operator, the first operator processes data within the first processing element;
transfer the first operator to a second processing element of the plurality of processing elements; and
after transferring the first operator, process at least a portion of the received streaming data using the first operator, wherein the first operator processes the portion of the received streaming data using the first underutilized hardware resource.
15. A system, comprising:
a computer processor; and
a memory containing a program that, when executed on the computer processor, performs an operation for processing data, comprising:
receiving streaming data to be processed by a plurality of processing elements comprising one or more operators, the operators processing at least a portion of the received data by operation of one or more computer processors;
establishing an operator graph of the plurality of operators, the operator graph defining at least one execution path in which a first operator of the plurality of operators is configured to receive data tuples from at least one upstream operator and transmit data tuples to at least one downstream operator;
identifying, relative to predefined criteria, a first underutilized hardware resource in a computing system that executes the operators;
un-fusing a first operator from a first processing element of the plurality of processing elements, the first processing element comprising a plurality of operators, wherein, before un-fusing the first operator, the first operator processes data within the first processing element;
transferring the first operator to a second processing element of the plurality of processing elements; and
after transferring the first operator, processing at least a portion of the received streaming data using the first operator, wherein the first operator processes the portion of the received streaming data using the first underutilized hardware resource.
Specification