Streams optional execution paths depending upon data rates
First Claim
1. A system, comprising:
- a computer processor; and
a memory containing a program that, when executed on the computer processor, performs an operation for processing data, comprising;
receiving streaming data to be processed by a plurality of interconnected processing elements, each processing element comprising one or more operators that process at least a portion of the received data by operation of one or more computer processors, wherein each one of the plurality of interconnected processing elements is hosted on a corresponding compute node;
measuring, during a first time period, a data flow rate in a data path between at least two operators in the plurality of processing elements processing the streaming data;
processing, during the first time period, at least a portion of the streaming data using a first code module, wherein the streaming data comprises a plurality of data tuples where each of the plurality of data tuples comprises a plurality of attribute value pairs, wherein the first code module processes a first attribute value pair of the plurality of attribute value pairs;
selecting, based on the measured data flow rate, an inactive code module stored in a first one of the plurality of processing elements processing the streaming data, wherein the selected code module is maintained in an inactive state until the data flow rate satisfies a predefined threshold; and
activating, during a second time period, the selected code module on the first plurality of processing elements such that a second attribute value pair of the plurality of attribute value pairs in the streaming data received by the first processing element is processed by the selected code module, wherein the second time period occurs after the first time period, wherein the first code module processes the first attribute value pair during the second time period, and wherein the first attribute value pair is different from the second attribute value pair.
1 Assignment
0 Petitions
Accused Products
Abstract
Processing elements in a streaming application may contain one or more optional code modules—i.e., computer-executable code that is executed only if one or more conditions are met. In one embodiment, an optional code module is executed based on evaluating data flow rate between components in the streaming application. As an example, the stream computing application may monitor the incoming data rate between processing elements and select which optional code module to execute based on this rate. For example, if the data rate is high, the stream computing application may choose an optional code module that takes less time to execute. Alternatively, a high data rate may indicate that the incoming data is important; thus, the streaming application may choose an optional code module containing a more rigorous data processing algorithm, even if this algorithm takes more time to execute.
-
Citations
16 Claims
-
1. A system, comprising:
-
a computer processor; and a memory containing a program that, when executed on the computer processor, performs an operation for processing data, comprising; receiving streaming data to be processed by a plurality of interconnected processing elements, each processing element comprising one or more operators that process at least a portion of the received data by operation of one or more computer processors, wherein each one of the plurality of interconnected processing elements is hosted on a corresponding compute node; measuring, during a first time period, a data flow rate in a data path between at least two operators in the plurality of processing elements processing the streaming data; processing, during the first time period, at least a portion of the streaming data using a first code module, wherein the streaming data comprises a plurality of data tuples where each of the plurality of data tuples comprises a plurality of attribute value pairs, wherein the first code module processes a first attribute value pair of the plurality of attribute value pairs; selecting, based on the measured data flow rate, an inactive code module stored in a first one of the plurality of processing elements processing the streaming data, wherein the selected code module is maintained in an inactive state until the data flow rate satisfies a predefined threshold; and activating, during a second time period, the selected code module on the first plurality of processing elements such that a second attribute value pair of the plurality of attribute value pairs in the streaming data received by the first processing element is processed by the selected code module, wherein the second time period occurs after the first time period, wherein the first code module processes the first attribute value pair during the second time period, and wherein the first attribute value pair is different from the second attribute value pair. - View Dependent Claims (2, 3, 4, 5, 6, 7, 15)
-
-
8. A computer program product for processing data, the computer program product comprising:
a non-transitory computer-readable storage medium having computer-readable program code embodied therewith, the computer-readable program code comprising computer-readable program code configured to; receive streaming data to be processed by a plurality of interconnected processing elements, each processing element comprising one or more operators that process at least a portion of the received data by operation of one or more computer processors, wherein each one of the plurality of interconnected processing elements is hosted on a corresponding compute node; measure, during a first time period, a data flow rate in a data path between at least two operators in the plurality of processing elements processing the streaming data; process, during the first time period, at least a portion of the streaming data using a first code module, wherein the streaming data comprises a plurality of data tuples where each of the plurality of data tuples comprises a plurality of attribute value pairs, wherein the first code module processes a first attribute value pair of the plurality of attribute value pairs select, based on the measured data flow rate, an inactive code module stored in a first one of the plurality of processing elements processing the streaming data, wherein the selected code module is maintained in an inactive state until the data flow rate satisfies a predefined threshold; and activate, during a second time period, the selected code module on the first plurality of processing elements such that a second attribute value pair of the plurality of attribute value pairs in the streaming data received by the first processing element is processed by the selected code module, wherein the second time period occurs after the first time period, wherein the first code module processes the first attribute value pair during the second time period, and wherein the first attribute value pair is different from the second attribute value pair. - View Dependent Claims (9, 10, 11, 12, 13, 14, 16)
Specification