Detection of data flow bottlenecks and disruptions based on operator timing profiles in a parallel processing environment
Abstract
Data flow disruptions over a series of data processing operators can be detected by a computer system that generates a timing profile for data flow at an operator. The profile can include an input wait time, an operator processing time, and an output wait time. Using the profile, the system can detect potential flow disruptions. If a potential disruption satisfies a rule, it is considered a data flow disruption, and a recommendation associated with the satisfied rule is identified. The recommendation and the operator identity are displayed.
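As an illustration only, the flow summarized in the abstract (profile, potential disruption check, rule match, recommendation) might be sketched as follows. The class names, the 1.0 second threshold, the three rules, and the recommendation texts are all hypothetical; the source does not specify any of them.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class TimingProfile:
    operator_id: str
    input_wait: float       # seconds the data set sat in the input record block
    processing_time: float  # seconds from start to completion of processing
    output_wait: float      # seconds the data set sat in the output record block


@dataclass
class Rule:
    name: str
    matches: Callable[[TimingProfile], bool]
    recommendation: str


# Hypothetical rule set: each rule fires when one component dominates the others.
RULES: List[Rule] = [
    Rule("output_blocked",
         lambda p: p.output_wait > p.input_wait and p.output_wait > p.processing_time,
         "Check the downstream operator; it may be consuming records too slowly."),
    Rule("input_backlog",
         lambda p: p.input_wait > p.processing_time and p.input_wait > p.output_wait,
         "Data is waiting at the input; consider more partitions for this operator."),
    Rule("slow_processing",
         lambda p: p.processing_time > p.input_wait and p.processing_time > p.output_wait,
         "Processing dominates; review the operator logic or its resources."),
]


def is_potential_disruption(p: TimingProfile, threshold: float = 1.0) -> bool:
    """Assumed pre-check: any timing component above an arbitrary threshold."""
    return max(p.input_wait, p.processing_time, p.output_wait) > threshold


def diagnose(p: TimingProfile, rules: List[Rule] = RULES) -> List[Tuple[str, str]]:
    """Return (operator identity, recommendation) pairs for every satisfied rule."""
    if not is_potential_disruption(p):
        return []
    return [(p.operator_id, r.recommendation) for r in rules if r.matches(p)]


if __name__ == "__main__":
    profile = TimingProfile("sort_1", input_wait=0.1, processing_time=0.2, output_wait=5.0)
    for operator_id, recommendation in diagnose(profile):
        print(f"{operator_id}: {recommendation}")
```

Separating the threshold check from the rule match mirrors the two stages in the abstract: a profile is first flagged as a potential disruption and only then tested against rules that carry recommendations.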
9 Claims
1. A computer implemented method for detecting data flow disruptions over a series of data processing operators that are each configured to receive and store data in an input record block, process data from the input record block, store results of the processing in an output record block, and output data from the output record block to a next processing operator in the series, the method comprising:
generating, for a particular processing operator in the series of data processing operators, a processing operator timing profile that includes:
an input wait time based upon a period of time that a particular data set is stored in a particular input data record,
an operator processing time based upon a period of time between a start of processing of the particular data set by the particular processing operator and a completion of the processing of the particular data set by the particular processing operator, and
an output wait time based upon a period of time that the particular data set is stored in a particular output data record block;
detecting, from the processing operator timing profile, a potential flow disruption condition;
determining that the processing operator timing profile satisfies at least one rule from a set of flow disruption rules that are each associated with at least one corresponding recommendation;
identifying, based on the at least one rule, a corresponding recommendation; and
displaying, in response to identifying the corresponding recommendation, an identity of the particular processing operator and the corresponding recommendation, wherein the series of data processing operators are part of a system of operators working in a parallel processing environment, wherein the parallel processing environment comprises a conductor, a plurality of section leaders, and a plurality of players, and wherein a section leader of the plurality of section leaders is configured to create a record block I/O monitoring thread to implement the generating, for the particular processing operator in the series of processing operators, the processing operator timing profile in the parallel processing environment.
Dependent claims: 2, 3.
4. A computer system for detecting data flow disruptions over a series of data processing operators that are each configured to receive and store data in an input record block, process data from the input record block, store results of the processing in an output record block, and output data from the output record block to a next processing operator in the series, the system comprising:
at least one processor circuit configured to:
generate, for a particular processing operator in the series of data processing operators, a processing operator timing profile that includes:
an input wait time based upon a period of time that a particular data set is stored in a particular input data record,
an operator processing time based upon a period of time between a start of processing of the particular data set by the particular processing operator and a completion of the processing of the particular data set by the particular processing operator, and
an output wait time based upon a period of time that the particular data set is stored in a particular output data record block;
detect, from the processing operator timing profile, a potential flow disruption condition;
determine that the processing operator timing profile satisfies at least one rule from a set of flow disruption rules that are each associated with at least one corresponding recommendation;
identify, based on the at least one rule, a corresponding recommendation; and
display, in response to identifying the corresponding recommendation, an identity of the particular processing operator and the corresponding recommendation, wherein the series of data processing operators are part of a system of operators working in a parallel processing environment, wherein the parallel processing environment comprises a conductor, a plurality of section leaders, and a plurality of players, and wherein a section leader of the plurality of section leaders is configured to create a record block I/O monitoring thread to implement the generating, for the particular processing operator in the series of processing operators, the processing operator timing profile in the parallel processing environment.
Dependent claims: 5, 6.
7. A computer program product for detecting data flow disruptions over a series of data processing operators that are each configured to receive and store data in an input record block, process data from the input record block, store results of the processing in an output record block, and output data from the output record block to a next processing operator in the series, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, the program instructions executable by a computer processing circuit to cause the circuit to perform the method comprising:
generating, for a particular processing operator in the series of data processing operators, a processing operator timing profile that includes:
an input wait time based upon a period of time that a particular data set is stored in a particular input data record,
an operator processing time based upon a period of time between a start of processing of the particular data set by the particular processing operator and a completion of the processing of the particular data set by the particular processing operator, and
an output wait time based upon a period of time that the particular data set is stored in a particular output data record block;
detecting, from the processing operator timing profile, a potential flow disruption condition;
determining that the processing operator timing profile satisfies at least one rule from a set of flow disruption rules that are each associated with at least one corresponding recommendation;
identifying, based on the at least one rule, a corresponding recommendation; and
displaying, in response to identifying the corresponding recommendation, an identity of the particular processing operator and the corresponding recommendation, wherein the series of data processing operators are part of a system of operators working in a parallel processing environment, wherein the parallel processing environment comprises a conductor, a plurality of section leaders, and a plurality of players, and wherein a section leader of the plurality of section leaders is configured to create a record block I/O monitoring thread to implement the generating, for the particular processing operator in the series of processing operators, the processing operator timing profile in the parallel processing environment.
Dependent claims: 8, 9.
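For illustration, the record block I/O monitoring thread recited in the independent claims could be approximated by a worker thread that turns timestamp events into timing profiles. This is a rough sketch under stated assumptions, not the patented implementation: the `BlockEvents` fields, the queue-based hand-off, and the choice to measure input wait from storage until the start of processing are all hypothetical.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class BlockEvents:
    """Hypothetical timestamps (seconds) recorded for one data set at one operator."""
    operator_id: str
    input_stored: float      # data set written to the input record block
    processing_start: float  # operator begins processing the data set
    processing_end: float    # operator finishes processing the data set
    output_stored: float     # results written to the output record block
    output_sent: float       # results handed to the next operator


def build_profile(ev: BlockEvents) -> Dict[str, object]:
    """Derive the three timing components from the recorded timestamps."""
    return {
        "operator_id": ev.operator_id,
        "input_wait": ev.processing_start - ev.input_stored,
        "processing_time": ev.processing_end - ev.processing_start,
        "output_wait": ev.output_sent - ev.output_stored,
    }


def monitor(events: "queue.Queue[Optional[BlockEvents]]", profiles: List[dict]) -> None:
    """Monitoring loop: consume events until a None sentinel arrives."""
    while True:
        ev = events.get()
        if ev is None:
            break
        profiles.append(build_profile(ev))


def run_monitor(event_list: List[BlockEvents]) -> List[dict]:
    """Stand-in for a section leader: start the monitoring thread and feed it events."""
    events: "queue.Queue[Optional[BlockEvents]]" = queue.Queue()
    profiles: List[dict] = []
    worker = threading.Thread(target=monitor, args=(events, profiles))
    worker.start()
    for ev in event_list:
        events.put(ev)
    events.put(None)  # sentinel: no more events
    worker.join()
    return profiles


if __name__ == "__main__":
    sample = BlockEvents("sort_1", 10.0, 10.5, 12.5, 12.5, 16.0)
    print(run_monitor([sample]))
```

Keeping the timestamp collection on a separate thread means the operator itself only records events, while profile construction happens off its processing path.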
Specification