Annotation of network activity through different phases of execution
First Claim
1. A system, comprising:
a processor; and
memory including instructions that, when executed by the processor, cause the system to:
receive information for a job to be processed by a distributed application, the job being submitted from a user or other application and having at least two phases of execution for completion of the job;
identify a set of network elements to monitor during processing of the job, the set of network elements corresponding to (a) nodes that are involved in at least a first phase of the job and (b) one or more level of network hierarchy nodes to include potential problems that are created from network higher up in the hierarchy;
monitor, over a period of time, the set of network elements during processing of the job for the at least two phases of execution;
detect a failure during at least one phase of execution in at least one network element;
determine a recursive impact zone of all network elements connected to the at least one network element, in response to detecting the failure;
flag the network elements of the recursive impact zone; and
generate job profile data indicating at least the failure and the network elements of the recursive impact zone;
indicate, in a graphical representation, the failure in the at least one network element based at least in part on the job profile data;
wherein the recursive impact zone includes all further adjoining network elements connected to the at least one network element in a hierarchy to the at least one network element, but does not include adjoining network elements not in the hierarchy to the at least one network element.
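The recursive impact zone described in the claim can be illustrated with a short sketch. It assumes the hierarchy is available as a parent-to-children mapping and reads "in a hierarchy to the at least one network element" as meaning the elements beneath the failed one; both are assumptions made for illustration, not details taken from the claim. The function name `recursive_impact_zone`, the element names, and the `job_profile` layout are hypothetical.

```python
from collections import deque


def recursive_impact_zone(children, failed):
    """Collect every element below `failed` in the hierarchy, breadth-first.

    `children` maps an element to the elements directly beneath it.
    Links that are not part of the hierarchy are never consulted, so
    adjoining elements outside the hierarchy stay out of the zone.
    """
    zone = []
    seen = {failed}
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                zone.append(child)
                queue.append(child)
    return zone


# Hypothetical topology: one spine switch, two leaf switches, three servers.
children = {
    "spine1": ["leaf1", "leaf2"],
    "leaf1": ["server1", "server2"],
    "leaf2": ["server3"],
}

# Assume a failure was detected on leaf1 during some phase of the job.
zone = recursive_impact_zone(children, "leaf1")

# Job profile data records the failure and the flagged impact zone.
job_profile = {"failure": "leaf1", "impact_zone": zone, "flagged": set(zone)}
```

Here `leaf2` adjoins `leaf1` only as a sibling, so it is not part of the zone, while both servers under `leaf1` are flagged.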
Abstract
The subject technology provides a drillable time-series heat map, which combines information from separate network elements (e.g., switches, routers, servers, or storage) and relates them through impact zones to correlate network-wide events and their potential impact on other units in the network. The subject technology also brings together the network and its components, the distributed application(s), and a heat map controller so that they proactively communicate with one another to disseminate information such as failures, timeouts, and new jobs. For an individual job (e.g., for a distributed application), the subject technology may monitor consumption of resources during different phases of execution to provide individual job profile data that could be presented as a drillable heat map. The heat map, in this regard, may include resource utilization heat metrics for resources such as CPU, Input/Output (I/O), and memory, presented in the heat map or in graphs along with network activity.
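As a rough illustration of what a drillable time-series heat map might look like as data, the sketch below buckets utilization samples per network element and time interval, then exposes the per-metric values behind a single cell. The sample layout `(element, timestamp, metric, value)`, the averaging rule, and the function names are assumptions for illustration; the abstract does not specify them.

```python
from collections import defaultdict


def build_heat_map(samples, bucket_seconds):
    """Average utilization for each (element, time bucket) cell."""
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running total, sample count]
    for element, timestamp, _metric, value in samples:
        bucket = int(timestamp // bucket_seconds)
        cell = sums[(element, bucket)]
        cell[0] += value
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def drill_down(samples, element, bucket, bucket_seconds):
    """Return the per-metric values that make up one heat-map cell."""
    detail = defaultdict(list)
    for elem, timestamp, metric, value in samples:
        if elem == element and int(timestamp // bucket_seconds) == bucket:
            detail[metric].append(value)
    return dict(detail)


# Hypothetical utilization samples, with values as fractions of capacity.
samples = [
    ("server1", 0, "cpu", 0.25),
    ("server1", 5, "io", 0.75),
    ("server1", 12, "cpu", 1.0),
    ("leaf1", 3, "io", 0.5),
]

heat = build_heat_map(samples, bucket_seconds=10)
cell_detail = drill_down(samples, "server1", 0, bucket_seconds=10)
```

Each key of `heat` is one cell of the map; `drill_down` shows which resource metrics contributed to the value of a selected cell.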
422 Citations
20 Claims
1. A system, comprising:
a processor; and
memory including instructions that, when executed by the processor, cause the system to:
receive information for a job to be processed by a distributed application, the job being submitted from a user or other application and having at least two phases of execution for completion of the job;
identify a set of network elements to monitor during processing of the job, the set of network elements corresponding to (a) nodes that are involved in at least a first phase of the job and (b) one or more level of network hierarchy nodes to include potential problems that are created from network higher up in the hierarchy;
monitor, over a period of time, the set of network elements during processing of the job for the at least two phases of execution;
detect a failure during at least one phase of execution in at least one network element;
determine a recursive impact zone of all network elements connected to the at least one network element, in response to detecting the failure;
flag the network elements of the recursive impact zone; and
generate job profile data indicating at least the failure and the network elements of the recursive impact zone;
indicate, in a graphical representation, the failure in the at least one network element based at least in part on the job profile data;
wherein the recursive impact zone includes all further adjoining network elements connected to the at least one network element in a hierarchy to the at least one network element, but does not include adjoining network elements not in the hierarchy to the at least one network element.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computer-implemented method, comprising:
receiving information for a job to be processed by a distributed application, the job being submitted from a user or other application and having at least two phases of execution for completion of the job;
identifying a set of network elements to monitor during processing of the job, the set of network elements corresponding to (a) nodes that are involved in at least a first phase of the job and (b) one or more level of network hierarchy nodes to include potential problems that are created from network higher up in the hierarchy;
monitoring, over a period of time, the set of network elements during processing of the job for the at least two phases of execution;
detecting a failure during at least one phase of execution in at least one network element;
determining a recursive impact zone of all network elements connected to the at least one network element; and
generating job profile data indicating at least the failure;
indicating, in a graphical representation, the failure in the at least one network element based at least in part on the job profile data;
wherein the recursive impact zone includes all further adjoining network elements connected to the at least one network element in a hierarchy to the at least one network element, but does not include adjoining network elements not in the hierarchy to the at least one network element.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
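The "identifying a set of network elements to monitor" step combines the nodes used by the first phase with one or more levels of the hierarchy above them. A minimal sketch of that selection follows, assuming a child-to-parent mapping and a configurable number of levels; the function name `elements_to_monitor`, the `levels` parameter, and the topology are hypothetical.

```python
def elements_to_monitor(phase_nodes, parent, levels):
    """Nodes involved in the first phase plus `levels` tiers of ancestors.

    `parent` maps an element to the element directly above it in the
    hierarchy; elements at the top have no entry.
    """
    monitored = set(phase_nodes)
    frontier = set(phase_nodes)
    for _ in range(levels):
        # Step one tier up; elements at the top of the hierarchy drop out.
        frontier = {parent[node] for node in frontier if node in parent}
        monitored |= frontier
    return monitored


# Hypothetical topology: servers sit under leaf switches, leaves under a spine.
parent = {
    "server1": "leaf1",
    "server2": "leaf1",
    "server3": "leaf2",
    "leaf1": "spine1",
    "leaf2": "spine1",
}

# Assume the first phase of the job runs on server1 and server3.
one_level = elements_to_monitor(["server1", "server3"], parent, levels=1)
two_levels = elements_to_monitor(["server1", "server3"], parent, levels=2)
```

Including upstream tiers lets the monitor pick up problems that originate higher in the hierarchy even though the job itself only runs on the servers.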
17. A non-transitory computer-readable medium including instructions stored therein that, when executed by a computing device, cause the at least one computing device to:
receive information for a job to be processed by a distributed application, the job being submitted from a user or other application and having at least two phases of execution for completion of the job;
identify a set of network elements to monitor during processing of the job, the set of network elements corresponding to (a) nodes that are involved in at least a first phase of the job and (b) one or more level of network hierarchy nodes to include potential problems that are created from network higher up in the hierarchy;
monitor, over a period of time, the set of network elements during processing of the job for the at least two phases of execution;
detect a failure during at least one phase of execution in at least one network element;
determine a recursive impact zone of all network elements connected to the at least one network element, in response to detecting the failure;
flag the network elements of the recursive impact zone; and
generate job profile data indicating at least the failure and the network elements of the recursive impact zone;
indicate, in a graphical representation, the failure in the at least one network element based at least in part on the job profile data;
wherein the recursive impact zone includes all further adjoining network elements connected to the at least one network element in a hierarchy to the at least one network element, but does not include adjoining network elements not in the hierarchy to the at least one network element.
Dependent claims: 18, 19, 20
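Monitoring across "at least two phases of execution" implies that each observed event can be tied to the phase that was active when it occurred. The sketch below shows one way this annotation might work, assuming each phase is known as a half-open time window; the phase names `map` and `reduce`, the event layout, and the function names are hypothetical and do not come from the claims.

```python
def annotate_with_phase(events, phases):
    """Tag each (timestamp, element, status) event with the active job phase.

    `phases` is a list of (name, start, end) windows, with `end` exclusive.
    Events outside every window get a phase of None.
    """
    annotated = []
    for timestamp, element, status in events:
        phase = next(
            (name for name, start, end in phases if start <= timestamp < end),
            None,
        )
        annotated.append(
            {"time": timestamp, "element": element, "status": status, "phase": phase}
        )
    return annotated


def failures_by_phase(annotated):
    """Group the elements that reported a failure by the phase they failed in."""
    result = {}
    for event in annotated:
        if event["status"] == "failed":
            result.setdefault(event["phase"], []).append(event["element"])
    return result


# Hypothetical two-phase job and a few monitored events.
phases = [("map", 0, 60), ("reduce", 60, 100)]
events = [
    (10, "server1", "ok"),
    (70, "leaf1", "failed"),
    (120, "server2", "failed"),
]

annotated = annotate_with_phase(events, phases)
failed = failures_by_phase(annotated)
```

The grouped result could feed the job profile data, so that a graphical view can show in which phase each failure occurred.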
Specification