System and method for relationship based root cause recommendation
Abstract
A method of identifying a root cause in a distributed computing environment includes traversing a plurality of nodes in a call graph starting with an end user node. Each node corresponds to an application component. A response time is calculated between connected pairs of neighboring nodes. A weight is calculated for each of a plurality of edges connecting the neighboring nodes. The nodes are traversed starting with the end user node in an order based on the weight of each of the edges. A root cause score is calculated for each node based on traversing all of the nodes in the order based on the weight of each of the edges. A ranked list is generated.
67 Citations
15 Claims
1. A method of identifying a root cause in a distributed computing environment, comprising:
generating a call graph including a plurality of nodes by merging topology relationship data, transaction tracking relationship data and metric correlation relationship data of the plurality of nodes, wherein the topology relationship data includes data regarding a physical distance between nodes of the plurality of nodes positioned in different geographic locations;
traversing the plurality of nodes in the call graph starting with an end user node, wherein each node corresponds to an application component in the distributed computing environment;
calculating a response time between pairs of neighboring nodes from among the plurality of nodes, wherein the neighboring nodes in each pair are connected to each other in the call graph;
calculating a weight for each of a plurality of edges connecting the neighboring nodes in the pairs based on the calculated response time between pairs of neighboring nodes among the plurality of nodes, wherein the weight of each edge is calculated based on a correlation between (i) the response time between the neighboring nodes in the corresponding pair and (ii) a response time between the neighboring node furthest from the end user node and the end user node;
traversing all of the nodes in the call graph starting with the end user node in an order based on the weight of each of the plurality of edges;
calculating a root cause score for each node in the call graph based on traversing all of the nodes in the call graph in the order based on the weight of each of the plurality of edges;
generating a ranked list comprising all of the nodes in an order based on the root cause score of each node; and
generating a recommendation to repair at least one node of all of the nodes in the ranked list, wherein the at least one node corresponds to an application component that acts as a system bottleneck in the distributed computing environment.
Dependent claims: 2, 3, 4, 5, 6, 7, 15
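The edge-weight step of claim 1 can be illustrated with a small sketch. The claim only requires that the weight be based on "a correlation"; the use of the Pearson coefficient here is an assumption, as are the function names (`pearson`, `edge_weights`), the `depth` map used to decide which endpoint is furthest from the end user, and all sample values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def edge_weights(edge_rt, node_to_user_rt, depth):
    """Weight each edge by correlating (i) its own response-time series with
    (ii) the series between its far endpoint and the end user node."""
    weights = {}
    for (a, b), series in edge_rt.items():
        far = a if depth[a] > depth[b] else b  # endpoint furthest from the end user
        weights[(a, b)] = pearson(series, node_to_user_rt[far])
    return weights


# Hypothetical response times in milliseconds over five sampling intervals.
edge_rt_ms = {
    ("end_user", "web"): [120, 130, 400, 125, 410],
    ("web", "app"): [80, 85, 350, 82, 360],
    ("web", "cache"): [5, 6, 5, 7, 5],
}
node_to_user_ms = {
    "web": [120, 130, 400, 125, 410],
    "app": [200, 215, 750, 207, 770],
    "cache": [125, 136, 405, 132, 415],
}
depth = {"end_user": 0, "web": 1, "app": 2, "cache": 2}  # hops from the end user

weights = edge_weights(edge_rt_ms, node_to_user_ms, depth)
for edge, weight in weights.items():
    print(edge, round(weight, 3))
```

In this sample the `web`-`app` edge tracks the end-to-end slowdown closely and receives a weight near 1, while the `web`-`cache` edge does not. For an edge that touches the end user node directly, series (i) and (ii) coincide, so the correlation is 1 by construction.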
8. A method of identifying a root cause in a distributed computing environment, comprising:
generating a call graph including a plurality of nodes by merging topology relationship data, transaction tracking relationship data and metric correlation relationship data of the plurality of nodes, wherein the topology relationship data includes data regarding a physical distance between nodes of the plurality of nodes positioned in different geographic locations;
traversing the plurality of nodes in the call graph starting with an end user node, wherein each node corresponds to an application component in the distributed computing environment;
calculating a throughput between pairs of neighboring nodes from among the plurality of nodes, wherein the neighboring nodes in each pair are connected to each other in the call graph;
calculating a weight for each of a plurality of edges connecting the neighboring nodes in the pairs based on the throughput between pairs of neighboring nodes from among the plurality of nodes, wherein the weight of each edge is calculated based on a correlation between (i) the response time between the neighboring nodes in the corresponding pair and (ii) a response time between the neighboring node furthest from the end user node and the end user node;
traversing all of the nodes in the call graph starting with the end user node in an order based on the weight of each of the plurality of edges;
calculating a root cause score for each node in the call graph based on traversing all of the nodes in the call graph in the order based on the weight of each of the plurality of edges;
generating a ranked list comprising all of the nodes in an order based on the root cause score of each node; and
generating a recommendation to repair at least one node of all of the nodes in the ranked list, wherein the at least one node corresponds to an application component that acts as a system bottleneck in the distributed computing environment.
Dependent claims: 9, 10, 11, 12
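The graph-generation step shared by the independent claims merges three relationship sources and attaches physical distance to the topology data. A minimal sketch of one way this could look follows. The claims do not say how distance is obtained, so the haversine great-circle formula is an assumption, as are the undirected edge keys, the function names (`haversine_km`, `merge_call_graph`), and the sample coordinates.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1 = map(radians, p)
    lat2, lon2 = map(radians, q)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def merge_call_graph(topology, transactions, metric_correlations, locations):
    """Union the edges of the three sources, recording which sources reported
    each edge and, for topology edges, the distance between the endpoints."""
    graph = {}
    sources = (
        ("topology", topology),
        ("transaction", transactions),
        ("metric", metric_correlations),
    )
    for name, edges in sources:
        for a, b in edges:
            key = tuple(sorted((a, b)))  # assumption: edges are treated as undirected
            entry = graph.setdefault(key, {"sources": set(), "distance_km": None})
            entry["sources"].add(name)
            if name == "topology" and a in locations and b in locations:
                entry["distance_km"] = haversine_km(locations[a], locations[b])
    return graph


# Hypothetical inputs: two components hosted in different cities.
locations = {"web": (40.7128, -74.0060), "db": (51.5074, -0.1278)}
call_graph = merge_call_graph(
    topology=[("web", "db")],
    transactions=[("end_user", "web"), ("web", "db")],
    metric_correlations=[("db", "web")],
    locations=locations,
)
for edge, info in sorted(call_graph.items()):
    print(edge, sorted(info["sources"]), info["distance_km"])
```

Keeping the set of contributing sources on each edge lets a later step see whether a relationship came from deployment topology, observed transactions, metric correlation, or several of these at once.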
13. A method of identifying a root cause in a distributed computing environment, comprising:
generating a call graph including a plurality of nodes by merging topology relationship data, transaction tracking relationship data and metric correlation relationship data of the plurality of nodes, wherein the topology relationship data includes data regarding a physical distance between nodes of the plurality of nodes positioned in different geographic locations;
traversing the plurality of nodes in the call graph starting with an end user node, wherein each node corresponds to an application component in the distributed computing environment;
calculating a packet loss rate between pairs of neighboring nodes from among the plurality of nodes, wherein the neighboring nodes in each pair are connected to each other in the call graph;
calculating a weight for each of a plurality of edges connecting the neighboring nodes in the pairs based on the packet loss rate between pairs of neighboring nodes from among the plurality of nodes, wherein the weight of each edge is calculated based on a correlation between (i) the response time between the neighboring nodes in the corresponding pair and (ii) a response time between the neighboring node furthest from the end user node and the end user node;
traversing all of the nodes in the call graph starting with the end user node in an order based on the weight of each of the plurality of edges, wherein the order in which all of the nodes in the call graph are traversed is a highest weight to lowest weight order;
calculating a root cause score for each node in the call graph based on traversing all of the nodes in the call graph in the order based on the weight of each of the plurality of edges;
generating a ranked list comprising all of the nodes in an order based on the root cause score of each node, wherein a node having a highest root cause score is a first node in the ranked list and a node having a lowest root cause score is a last node in the ranked list; and
generating a recommendation to repair at least one node of all of the nodes in the ranked list, wherein the at least one node corresponds to an application component that acts as a system bottleneck in the distributed computing environment.
Dependent claims: 14
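The traversal and ranking steps of claim 13 can be sketched as a best-first walk from the end user node that always follows the heaviest remaining edge, followed by a descending sort on score. The claim does not give a scoring formula. The rule used here (a node's score is the weight of the edge through which it was first reached, and the end user node scores 0) is purely an illustrative assumption, as are the function names and the sample weights.

```python
import heapq


def traverse_by_weight(graph, start="end_user"):
    """Visit every reachable node, always expanding the highest-weight frontier edge.

    graph maps node -> {neighbor: edge weight}. Returns (visit order, scores).
    """
    order = [start]
    scores = {start: 0.0}  # assumption: the end user node is not a repair candidate
    # heapq is a min-heap, so weights are negated to pop the heaviest edge first.
    frontier = [(-w, start, nbr) for nbr, w in graph.get(start, {}).items()]
    heapq.heapify(frontier)
    while frontier:
        neg_weight, _source, node = heapq.heappop(frontier)
        if node in scores:
            continue  # already reached through a heavier edge
        scores[node] = -neg_weight  # hypothetical score: weight of the incoming edge
        order.append(node)
        for nbr, w in graph.get(node, {}).items():
            if nbr not in scores:
                heapq.heappush(frontier, (-w, node, nbr))
    return order, scores


def ranked_list(scores):
    """Nodes sorted from highest to lowest root cause score."""
    return sorted(scores, key=lambda node: scores[node], reverse=True)


# Hypothetical edge weights, for example correlation values in [0, 1].
weighted_graph = {
    "end_user": {"web": 0.7},
    "web": {"app": 0.9, "cache": 0.2},
    "app": {"db": 0.95},
    "cache": {},
    "db": {},
}

order, scores = traverse_by_weight(weighted_graph)
ranking = ranked_list(scores)
print("traversal order:", order)
print("ranked list:", ranking)
print("recommend repairing:", ranking[0])
```

With these sample weights the walk reaches `db` before `cache` because the `app`-`db` edge outweighs the `web`-`cache` edge, and `db` heads the ranked list as the suggested repair target.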
Specification