Organizing network performance metrics into historical anomaly dependency data
First Claim
1. A method of organizing network performance metrics into historical anomaly dependency data, the method including:
assembling performance data for a multiplicity of metrics across a multiplicity of resources on a network and automatically setting criteria based on the performance data over time that qualifies a subset of the performance data as anomalous instance data;
constructing a map of active network communication paths that carry communications among first and second resources subject to anomalous performance and representing the active network communication paths as edges between nodes representing first and second resources, thereby forming connected node pairs;
calculating cascading failure relationships from time-stamped anomalous instance data for the connected node pairs, wherein the cascading failure relationships are based at least in part on whether conditional probabilities of anomalous performance of the second resources given prior anomalous performance of the first resources exceed a predetermined threshold;
wherein calculating the conditional probabilities makes use of a statistical measure of likelihood;
conditional probability = p(anomalous second resource instance | anomalous first resource instance); and
automatically representing the anomalous performance of the second resource as a cascading failure resulting from the anomalous performance of the first resource based on the calculated cascading failure relationships.
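The conditional-probability test in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the function name `cascade_edges`, the `window` parameter (how soon after the first resource's anomaly the second must follow), and the sample data are all assumptions made for illustration. The probability is estimated as the fraction of first-resource anomalies followed by a second-resource anomaly within the window.

```python
def cascade_edges(anomalies, edges, window, threshold):
    """Estimate p(second anomalous | first anomalous) for connected node pairs.

    anomalies: dict mapping resource name -> list of anomaly timestamps (seconds)
    edges:     iterable of (first, second) connected node pairs
    window:    hypothetical look-ahead in seconds (an assumption, not from the claim)
    threshold: predetermined threshold the conditional probability must exceed
    Returns {(first, second): probability} for pairs that exceed the threshold.
    """
    result = {}
    for first, second in edges:
        first_times = anomalies.get(first, [])
        if not first_times:
            continue  # no prior anomalies, so nothing to condition on
        second_times = anomalies.get(second, [])
        # Count first-resource anomalies followed by a second-resource anomaly.
        followed = sum(
            1 for t in first_times
            if any(t < s <= t + window for s in second_times)
        )
        p = followed / len(first_times)
        if p > threshold:
            result[(first, second)] = p
    return result


# Illustrative time-stamped anomalous instance data (made up for this sketch).
anomalies = {
    "db": [100, 500, 900, 1300],
    "api": [130, 520, 1325],
    "web": [700],
}
edges = [("db", "api"), ("api", "web")]
cascades = cascade_edges(anomalies, edges, window=60, threshold=0.5)
print(cascades)
```

With this data, three of the four `db` anomalies are followed by an `api` anomaly within 60 seconds, so only the `db` to `api` pair is reported as a cascading failure relationship.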
Abstract
The technology disclosed relates to organizing network performance metrics into historical anomaly dependency data. In particular, it relates to calculating cascading failure relationships between correlated anomalies detected in a network. It also relates to illustrating to a network administrator causes of system failure by laying out the graph to show a progression over time of the cascading failures and identify root causes of the cascading failures. It also relates to ranking anomalies and anomaly clusters in the network based on attributes of the resources exhibiting anomalous performances and attributes of the anomalous performances. It further relates to depicting evolution of resource failures across a network by visually coding impacted resources and adjusting the visual coding over time and allowing replay over time to visualize propagation of anomalous performances among the impacted resource.
26 Citations
20 Claims
1. A method of organizing network performance metrics into historical anomaly dependency data, the method including:
assembling performance data for a multiplicity of metrics across a multiplicity of resources on a network and automatically setting criteria based on the performance data over time that qualifies a subset of the performance data as anomalous instance data;
constructing a map of active network communication paths that carry communications among first and second resources subject to anomalous performance and representing the active network communication paths as edges between nodes representing first and second resources, thereby forming connected node pairs;
calculating cascading failure relationships from time-stamped anomalous instance data for the connected node pairs, wherein the cascading failure relationships are based at least in part on whether conditional probabilities of anomalous performance of the second resources given prior anomalous performance of the first resources exceed a predetermined threshold;
wherein calculating the conditional probabilities makes use of a statistical measure of likelihood;
conditional probability = p(anomalous second resource instance | anomalous first resource instance); and
automatically representing the anomalous performance of the second resource as a cascading failure resulting from the anomalous performance of the first resource based on the calculated cascading failure relationships.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
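The first step of claim 1, automatically setting criteria from performance data over time, could take many forms. The sketch below assumes one simple option that the claim does not prescribe: a trailing baseline of the previous `history` samples, with a point flagged when it deviates from the baseline mean by more than `k` sample standard deviations. The function name, parameters, and latency values are hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(samples, history=5, k=3.0):
    """Flag samples that deviate strongly from a trailing baseline.

    samples: list of (timestamp, value) pairs in time order
    history: number of preceding samples used as the baseline (assumed)
    k:       deviation multiplier that sets the criterion (assumed)
    Returns the (timestamp, value) pairs qualified as anomalous.
    """
    flagged = []
    for i in range(history, len(samples)):
        baseline = [value for _, value in samples[i - history:i]]
        mu, sigma = mean(baseline), stdev(baseline)
        ts, value = samples[i]
        # The criterion adapts as the baseline window slides forward in time.
        if sigma > 0 and abs(value - mu) > k * sigma:
            flagged.append((ts, value))
    return flagged


# Made-up latency samples (seconds, milliseconds) for one metric on one resource.
latency_ms = [(0, 20), (60, 21), (120, 19), (180, 20), (240, 22), (300, 95), (360, 21)]
anomalous = flag_anomalies(latency_ms)
print(anomalous)
```

Only the spike at timestamp 300 is flagged here; the sample that follows it is not, because the spike itself inflates the baseline deviation. The time stamps carried with each flagged sample are what the later cascading-failure step consumes.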
12. A method of illustrating to a network administrator causes of system failure, the method including:
generating for display a cluster of operation anomalies that are interrelated as cascading failures in an anomaly impact graph, including:
depicting anomalous instance data in the cluster as nodes in a plot;
representing active network communication paths that carry communications among first and second resources subject to anomalous performances as edges between the nodes, thereby forming connected node pairs; and
depicting at least part of the plot to show a progression over time of the cascading failures for the connected node pairs and to identify one or more root causes of the cascading failures.
Dependent claims: 13, 14, 15, 16, 17
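One way to picture the progression over time and root-cause identification in claim 12 is sketched below. It assumes each node in the cluster has a known anomaly onset time and that cascading failure links are directed from first to second resource. Treating a node with no incoming link as a root cause is a heuristic chosen for this sketch, not a rule stated in the claim; the names `progression_and_roots`, `onset`, and `links` are hypothetical.

```python
def progression_and_roots(onset, links):
    """Order cluster nodes by anomaly onset and pick candidate root causes.

    onset: dict mapping node name -> anomaly onset timestamp
    links: list of (first, second) cascading failure links
    Returns (nodes in time order, nodes with no incoming link).
    """
    # Sort by onset time, breaking ties by name so the layout is deterministic.
    ordered = sorted(onset, key=lambda node: (onset[node], node))
    # A node that is the target of some link has an upstream cause.
    has_cause = {second for _, second in links}
    roots = [node for node in ordered if node not in has_cause]
    return ordered, roots


# Made-up cluster of interrelated anomalies.
onset = {"web": 190, "db": 100, "api": 130, "cache": 135}
links = [("db", "api"), ("db", "cache"), ("api", "web")]
order, roots = progression_and_roots(onset, links)
print(order, roots)
```

A display layer could place nodes left to right in `order` and highlight the entries in `roots`, which here is the single node `db`.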
18. A method of illustrating to a network administrator causes of system failure, the method including:
generating for display an anomaly impact graph interface that depicts a cluster of operation anomalies that are interrelated as cascading failures, including:
nodes in a diagram that represent anomalous instance data for different resources in the cluster;
edges between the nodes that represent active network communication path data for communications among first and second resources, wherein the edges and nodes form connected node pairs; and
arrangement of the diagram that shows progression over time of cascading failure result links between anomalous performances of the first and second resources occurring within a predetermined time period.
Dependent claims: 19, 20
Specification