METHODS, SYSTEMS, AND MEDIA TO CORRELATE ERRORS ASSOCIATED WITH A CLUSTER
Abstract
Methods, systems, and media for correlating error events of a cluster are disclosed. Embodiments may identify systems of a cluster potentially impacted by an error and identify one or more error events associated with those systems. Then, embodiments may select one of the identified error events based upon data associated with the identified error event, disregarding other identified error events generated for the same error or errors symptomatic of the error, to report the error to a maintenance provider via a single error event. Many embodiments may identify one or more error events potentially resulting from the same error by identifying error events within a specified time period of the event that triggered the correlation. Several embodiments correlate the error events in an environment that is substantially independent of the cluster. Further embodiments obtain data that describes system interconnections of the cluster and generate a topology based upon the data.
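The correlation flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the `ErrorEvent` fields, the adjacency-map form of the topology, the 60-second default window, and the use of a numeric `severity` as the selection criterion are all assumptions introduced here for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorEvent:
    """Hypothetical error event record; field names are assumptions."""
    system: str       # system that reported the event
    timestamp: float  # seconds on any consistent clock
    severity: int     # assumed ranking: higher means more specific
    code: str         # assumed error identifier


def impacted_systems(topology, origin):
    """Return the origin system plus the systems directly connected to it.

    `topology` is assumed to be an adjacency map: system -> list of neighbors.
    """
    return {origin} | set(topology.get(origin, ()))


def correlate(events, trigger, topology, window=60.0):
    """Pick a single event to report for the error behind `trigger`.

    Candidates are events from potentially impacted systems that fall
    within `window` seconds of the triggering event. The trigger itself
    is always a candidate, so the result is never empty.
    """
    systems = impacted_systems(topology, trigger.system)
    candidates = [trigger] + [
        e for e in events
        if e != trigger
        and e.system in systems
        and abs(e.timestamp - trigger.timestamp) <= window
    ]
    # Assumed selection rule: highest severity wins, earliest event breaks ties.
    return max(candidates, key=lambda e: (e.severity, -e.timestamp))


# Example data (hypothetical systems and error codes).
topology = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
trigger = ErrorEvent("A", 100.0, 1, "LINK_TIMEOUT")
events = [
    trigger,
    ErrorEvent("B", 110.0, 3, "PORT_FAIL"),  # neighbor, inside the window
    ErrorEvent("D", 105.0, 5, "DISK"),       # not connected to A
    ErrorEvent("B", 500.0, 4, "OTHER"),      # neighbor, outside the window
]
selected = correlate(events, trigger, topology)
```

In this sketch the other candidate events are simply not returned, which mirrors the idea of reporting the error through a single event while disregarding events that are symptomatic of it.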
18 Claims
1. A method for correlating error events of a cluster, the method comprising:
identifying systems of the cluster potentially impacted by an error based upon a topology of the cluster, wherein a first system of the systems of the cluster includes two ports for communicating with other systems of the cluster, wherein the topology of the cluster comprises loop data describing a loop including the first system and a second system in the systems of the cluster via the two ports;
identifying, from the error events, an error event associated with the first system; and
selecting the error event based upon error identification data associated with the error event, wherein selecting the error event comprises comparing the error identification data with the loop data to identify an error in the loop including the first system and the second system in the systems of the cluster, to report the error to a maintenance provider.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. An apparatus for correlating error events of a cluster, the apparatus comprising:
a system identifier coupled with the cluster to identify systems of the cluster potentially impacted by an error based upon a topology of the cluster, wherein a first system of the systems of the cluster includes two ports for communicating with other systems of the cluster, wherein the topology of the cluster comprises loop data describing a loop including the first system and a second system in the systems of the cluster;
an event identifier coupled with the system identifier to identify, from the error events, an error event associated with the first system; and
an event selector coupled with the event identifier to select the error event based upon error identification data associated with the error event, wherein selecting the error event comprises comparing the error identification data with the loop data to identify an error in the loop including the first system and the second system of the cluster, to report the error to a maintenance provider.
Dependent claims: 12, 13, 14
15. A computer readable storage medium containing a program which, when executed, performs an operation, comprising:
identifying systems of a cluster potentially impacted by an error based upon a topology of the cluster, wherein a first system of the cluster includes two ports for communicating with other systems of the cluster, wherein the topology of the cluster comprises loop data describing a loop including the first system and a second system in the systems of the cluster;
identifying, from error events generated by the cluster, an error event associated with the first system; and
selecting the error event based upon error identification data associated with the error event, wherein selecting the error event comprises comparing the error identification data with the loop data to identify an error in the loop including the first system and the second system in the systems of the cluster, to report the error to a maintenance provider.
Dependent claims: 16, 17, 18
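The comparison of error identification data with loop data recited in the independent claims might look something like the following. This is a hedged sketch only: the claims do not specify a data layout, so representing loop data as a map from a loop name to the (system, port) endpoints on that loop, and error identification data as a dictionary with `system` and `port` keys, are assumptions made purely for illustration, as are all names used.

```python
def find_loop_with_error(loop_data, error_id):
    """Return the name of the loop containing the reported endpoint, or None.

    loop_data: assumed map of loop name -> list of (system, port) endpoints.
    error_id:  assumed dict with "system" and "port" keys taken from the
               error identification data of an error event.
    """
    endpoint = (error_id["system"], error_id["port"])
    for loop_name, endpoints in loop_data.items():
        if endpoint in endpoints:
            return loop_name
    return None


# Hypothetical topology: sys1 has two ports, each on a loop with sys2.
loop_data = {
    "loop0": [("sys1", 0), ("sys2", 0)],
    "loop1": [("sys1", 1), ("sys2", 1)],
}

# An error event on sys1, port 1 maps to loop1, which also includes sys2.
matched_loop = find_loop_with_error(loop_data, {"system": "sys1", "port": 1})
```

A match tells the correlator which loop, and therefore which other systems, the error may affect; an event whose endpoint is on no known loop yields `None` and would need to be handled separately.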
Specification