Method, apparatus and system to automate detection of anomalies for storage and replication within a high availability disaster recovery environment
First Claim
1. A method comprising:
- obtaining a plurality of parameters associated with each of a first storage resource of a first cluster and a second storage resource of a second cluster, wherein the first cluster is operable to provide computing services to one or more client systems, the computing services comprise an application that is hosted by the first cluster, the application hosted by the first cluster uses data that is stored by the first cluster, the second cluster comprises replicated data that is replicated from the data on the first cluster, the second cluster is operable to provide the computing services to the client system(s) responsive to a failover from the first cluster, the first storage resource of the first cluster stores the data used by the application, and the second storage resource of the second cluster stores the replicated data that is operable to be used by the application when hosted by the second cluster, the plurality of parameters associated with the first and second storage resources comprise first and second logical unit numbers, respectively;
- detecting, as a function of the plurality of parameters, at least one anomaly of the first cluster or the second cluster wherein the detecting is performed after the data is replicated to the second cluster as the replicated data, the at least one anomaly indicates a mismatch between the first storage resource associated with the first logical unit number and the second storage resource associated with the second logical unit numbers; and
- generating an alert in response to detecting the at least one anomaly.
Abstract
A method, apparatus and system for improving failover within a high-availability computer system are provided. The method includes obtaining one or more parameters associated with at least one resource of any of the first cluster, second cluster and high-availability computer system. The method also includes detecting, as a function of the parameters, one or more anomalies of any of the first cluster, second cluster and high-availability computer system, wherein the at least one anomaly is a type that impacts the failover. These anomalies may include anomalies within the first and/or second clusters (“intra-cluster anomalies”) and/or anomalies among the first and second clusters (“inter-cluster anomalies”). The method further includes generating an alert in response to detecting one or more of the anomalies.
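The obtain, detect and alert flow summarized above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the parameter names (`volume_size_gb`, `mount_point`), the `Anomaly` record, and the two rules used (an unset parameter counts as an intra-cluster anomaly, a differing value counts as an inter-cluster anomaly) do not come from the patent, which does not prescribe specific checks at this level.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass(frozen=True)
class Anomaly:
    scope: str   # "intra-cluster" or "inter-cluster"
    detail: str

def detect_anomalies(first: Dict[str, Optional[object]],
                     second: Dict[str, Optional[object]]) -> List[Anomaly]:
    """Detect anomalies as a function of per-cluster parameters (hypothetical rules)."""
    anomalies: List[Anomaly] = []
    # Assumed intra-cluster rule: a parameter is unset within a single cluster.
    for name, params in (("first", first), ("second", second)):
        for key, value in sorted(params.items()):
            if value is None:
                anomalies.append(
                    Anomaly("intra-cluster", f"{name} cluster: {key} is unset"))
    # Assumed inter-cluster rule: a parameter is set on both clusters but differs.
    for key in sorted(first.keys() & second.keys()):
        a, b = first[key], second[key]
        if a is not None and b is not None and a != b:
            anomalies.append(Anomaly("inter-cluster", f"{key}: {a!r} != {b!r}"))
    return anomalies

def generate_alerts(anomalies: List[Anomaly]) -> List[str]:
    """Produce one alert string per detected anomaly."""
    return [f"ALERT [{a.scope}] {a.detail}" for a in anomalies]

# Example parameters for the two clusters (made-up values).
first_params = {"volume_size_gb": 100, "mount_point": "/data"}
second_params = {"volume_size_gb": 80, "mount_point": None}
alerts = generate_alerts(detect_anomalies(first_params, second_params))
for line in alerts:
    print(line)
```

Keeping detection and alerting as separate functions mirrors the separate steps recited in the abstract; a real deployment would replace the two toy rules with checks that reflect what actually impacts failover in that environment.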
14 Claims
1. A method comprising:
- obtaining a plurality of parameters associated with each of a first storage resource of a first cluster and a second storage resource of a second cluster, wherein the first cluster is operable to provide computing services to one or more client systems, the computing services comprise an application that is hosted by the first cluster, the application hosted by the first cluster uses data that is stored by the first cluster, the second cluster comprises replicated data that is replicated from the data on the first cluster, the second cluster is operable to provide the computing services to the client system(s) responsive to a failover from the first cluster, the first storage resource of the first cluster stores the data used by the application, and the second storage resource of the second cluster stores the replicated data that is operable to be used by the application when hosted by the second cluster, the plurality of parameters associated with the first and second storage resources comprise first and second logical unit numbers, respectively;
- detecting, as a function of the plurality of parameters, at least one anomaly of the first cluster or the second cluster wherein the detecting is performed after the data is replicated to the second cluster as the replicated data, the at least one anomaly indicates a mismatch between the first storage resource associated with the first logical unit number and the second storage resource associated with the second logical unit numbers; and
- generating an alert in response to detecting the at least one anomaly.
Dependent claims: 2, 3, 4, 5, 6, 7
8. An apparatus comprising:
- a parameter definition module comprising a plurality of parameters associated with each of a first storage resource of a first cluster and a second storage resource of the first cluster, wherein the first cluster is operable to provide computing services to one or more client systems, the computing services comprise an application that is hosted by the first cluster, the application hosted by the first cluster uses data that is stored by the first cluster, the second storage resource comprises replicated data that is replicated from data on the first storage resource on the first cluster, the first cluster is operable to use the second storage resource when providing the computing services to the client system(s) responsive to a failover, the first storage resource stores the data used by the application, and the second storage resource stores the replicated data that is operable to be used by the application, the plurality of parameters associated with the first and second storage resources comprise first and second logical unit numbers, respectively;
- an anomaly detection module adapted to detect, as a function of the plurality of parameters, at least one anomaly of the first cluster or the second cluster, wherein the anomaly detection module is adapted to detect the at least one anomaly after the data is replicated to the second cluster as the replicated data, and the at least one anomaly indicates a mismatch between the first storage resource associated with the first logical unit number and the second storage resource associated with the second logical unit numbers; and
- an alert generator adapted to generate an alert in response to the detection of the at least one anomaly.
Dependent claims: 9, 10, 11
12. A system comprising:
- a first cluster having a first storage resource used by an application, wherein the first cluster is operable to provide computing services to one or more client systems, the computing services comprise the application that is hosted by the first cluster, and the application hosted by the first cluster uses data that is stored by the first cluster using the first storage resource;
- a second cluster having a second storage resource operable to be used by a replication of the application, wherein the data is replicated from the first cluster to the second cluster as replicated data, the second storage resource is operable to store the replicated data operable to be used by the replicated application, and the second cluster is operable to provide the computing services to the client system(s) responsive to a failover from the first cluster;
- a parameter definition module comprising parameters associated with the first and second storage resources comprising: (i) a first logical unit number (LUN) associated with the first storage resource, and (ii) a second LUN associated with the second storage resource;
- an anomaly detection module adapted to detect, as a function of any of the parameters, at least one anomaly, wherein the anomaly detection module is adapted to detect the at least one anomaly after the data is replicated to the second cluster as the replicated data, and the at least one anomaly indicates a mismatch between the first storage resource associated with the first LUN and the second storage resource associated with the second LUN; and
- an alert generator adapted to generate an alert responsive to the detection of at least one anomaly.
Dependent claims: 13, 14
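As a rough illustration of how the three recited modules could fit together, the following Python sketch models a parameter definition module, an anomaly detection module and an alert generator. The claims do not define what constitutes a "mismatch", so the two checks below are assumptions: the replica is expected to record which source LUN it replicates (the hypothetical `replicates_lun` field), and the replica is expected to be at least as large as the primary. All class and field names are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class StorageResource:
    cluster: str
    lun: int                               # logical unit number
    capacity_gb: int                       # hypothetical extra parameter
    replicates_lun: Optional[int] = None   # hypothetical: source LUN the replica tracks

@dataclass(frozen=True)
class ParameterDefinitionModule:
    """Holds the parameters for the first and second storage resources."""
    first: StorageResource
    second: StorageResource

class AnomalyDetectionModule:
    def detect(self, params: ParameterDefinitionModule,
               replication_done: bool) -> List[str]:
        # The claims recite detection after the data has been replicated.
        if not replication_done:
            return []
        findings: List[str] = []
        # Assumed mismatch rule 1: replica does not track the primary LUN.
        if params.second.replicates_lun != params.first.lun:
            findings.append(
                f"LUN {params.second.lun} replicates LUN "
                f"{params.second.replicates_lun}, expected LUN {params.first.lun}")
        # Assumed mismatch rule 2: replica is smaller than the primary.
        if params.second.capacity_gb < params.first.capacity_gb:
            findings.append(
                f"LUN {params.second.lun} ({params.second.capacity_gb} GB) is smaller "
                f"than LUN {params.first.lun} ({params.first.capacity_gb} GB)")
        return findings

class AlertGenerator:
    def generate(self, findings: List[str]) -> List[str]:
        return [f"ALERT: {finding}" for finding in findings]

# Made-up example: the replica tracks the wrong LUN and is too small.
params = ParameterDefinitionModule(
    first=StorageResource("first", lun=5, capacity_gb=200),
    second=StorageResource("second", lun=9, capacity_gb=150, replicates_lun=4),
)
detector = AnomalyDetectionModule()
alerts = AlertGenerator().generate(detector.detect(params, replication_done=True))
for line in alerts:
    print(line)
```

Passing `replication_done` explicitly keeps the "after the data is replicated" condition visible at the call site; how a real system would learn that replication has completed is outside the scope of this sketch.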
Specification