Defining a computer recovery process that matches the scope of outage including determining a root cause and performing escalated recovery operations
Abstract
Recovery processing is defined that matches the scope of an outage. A programmatic analysis of the resources that have been impacted, of implications of the failure and what degradations have occurred is performed to construct an appropriate level of recovery. This includes selecting the appropriate set of resources to be recovered. Recovery operations are selected based on the current state of the environment.
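To illustrate what "matches the scope of an outage" can mean in practice, here is a minimal sketch in Python. It assumes a hypothetical map from each resource to the resources that depend on it (`DEPENDENTS`) and a hypothetical `is_degraded` check; neither name comes from the patent. Starting at the root cause, it follows the failure only into dependents that are actually degraded, so the resulting recovery set contains only affected resources.

```python
from collections import deque

# Hypothetical dependency map (an assumption for illustration):
# each resource lists the resources that depend on it.
DEPENDENTS = {
    "db": ["app1", "app2"],
    "app1": ["web"],
    "app2": [],
    "web": [],
    "cache": ["web"],
}


def recovery_set(root_cause, dependents, is_degraded):
    """Return only the resources affected by a failure at root_cause."""
    affected = []
    seen = {root_cause}
    queue = deque([root_cause])
    while queue:
        resource = queue.popleft()
        affected.append(resource)
        for dep in dependents.get(resource, []):
            # Follow the failure only into dependents that are actually degraded.
            if dep not in seen and is_degraded(dep):
                seen.add(dep)
                queue.append(dep)
    return affected


degraded = {"app1", "web"}
print(recovery_set("db", DEPENDENTS, lambda r: r in degraded))
# prints ['db', 'app1', 'web']
```

In this example `app2` depends on `db` but is not degraded, so it is left out of the recovery set, as is the unrelated `cache`.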
20 Claims
1. A computer program product for facilitating recovery in an Information Technology (IT) environment, said computer program product comprising:
a non-transitory computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising:
programmatically analyzing, at failure time, information relating to a failure within the IT environment to determine which resource of a plurality of resources is the resource corresponding to a root cause of the failure, said information being related to at least one of: one or more resources impacted by the failure, one or more implications of the failure, or one or more resources degraded by the failure, wherein the programmatically analyzing comprises iteratively analyzing the information to determine which resource is the resource corresponding to the root cause;
programmatically determining, at failure time, based on the programmatically analyzing, the root cause for the failure; and
programmatically defining, at failure time, by a processor, one or more resources to be included in a set of resources to be recovered and one or more operations to be used in recovering the set of resources based on the analyzed information and the determined root cause, wherein said programmatically defining comprises:
determining, at failure time, the one or more resources affected by the failure, the determining based on the analyzed information and the root cause;
including the one or more resources determined at failure time to be affected by the failure in the set of resources to be recovered, wherein the set of resources is commensurate with a scope of the failure, as determined at failure time, in that the set of resources includes only those resources affected by the failure; and
determining one or more operations to be performed on the set of resources, wherein the determining takes into consideration at least one of: an effect an operation has on a resource of the set of resources on which the operation is performed, an impact on at least one other resource of the set of resources, or a time it takes to perform the operation, wherein the determining the one or more operations is iterative, and wherein an operation selected to be used in recovery is an escalated operation having an increased severity, in response to a previous operation failing.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
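The iterative root-cause analysis recited in claim 1 can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the patented method: `depends_on` (a map from a resource to the resources it needs), `is_failed`, and `max_passes` are hypothetical names, and the dependency graph is assumed to be acyclic. Each pass replaces a failed candidate with whichever of its own dependencies have also failed, until no candidate can be pushed further upstream.

```python
def find_root_cause(impacted, depends_on, is_failed, max_passes=100):
    """Iteratively narrow impacted resources down to the root-cause candidates.

    Assumes an acyclic dependency graph; max_passes is a safety bound only.
    """
    candidates = {r for r in impacted if is_failed(r)}
    for _ in range(max_passes):
        upstream = set()
        for resource in candidates:
            failed_deps = [d for d in depends_on.get(resource, []) if is_failed(d)]
            # Keep the resource itself only when none of its dependencies failed.
            upstream.update(failed_deps if failed_deps else [resource])
        if upstream == candidates:
            break  # no candidate moved upstream, so the analysis has converged
        candidates = upstream
    return sorted(candidates)


# Hypothetical environment: web -> app1 -> db, app2 -> db.
depends_on = {"web": ["app1"], "app1": ["db"], "app2": ["db"], "db": []}
failed = {"web", "app1", "db"}
print(find_root_cause(["web", "app1"], depends_on, lambda r: r in failed))
# prints ['db']
```

The function returns a list because more than one independent root cause may remain; in the example the two impacted resources both trace back to `db`.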
15. A computer system for facilitating recovery in an Information Technology (IT) environment, said computer system comprising:
a memory; and
a processor in communication with the memory, wherein the computer system is configured to perform a method, said method comprising:
programmatically analyzing, at failure time, information relating to a failure within the IT environment, said information being related to at least one of: one or more resources impacted by the failure, one or more implications of the failure, or one or more resources degraded by the failure;
programmatically determining, at failure time, based on the programmatically analyzing, a root cause for the failure; and
programmatically determining a set of resources to be recovered and one or more recovery operations to be used in recovering the set of resources based on the analyzed information and the determined root cause, said set of resources being commensurate with a scope of the failure and said one or more recovery operations being selected based on a current state of the IT environment, wherein the determining the one or more recovery operations is iterative, and wherein a recovery operation selected to be used in recovery is an escalated operation having an increased severity, in response to a previous recovery operation failing.
Dependent claims: 16
17. A method of facilitating recovery in an Information Technology (IT) environment, the method comprising:
programmatically analyzing, at failure time, information relating to a failure within the IT environment, said information being related to at least one of: one or more resources impacted by the failure, one or more implications of the failure, or one or more resources degraded by the failure;
programmatically determining, at failure time, based on the programmatically analyzing, a root cause for the failure; and
programmatically determining, by a processor, a set of resources to be recovered and one or more recovery operations to be used in recovering the set of resources based on the analyzed information and the determined root cause, said set of resources being commensurate with a scope of the failure and said one or more recovery operations being selected based on a current state of the IT environment, wherein the determining the one or more recovery operations is iterative, and wherein a recovery operation selected to be used in recovery is an escalated operation having an increased severity, in response to a previous recovery operation failing.
Dependent claims: 18, 19
20. A computer program product for facilitating recovery in an Information Technology (IT) environment, the computer program product comprising:
a non-transitory computer readable storage medium readable by a processor and storing instructions for execution by the processor for performing a method comprising:
programmatically analyzing, at failure time, information relating to a failure within the IT environment, said information being related to at least one of: one or more resources impacted by the failure, one or more implications of the failure, or one or more resources degraded by the failure;
programmatically determining, at failure time, based on the programmatically analyzing, a root cause for the failure; and
programmatically determining a set of resources to be recovered and one or more recovery operations to be used in recovering the set of resources based on the analyzed information and the determined root cause, said set of resources being commensurate with a scope of the failure and said one or more recovery operations being selected based on a current state of the IT environment, wherein the determining the one or more recovery operations is iterative, and wherein a recovery operation selected to be used in recovery is an escalated operation having an increased severity, in response to a previous recovery operation failing.
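The escalation element shared by the independent claims (a more severe operation is selected when the previous one fails) can be sketched as follows. The `Operation` class, its `severity` and `est_seconds` fields, and the operation names are assumptions made for illustration; the claims do not prescribe any particular data model or ordering rule. The sketch tries the least severe operation first, uses estimated time as a tie-breaker, and escalates only after a failure.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Operation:
    name: str
    severity: int               # higher means more disruptive (assumed scale)
    est_seconds: float          # assumed estimate of how long the operation takes
    run: Callable[[str], bool]  # returns True when the resource recovered


def recover(resource: str, operations: List[Operation]) -> Tuple[Optional[str], List[str]]:
    """Try operations in order of increasing severity, escalating on failure."""
    attempted = []
    for op in sorted(operations, key=lambda o: (o.severity, o.est_seconds)):
        attempted.append(op.name)
        if op.run(resource):
            return op.name, attempted  # stop at the first operation that works
    return None, attempted             # every operation failed


# Hypothetical operations: the restart fails, so recovery escalates to failover.
ops = [
    Operation("reboot_host", 3, 300.0, lambda r: True),
    Operation("restart_process", 1, 5.0, lambda r: False),
    Operation("failover", 2, 60.0, lambda r: True),
]
print(recover("app1", ops))
# prints ('failover', ['restart_process', 'failover'])
```

A fuller version would re-read the current state of the environment between attempts, since the claims tie operation selection to that state; the sketch keeps the state check inside each hypothetical `run` callable.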
Specification