Nested recovery scope management for stateless recovery agents
First Claim
1. A nested recovery scope management system for use in a computer system having a dynamic multiple address space server environment, the system comprising:
a processor to execute;
a supervisory program for directing recovery of protected resources, the supervisory program including a first subcomponent for initializing a recovery log, and a second subcomponent for storing failure scope tokens and recovery agent references in the recovery log, a third subcomponent for examining the recovery log for agents registered therewith and any current failure scope, and a fourth component for selectively notifying the recovery agent to carry out a recovery procedure;
the recovery log used by the supervisory program for storing information about recovery agents registered within a recovery scope;
a first recoverable component having therein a first recoverable subcomponent for generating work identifiers containing the failure scope token or associated with such token that represents recoverable operations that may need to be performed at a later time, and a second recoverable subcomponent for registering the recovery agent with the supervisory program; and
a first stateless recovery agent identified in response to the initialization of the first recoverable component, the first stateless recovery agent being operable to assist in performing recovery operations in connection with the first recoverable component when instructed to do so by the supervisory program, the first stateless recovery agent also being operable to utilize work identifiers from the recovery log in order to perform recovery operations in connection with a first associated component;
wherein the supervisory program generates tokens used to reference stored work identifiers associated with recoverable components, and the recovery agent is also operable to utilize the token received from the supervisory program as well as the stored work identifiers in the recovery log in order to perform recoverable operations in connection with at least the first associated component; wherein the supervisory program directs removal of work identifiers and token information associated with the first recoverable component from the recovery log when recovery operations associated with the first recoverable component have been successfully completed.
Abstract
Nested recovery scope management systems and methods for a multiple process computer system having a dynamic multiple address space server are disclosed. Stateless recovery agents are employed, under the control of a supervisory program called Recovery Director, during initialization or restart of servers to restore recoverable data in response to identified failures or other abnormal termination. The Director controls the recovery of protected resources in a systematic manner. The Director is initialized when a first address space of a first server is started. Then, as each instance of a recoverable component is initialized, the component registers with the Director by providing a reference to a stateless recovery agent that can later perform recovery functions for it if needed. As part of the registration, a token representing the current failure scope of the registration is generated and provided to the recoverable component by the Director.
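The registration step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the class names `RecoveryDirector` and `RecoverableComponent`, the `scope-N` token format, and the in-memory dictionary standing in for the recovery log are all assumptions made for illustration.

```python
from typing import Dict


class RecoveryDirector:
    """Hypothetical stand-in for the supervisory program (the Director)."""

    def __init__(self) -> None:
        self._next_id = 0
        # Assumed log layout: failure scope token -> agent reference + work ids.
        self.recovery_log: Dict[str, dict] = {}

    def register(self, agent_ref: str) -> str:
        """Record the agent reference and hand back a failure scope token."""
        token = f"scope-{self._next_id}"
        self._next_id += 1
        self.recovery_log[token] = {"agent": agent_ref, "work_ids": []}
        return token

    def log_work(self, token: str, work_id: str) -> None:
        """Store a work identifier under the failure scope it belongs to."""
        self.recovery_log[token]["work_ids"].append(work_id)


class RecoverableComponent:
    """Hypothetical component that registers its agent when initialized."""

    def __init__(self, director: RecoveryDirector, agent_ref: str) -> None:
        self.director = director
        self.token = director.register(agent_ref)
        self._seq = 0

    def begin_work(self) -> str:
        """Create a work identifier that embeds the failure scope token."""
        work_id = f"{self.token}/work-{self._seq}"
        self._seq += 1
        self.director.log_work(self.token, work_id)
        return work_id
```

Because every work identifier carries its token, an agent that keeps no state of its own could later be handed the token and the logged identifiers and would have what it needs to act.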
13 Claims
2. A nested recovery scope management system for use in a computer system having a dynamic multiple address space server environment, the system comprising:
a processor to execute;
a supervisory program for directing recovery of protected resources, the supervisory program including a first subcomponent for initializing a recovery log, and a second subcomponent for storing failure scope tokens and recovery agent references in the recovery log, a third subcomponent for examining the recovery log for agents registered therewith and any current failure scope, and a fourth component for selectively notifying the recovery agent to carry out a recovery procedure;
the recovery log used by the supervisory program for storing information about recovery agents registered within a recovery scope;
a first recoverable component having therein a first recoverable subcomponent for generating work identifiers containing the failure scope token or associated with such token that represents recoverable operations to be performed at a later time, and a second recoverable subcomponent for registering the recovery agent with the supervisory program; and
a first stateless recovery agent identified in response to initialization of the first recoverable component, the first stateless recovery agent being operable to assist in performing recoverable operations in connection with the first recoverable component when instructed to do so by the supervisory program, the first stateless recovery agent also being operable to utilize work identifiers from the recovery log in order to perform recovery operations in connection with a first associated component;
wherein the supervisory program generates tokens used to reference stored work identifiers associated with recoverable components;
the recovery agent is also operable to utilize a token received from the supervisory program as well as the stored work identifiers in the recovery log in order to perform recoverable operations in connection with at least the first associated component;
wherein the supervisory program filters duplicate recovery agent references found in recovery scopes stored within the recovery log;
wherein the supervisory program builds a recovery agent iterator that utilizes at least a first ordered list of agents to be dispatched to perform recovery for an identified failure scope that may be in need of recovery successively, upon the recovery agent reporting back from an earlier recovery assignment.
Dependent claims: 3
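The duplicate filtering and the agent iterator recited in claim 2 can be pictured with a short Python sketch. This is an illustration only, not the patented implementation: the function names and the assumed log layout (a failure scope token mapped to a list of agent references) are hypothetical.

```python
from typing import Dict, Iterator, List


def ordered_unique_agents(recovery_log: Dict[str, List[str]],
                          failure_scope: str) -> List[str]:
    """Drop duplicate agent references while keeping first-seen order."""
    seen = set()
    ordered: List[str] = []
    for agent_ref in recovery_log.get(failure_scope, []):
        if agent_ref not in seen:
            seen.add(agent_ref)
            ordered.append(agent_ref)
    return ordered


def agent_iterator(recovery_log: Dict[str, List[str]],
                   failure_scope: str) -> Iterator[str]:
    """Yield agents one at a time from the ordered, de-duplicated list.

    A generator produces the next agent only when the caller asks for it,
    which models dispatching the next agent after the previous one has
    reported back.
    """
    yield from ordered_unique_agents(recovery_log, failure_scope)
```

For a log such as `{"scope-A": ["db_agent", "queue_agent", "db_agent"]}`, iterating over `agent_iterator(log, "scope-A")` visits `db_agent` and then `queue_agent`, each exactly once.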
4. A recovery scope management method for use in a multiple process computer system having a plurality of server regions assigned to a shared resource group which regions include a dynamic multiple address space, the method being for assisting in the recovery and restoration of protected resources available to the shared resource group, the method comprising:
(a) installing a supervisory program for directing the recovery of protected resources whose processing was abnormally terminated, and initializing the supervisory program when a first address space of a server region within the computer system is started;
(b) as each instance of a recoverable component is initialized, registering that component with the supervisory program by providing the supervisory program a reference to a stateless recovery agent that is able to perform recovery functions associated with that instance of the recoverable component;
(c) for each recoverable component, creating work identifiers for recoverable operations to be performed if an abnormal termination occurs using at least in part information associated with the reference or token provided to the recoverable component;
(d) employing the supervisory program to identify and group multiple instances of the same recoverable component;
(e) upon the occurrence of the abnormal termination, having the supervisory program determine a recovery scope by examining stored data for failure scopes to identify data processing operations that may not have been completed and have not been recovered, which identified data processing operations are referred to as incomplete failure scopes, and
(f) as incomplete failure scopes are identified, having the supervisory program obtain the reference to each stateless recovery agent associated with such scopes;
(g) after obtaining each such reference, using the supervisory program to call each such stateless recovery agent and providing such agent with the token representing the recovery scope; and
(h) passing control to the recovery agent to allow the recovery agent to perform recovery for the specified recovery scope it received via the token, including any recovery scopes nested within it.
Dependent claims: 5, 6, 7, 8, 9, 10, 11, 12
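Steps (e) through (h) can be sketched in Python as follows. This is a hedged illustration rather than the claimed method itself: `FailureScope`, `nested_tokens`, `example_agent`, and `run_recovery` are hypothetical names, the scope table is an in-memory dictionary, and nested scopes are assumed to form a tree linked by a `parent` field, with parents recorded before their children.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class FailureScope:
    token: str
    agent_ref: str
    parent: Optional[str] = None   # token of the enclosing scope, if any
    recovered: bool = False


ScopeTable = Dict[str, FailureScope]
Agent = Callable[[str, ScopeTable], List[str]]


def nested_tokens(scopes: ScopeTable, root: str) -> List[str]:
    """Return root followed by every scope nested within it, depth-first."""
    result = [root]
    for scope in scopes.values():
        if scope.parent == root:
            result.extend(nested_tokens(scopes, scope.token))
    return result


def example_agent(token: str, scopes: ScopeTable) -> List[str]:
    """Stateless agent: all it needs arrives via the token and stored data."""
    handled = []
    for t in nested_tokens(scopes, token):
        if not scopes[t].recovered:
            scopes[t].recovered = True   # stands in for real recovery work
            handled.append(t)
    return handled


def run_recovery(scopes: ScopeTable, agents: Dict[str, Agent]) -> List[str]:
    """Find incomplete scopes, look up each agent, and pass it the token."""
    handled: List[str] = []
    for scope in list(scopes.values()):
        if scope.recovered:
            continue   # complete, or already covered as a nested scope
        agent = agents[scope.agent_ref]
        handled.extend(agent(scope.token, scopes))
    return handled
```

In this sketch the supervisory loop never recovers anything itself; it only selects scopes and hands over control, and the agent covers the scope named by the token together with everything nested beneath it.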
13. A computer program product, for and to be used within a multiple processor computer system having a shared dynamic multiple address space server environment, in order to implement a nested recovery scope management method which employs at least one stateless agent, the software product comprising:
a storage medium readable by at least one processing circuit and storing instructions for execution by the processing circuit for performing a method comprising the steps of:
(a) installing a supervisory program for directing the recovery of protected resources whose processing was abnormally terminated, and initializing the supervisory program when a first address space of a server within the computer system is started;
(b) as each instance of a recoverable component is initialized, registering that component with the supervisory program, by providing the supervisory program with a reference to a stateless recovery agent that is able to perform recovery functions associated with that instance of the recoverable component;
(c) for each recoverable component, creating work identifiers for recoverable operations to be performed if an abnormal termination occurs using at least in part information associated with the reference provided to the recoverable component;
(d) employing the supervisory program to identify and group multiple instances of the same recoverable component;
(e) upon the occurrence of the abnormal termination, having the supervisory program determine a recovery scope by examining stored data for failure scopes to identify data processing operations that may not have been completed and have not been recovered, which identified data processing operations are referred to as incomplete failure scopes, and
(f) as incomplete failure scopes are identified, having the supervisory program obtain the reference to each stateless recovery agent associated with such scopes;
(g) after obtaining each such reference, using the supervisory program to call each such stateless recovery agent and providing such agent with a token to the recovery scope; and
(h) passing control to the recovery agent to allow the recovery agent to perform recovery for the specified recovery scope it received via the token, including any recovery scopes nested within it.
Specification