Method and apparatus for restarting a computing system
Abstract
A programming method and structure for operating a computing system to restart a total subsystem, or a subset of that subsystem, to an operable state following a total interruption (system failure or termination, either normal or abnormal). The subsystem isolates inoperable resources while permitting the others to resume by independently maintaining, in a first structure, a record of the completion state of a resource manager's recovery responsibility with respect to each interrupted work unit and, in a second structure, the operational states and recovery log interest scopes of each resource manager. The completion state can be influenced by whether or not a resource manager is started and, if it is restarted, by the presence or absence of a resource subset required to accomplish the work unit recovery.
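The two structures described in the abstract can be pictured with a minimal Python sketch. Every name here (`WorkUnitEntry`, `ResourceManagerEntry`, `plan_restart`) is a hypothetical illustration rather than a term from the patent, and log locations are modeled as plain integer offsets purely as an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class WorkUnitEntry:
    """First structure: one entry per interrupted work unit (hypothetical layout)."""
    work_unit_id: str
    log_start: int  # assumed: integer offset of this work unit's log section
    # resource manager name -> True once its recovery responsibility is complete
    responsibility_done: dict = field(default_factory=dict)


@dataclass
class ResourceManagerEntry:
    """Second structure: one entry per resource manager (hypothetical layout)."""
    name: str
    operable: bool  # assumed: a two-valued operational state
    log_start: int  # assumed: offset of the log section this manager needs


def plan_restart(work_units, managers):
    """Split managers into restart-now and deferred, and list the recovery
    responsibilities that stay pending because their manager is deferred."""
    restart = sorted(m.name for m in managers if m.operable)
    deferred = sorted(m.name for m in managers if not m.operable)
    pending = {}
    for wu in work_units:
        owed = sorted(
            rm for rm, done in wu.responsibility_done.items()
            if not done and rm in deferred
        )
        if owed:
            pending[wu.work_unit_id] = owed
    return restart, deferred, pending
```

In this sketch an inoperable manager does not block the others: its outstanding responsibilities are simply carried forward in `pending` until it is restarted.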
16 Claims
1. Apparatus for restarting a computing system following interruption, including a plurality of resource managers that manage resource collections such as data bases, non-volatile storage means containing a recovery log on which are recorded checkpointed states and images of changes resulting from the execution of work unit instructions, a subset of said resource managers having responsibilities in the execution of work unit instructions and the recovery of work units during restart, said apparatus comprising:
first data structure means for recording for an interrupted work unit the location of a respective section of the recovery log which contains information of its activities and the recovery responsibility of each resource manager to the work unit;
second data structure means for recording for each resource manager its operational state and the location of a respective different section of the recovery log which contains information that particular resource manager needs to perform its recovery responsibility; and
recovery means which receives and is responsive to information recorded in said first and second data structure means for restarting selected resource managers following interruption and deferring the restarting of other resource managers.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A method for restarting a computing system following interruption, including a plurality of resource managers that manage resource collections such as data bases, a recovery log in non-volatile storage on which are recorded checkpointed states and images of changes resulting from the execution of work unit instructions, a subset of said resource managers having responsibilities in the execution of work unit instructions and the recovery of work units during restart, said method comprising the steps of:
recording in a first data structure means for an interrupted work unit the location of a respective section of the recovery log which contains information of its activities and the recovery responsibility of each resource manager to the interrupted work unit;
recording in a second data structure means for each resource manager the operational state of such resource manager and the location of a respective section of the recovery log which contains information such resource manager needs to perform its recovery responsibility; and
responsive to information recorded in said first and second data structure means, restarting selected resource managers following interruption while deferring the restarting of other resource managers.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A method for facilitating the restarting of a computing system that is interrupted from time to time, the system having a plurality of resource managers that manage resource collections such as data bases, a recovery log in non-volatile storage on which are recorded checkpointed states and images of changes resulting from the execution of work unit instructions, a subset of said resource managers having responsibilities in the execution of work unit instructions and the recovery of work units after interruption, said method comprising the steps of:
(a) recording for each interrupted work unit the location of a respective section of the recovery log which contains information of its activities and the recovery responsibility of each resource manager to the work unit;
(b) recording for each resource manager of said subset; (1) its operational state and (2) the location of a respective different section of the recovery log which contains the information that particular resource manager needs to perform its recovery responsibility; and
(c) following each interrupt, restarting at least one resource manager in response to the information recorded in steps (a) and (b).
Dependent claims: 16
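One way to picture how the log locations recorded in steps (a) and (b) might feed step (c) is the sketch below. It assumes, without support from the claim text, that log locations are integer offsets and that a restart routine wants the earliest offset relevant to the managers it is restarting; `log_scan_start` and its parameters are hypothetical names.

```python
def log_scan_start(work_unit_starts, manager_starts, restarting):
    """Return the earliest log offset relevant to the restarting managers.

    work_unit_starts: {work_unit_id: (log_offset, set_of_responsible_managers)}
    manager_starts:   {manager_name: log_offset}
    restarting:       set of manager names being restarted now
    Returns None when no recorded section concerns a restarting manager.
    """
    # Sections the restarting managers themselves registered interest in.
    offsets = [manager_starts[name] for name in restarting if name in manager_starts]
    # Sections of interrupted work units that involve a restarting manager.
    offsets += [
        offset
        for offset, responsible in work_unit_starts.values()
        if responsible & restarting
    ]
    return min(offsets) if offsets else None
```

Under these assumptions, log sections of interest only to deferred managers do not pull the scan start earlier, which is one plausible benefit of recording the locations per manager and per work unit.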
Specification