Methods and apparatus for fault recovery
Abstract
An error recovery system for computers is shown in which an error table and an action table are used to control the parameters of the error recovery system. The error table has one entry for each possible error and contains a count increment for each corrective action that might be taken to correct that error. The action table includes an error count threshold for each possible corrective action. The system operates to accumulate error count increments against possible actions and, when the corresponding threshold is exceeded, initiates the corrective action. Since the table contents are easily modified, the recovery strategy can be updated and modified by the computer user without changing the recovery system programs.
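The mechanism described in the abstract can be sketched as follows. This is a minimal illustration that assumes Python dictionaries for the two tables. The error names, action names, increments, and thresholds are hypothetical, and so is the choice to reset a counter once its action fires, which the text shown here does not specify.

```python
from collections import defaultdict

# Hypothetical error table: error -> {corrective action: count increment}.
ERROR_TABLE = {
    "parity_error": {"retry": 1, "reset_device": 1},
    "timeout": {"retry": 2, "reset_device": 1},
}

# Hypothetical action table: corrective action -> error count threshold.
ACTION_TABLE = {"retry": 2, "reset_device": 4}


class RecoveryEngine:
    """Accumulates error count increments per action and reports triggered actions."""

    def __init__(self, error_table, action_table):
        self.error_table = error_table
        self.action_table = action_table
        self.counts = defaultdict(int)  # accumulated count per action

    def report(self, error):
        """Record one occurrence of `error`; return the actions whose threshold is now exceeded."""
        triggered = []
        for action, increment in self.error_table[error].items():
            self.counts[action] += increment
            if self.counts[action] > self.action_table[action]:
                triggered.append(action)
                self.counts[action] = 0  # assumption: restart counting after the action fires
        return triggered
```

Because the recovery policy lives entirely in `ERROR_TABLE` and `ACTION_TABLE`, changing an increment or a threshold changes the behavior without touching `RecoveryEngine`.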
8 Claims
-
1. A computer error recovery system comprising
an error table for storing each identified error in said computer, an action list of all actions to be taken to correct the identified error, and an error count increment value for each said action on said action list,
an action table for storing, for each possible corrective action to be taken, an error count threshold value which, when exceeded, triggers the action corresponding to said error count threshold,
means, responsive to an identified error, for accumulating error count increments, and
means, utilizing said error table and said action table and responsive to accumulated error count increments, for automatically initiating each corrective action for which said accumulated count exceeds the count threshold.
-
4. A method for recovering from errors occurring in a computer system, said method comprising the steps of
storing an easily editable error table in said computer, said error table including, for each identified error in said computer, an action list of all actions to be taken to correct the identified error, and an error count increment for that action,
storing an easily editable action table in said computer, said action table including, for each possible corrective action to be taken, an error count threshold which, when exceeded, triggers an action corresponding to said error count threshold,
accumulating the error counts for each identified error against all actions on said action list for said each identified error, and
automatically initiating a particular corrective action when the accumulated count for that particular action exceeds the count threshold for said particular action.
-
7. A table-driven error recovery subsystem for a data processing system comprising
a user-editable error table including each possible error that might occur in said system,
a user-editable action table including each possible corrective action that might be useful in said system, and
means automatically responsive to the contents of said tables for carrying out a multistaged recovery strategy for said data processing system.
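One way to picture user-editable tables driving a multistaged strategy is sketched below. It assumes the tables are held as JSON text, and it builds the stages by giving successive actions higher thresholds. The JSON representation, the error and action names, and the numeric values are all hypothetical and are not taken from the claims.

```python
import json

# Hypothetical table contents, held as JSON text so that a user could edit
# them without changing the recovery code.
ERROR_TABLE_JSON = (
    '{"disk_read_error": {"retry_io": 1, "restart_driver": 1, "switch_to_spare": 1}}'
)
ACTION_TABLE_JSON = '{"retry_io": 0, "restart_driver": 2, "switch_to_spare": 4}'


def run_recovery(error_table_json, action_table_json, errors):
    """Return (error number, action) pairs for every action initiated on a stream of errors."""
    error_table = json.loads(error_table_json)
    thresholds = json.loads(action_table_json)
    counts = {action: 0 for action in thresholds}  # accumulated count per action
    log = []
    for n, error in enumerate(errors, start=1):
        for action, increment in error_table[error].items():
            counts[action] += increment
            if counts[action] > thresholds[action]:
                log.append((n, action))
    return log


def first_stage(log):
    """Map each action to the error number at which it was first initiated."""
    stages = {}
    for n, action in log:
        stages.setdefault(action, n)
    return stages
```

With these values, five consecutive `disk_read_error` reports first initiate `retry_io` on error 1, `restart_driver` on error 3, and `switch_to_spare` on error 5. Editing only the JSON, for example lowering the `switch_to_spare` threshold to 2, moves that stage to error 3 with no change to `run_recovery`.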
Specification