Durable atomic storage update manager
First Claim
1. A computer system, comprising:
a central processor capable of executing a data processing program;
a volatile memory; and
a storage device for persistently storing data and said data processing program;
said storage device storing said data processing program as a plurality of agent programs, each said agent program having an associated transaction or subtransaction to be executed;
said storage device further storing:
an agent-callable first service program for establishing a sequential relationship among each said transaction and subtransaction within a fault tolerant data structure;
an agent-callable second service program for storing on said storage device a log containing information necessary to enable recovery from a transaction or subtransaction encountering a fault;
at least a subset of said agent programs each including program portions that when executed by said central processing unit executes said associated transaction or subtransaction by calling said first and second agent-callable service programs, wherein said execution of said associated transaction or subtransaction can generate a further subtransaction, and wherein said second agent-callable service program stores recovery information in said log when data modification for said transaction or subtransaction is completed;
distinct first agent-specific fault recovery procedures, for each of a plurality of said agent programs, for redoing data modifications produced by execution of said each agent program; and
distinct second agent-specific fault recovery procedures, for each of said plurality of said agent programs, for undoing said data modifications produced by execution of said each agent program; and
a fault recovery program for execution by said central processing unit when said data processing program is recovering from a fault during execution of said data processing program, said fault recovery program including program portions for sequentially executing said first agent-specific fault recovery procedures for those of said agent programs having recovery information stored in said log when predefined fault recovery criteria are met and for sequentially executing said second agent-specific fault recovery procedures for those of said agent programs having recovery information stored in said log when said predefined fault recovery criteria are not met.
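The recovery structure recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the names `LogRecord`, `Agent`, and `recover`, the dictionary used as a stand-in for persistent storage, and the choice to run undo procedures in reverse log order are all assumptions made for illustration. The claim itself only says the procedures are executed sequentially.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class LogRecord:
    """Hypothetical log entry written once an agent's data modification completes."""
    agent: str     # name of the agent program that wrote the record
    payload: dict  # recovery information for that agent's procedures


@dataclass
class Agent:
    """Hypothetical pairing of the two agent-specific recovery procedures."""
    redo: Callable[[dict, dict], None]  # "first" procedure: reapply the modification
    undo: Callable[[dict, dict], None]  # "second" procedure: reverse the modification


def recover(log: List[LogRecord], agents: Dict[str, Agent],
            store: dict, criteria_met: bool) -> None:
    """Run redo for every logged agent if the criteria are met, otherwise undo."""
    if criteria_met:
        for record in log:
            agents[record.agent].redo(store, record.payload)
    else:
        # Assumption: undo runs newest-first so earlier values are restored last.
        for record in reversed(log):
            agents[record.agent].undo(store, record.payload)


def set_redo(store: dict, payload: dict) -> None:
    # Reapply the logged write.
    store[payload["key"]] = payload["new"]


def set_undo(store: dict, payload: dict) -> None:
    # Restore the value that was present before the logged write.
    store[payload["key"]] = payload["old"]


AGENTS = {"set": Agent(redo=set_redo, undo=set_undo)}
```

In this sketch the recovery routine never inspects a payload; it only dispatches to the procedures each agent supplied, which mirrors the separation between agents and log described in the abstract.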
Abstract
According to a first aspect of the invention, a DASUM (Durable Atomic Storage Update Manager) provides an extensible framework assuring complex changes to persistent storage of data within a computer system, including a distributed computer system. During normal runtime, modifications to permanent storage are broken down and organized as a plurality of simpler transactions. These simpler transactions are accomplished atomically by executing associated agents within the computer program under execution. Each agent need only have the ability to complete its own process, and need not be able to deal with side effects from other transactions. Without needing to know what steps may be required, each agent supplies three agent-specific procedures that can be called during recovery from a fault. The DASUM provides seven services that, during normal transaction execution, can store information in a logger necessary for recovery from a fault. The recovery information stored in the logger can be used to replicate a dynamic tree-like fault tolerant update set that is maintained by the DASUM on an atomic basis. According to a second aspect of the invention, the DASUM provides recovery from the effects of incompletely executed transactions in the event of a fault. During fault recovery, the DASUM calls the agent-specific procedures, as needed, using the recovery and recovery sequence information stored during normal transaction execution. The present invention advantageously permits separating the agents from the logger, simplifies logger design, and improves durability of the data to be persistently stored.
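The idea that logged information can be used to replicate a tree-like update set can be sketched as follows. This is an illustration under stated assumptions, not the DASUM's actual data structure: the `(txn_id, parent_id)` entry format, the names `Node`, `rebuild_update_set`, and `sequence`, and the depth-first ordering are all hypothetical, and the sketch assumes a parent's entry is always logged before its children's entries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical update-set node for one transaction or subtransaction."""
    txn_id: str
    children: List["Node"] = field(default_factory=list)


def rebuild_update_set(entries: List[Tuple[str, Optional[str]]]) -> List[Node]:
    """Rebuild the tree from (txn_id, parent_id) log entries.

    A parent_id of None marks a top-level transaction. Assumes every parent
    appears in the log before any of its subtransactions.
    """
    nodes: Dict[str, Node] = {}
    roots: List[Node] = []
    for txn_id, parent_id in entries:
        node = Node(txn_id)
        nodes[txn_id] = node
        if parent_id is None:
            roots.append(node)
        else:
            nodes[parent_id].children.append(node)
    return roots


def sequence(roots: List[Node]) -> List[str]:
    """Return transaction ids in depth-first order, parents before children."""
    order: List[str] = []
    stack = list(reversed(roots))
    while stack:
        node = stack.pop()
        order.append(node.txn_id)
        # Push children reversed so the first-logged child is visited first.
        stack.extend(reversed(node.children))
    return order
```

A recovery pass could walk the list returned by `sequence` and call the matching agent-specific procedure for each id. Whether the DASUM orders subtransactions this way is not stated in the abstract.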
140 Citations
2 Claims
Specification