Durable atomic storage update manager
First Claim
1. In a computer system including a central processor, a volatile memory, and a storage device for persistently storing data, and capable of executing a program, a method for ensuring proper storage of data intended to be persistently stored, the method comprising the following steps during normal runtime execution:
(1) decomposing the program into a plurality of agents;
(2) associating a chosen said agent with a transaction or subtransaction to be executed;
(3) providing to each said agent an agent-callable first service for establishing a sequential relationship among each said transaction and subtransaction within a fault tolerant structure;
(4) providing to each said agent an agent-callable second service for storing on said storage device a log containing information necessary to enable recovery from a transaction or subtransaction encountering a fault;
(5) each said agent executing said associated transaction or subtransaction by calling said first and second agent-callable services, wherein said associated transaction or subtransaction can generate a further subtransaction during execution, and wherein said second agent-callable service stores recovery information in said log when data modification for said associated transaction or subtransaction is completed;
(6) atomically maintaining each said transaction or subtransaction in said fault tolerant structure that contains information relating each said agent with a said associated transaction or subtransaction;
(7) providing each of a plurality of said agents with distinct first agent-specific fault recovery procedures for redoing data modifications produced by execution of said each agent and distinct second agent-specific fault recovery procedures for undoing said data modifications produced by execution of said each agent; and
(8) when recovering from a fault during execution of said program, sequentially executing said first agent-specific fault recovery procedures for those of said agents having recovery information stored in said log when predefined fault recovery criteria are met and sequentially executing said second agent-specific fault recovery procedures for those of said agents having recovery information stored in said log when said predefined fault recovery criteria are not met.
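The steps of the claim can be read as a small protocol. The following is a minimal Python sketch under stated assumptions, not the patented implementation. The names `Agent`, `Dasum`, `begin`, `log_done`, and `recover` are hypothetical, an in-memory list stands in for the log on the storage device, a boolean flag stands in for the "predefined fault recovery criteria", and running undo in reverse log order is an assumption (the claim only says "sequentially").

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Agent:
    """Hypothetical agent carrying its own redo and undo procedures (step 7)."""
    name: str
    redo: Callable[[dict, dict], None]  # reapply a logged modification
    undo: Callable[[dict, dict], None]  # reverse a logged modification


@dataclass
class Dasum:
    """Hypothetical manager offering the two agent-callable services."""
    agents: Dict[str, Agent] = field(default_factory=dict)
    log: List[dict] = field(default_factory=list)        # stand-in for the on-disk log
    structure: Dict[int, dict] = field(default_factory=dict)  # txn -> agent and parent
    next_id: int = 1

    def register(self, agent: Agent) -> None:
        self.agents[agent.name] = agent

    def begin(self, agent_name: str, parent_id: Optional[int] = None) -> int:
        """First service: place a (sub)transaction in the sequencing structure."""
        txn_id = self.next_id
        self.next_id += 1
        self.structure[txn_id] = {"agent": agent_name, "parent": parent_id}
        return txn_id

    def log_done(self, txn_id: int, info: dict) -> None:
        """Second service: record recovery info once the modification is complete."""
        agent_name = self.structure[txn_id]["agent"]
        self.log.append({"txn": txn_id, "agent": agent_name, "info": info})

    def recover(self, store: dict, criteria_met: bool) -> None:
        """Step 8: redo logged work if the criteria are met, otherwise undo it."""
        if criteria_met:
            for rec in self.log:
                self.agents[rec["agent"]].redo(store, rec["info"])
        else:
            # Assumption: undo runs newest-first so later changes are reversed first.
            for rec in reversed(self.log):
                self.agents[rec["agent"]].undo(store, rec["info"])


def kv_redo(store: dict, info: dict) -> None:
    store[info["key"]] = info["new"]


def kv_undo(store: dict, info: dict) -> None:
    if info["old"] is None:
        store.pop(info["key"], None)
    else:
        store[info["key"]] = info["old"]


dasum = Dasum()
dasum.register(Agent("kv", kv_redo, kv_undo))
store = {"a": 1}

t1 = dasum.begin("kv")                 # top-level transaction
store["a"] = 2
dasum.log_done(t1, {"key": "a", "old": 1, "new": 2})

t2 = dasum.begin("kv", parent_id=t1)   # subtransaction generated during execution
store["b"] = 5
dasum.log_done(t2, {"key": "b", "old": None, "new": 5})

rolled_back = dict(store)
dasum.recover(rolled_back, criteria_met=False)   # undo path

replayed = {"a": 1}
dasum.recover(replayed, criteria_met=True)       # redo path
```

In this sketch the manager never inspects the contents of `info`; only the agent's own `redo` and `undo` interpret it, which is one way to picture the separation between agents and the log that the claim describes.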
Abstract
According to a first aspect of the invention, a DASUM (Durable Atomic Storage Update Manager) provides an extensible framework assuring complex changes to persistent storage of data within a computer system, including a distributed computer system. During normal runtime, modifications to permanent storage are broken down and organized as a plurality of simpler transactions. These simpler transactions are accomplished atomically by executing associated agents within the computer program under execution. Each agent need only have the ability to complete its own process, and need not be able to deal with side effects from other transactions. Without needing to know what steps may be required, each agent supplies two agent-specific procedures that can be called during recovery from a fault. The DASUM provides two services that, during normal transaction execution, can store information in a logger necessary for recovery from a fault. The recovery information stored in the logger can be used to replicate a dynamic tree-like fault tolerant update set that is maintained by the DASUM on an atomic basis. According to a second aspect of the invention, the DASUM provides recovery from the effects of incompletely executed transactions in the event of a fault. During fault recovery, the DASUM calls the agent-specific procedures, as needed, using the recovery and recovery sequence information stored during normal transaction execution. The present invention advantageously permits separating the agents from the logger, simplifies logger design, and improves durability of the data to be persistently stored.
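The abstract says the logged recovery information can be used to replicate the tree-like update set. As a rough illustration only, the Python sketch below rebuilds parent and child links from log records and walks them depth-first to obtain one possible recovery sequence. The record layout `(txn_id, parent_id, agent_name)`, the function names, and the agent names `directory`, `allocator`, and `inode` are all hypothetical; the source does not specify a log format or a traversal order.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical log record: (txn_id, parent_id, agent_name); parent_id is None for a root.
LogRecord = Tuple[int, Optional[int], str]


def rebuild_update_set(records: List[LogRecord]) -> Dict[Optional[int], List[int]]:
    """Rebuild parent -> children links, keeping children in logged order."""
    children: Dict[Optional[int], List[int]] = defaultdict(list)
    for txn_id, parent_id, _agent in records:
        children[parent_id].append(txn_id)
    return children


def recovery_sequence(records: List[LogRecord]) -> List[Tuple[int, str]]:
    """Depth-first walk giving one possible order for calling agent procedures."""
    agent_of = {txn_id: agent for txn_id, _parent, agent in records}
    children = rebuild_update_set(records)
    order: List[Tuple[int, str]] = []

    def visit(txn_id: int) -> None:
        order.append((txn_id, agent_of[txn_id]))
        for child in children.get(txn_id, []):
            visit(child)

    for root in children.get(None, []):
        visit(root)
    return order


records: List[LogRecord] = [
    (1, None, "directory"),   # top-level transaction
    (2, 1, "allocator"),      # subtransaction of 1
    (3, 1, "inode"),          # subtransaction of 1
    (4, 3, "allocator"),      # subtransaction generated by 3
]
sequence = recovery_sequence(records)
```

Because the tree is derived entirely from the log, a recovery pass in this sketch needs no state that was held only in volatile memory at the time of the fault.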
3 Claims
Claim 1 is the independent claim set out in full under First Claim above; claims 2 and 3 depend on it.
Specification