Hybrid agent-oriented object model to provide software fault tolerance between distributed processor nodes
Abstract
An apparatus and method implement an extended distributed recovery block fault tolerance scheme in a computer system. The computer system includes a supervisory node, an active node and a standby node. Each of the nodes has a primary routine, an alternate routine and an acceptance test for testing the output of the routines. Each node also includes a device driver, a monitor and a node manager for determining the operational configuration of the node. The supervisory node coordinates the operation of the active and standby nodes. The primary and alternate routines are implemented with an application task through a plurality of agent objects operating as finite state machines. A reliable data link extends between the monitors of the active and standby nodes.
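The recovery block arrangement summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Node`, `Role`, `supervise`, the square-root routines, and the role-swap policy are all assumptions made for illustration, and the device driver, monitor, and data link are not modeled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Tuple


class Role(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


@dataclass
class Node:
    """Hypothetical node holding both routines and an acceptance test."""
    name: str
    role: Role
    primary: Callable[[float], float]
    alternate: Callable[[float], float]
    acceptance_test: Callable[[float, float], bool]

    def run(self, x: float) -> Tuple[float, bool]:
        # The node manager's decision is modeled as a simple role check:
        # an active node runs the primary routine, a standby node the alternate.
        routine = self.primary if self.role is Role.ACTIVE else self.alternate
        y = routine(x)
        return y, self.acceptance_test(x, y)


def supervise(active: Node, standby: Node, x: float) -> Tuple[float, Node, Node]:
    """Assumed supervisory policy: if the active node's result fails the
    acceptance test, use the standby's result and swap the two roles."""
    y_active, ok_active = active.run(x)
    y_standby, ok_standby = standby.run(x)
    if ok_active:
        return y_active, active, standby
    if ok_standby:
        active.role, standby.role = Role.STANDBY, Role.ACTIVE
        return y_standby, standby, active
    raise RuntimeError("both routines failed the acceptance test")


def halving_sqrt(x: float) -> float:
    # Deliberately faulty primary routine, used only to trigger a failover.
    return x / 2.0


def newton_sqrt(x: float) -> float:
    # Alternate routine: Newton's iteration for the square root.
    guess = x if x > 1.0 else 1.0
    for _ in range(50):
        guess = 0.5 * (guess + x / guess)
    return guess


def sqrt_ok(x: float, y: float) -> bool:
    # Acceptance test: squaring the result must reproduce the input.
    return abs(y * y - x) < 1e-6


node_a = Node("A", Role.ACTIVE, halving_sqrt, newton_sqrt, sqrt_ok)
node_b = Node("B", Role.STANDBY, halving_sqrt, newton_sqrt, sqrt_ok)
result, new_active, new_standby = supervise(node_a, node_b, 9.0)
```

In this run the primary result fails the acceptance test, so the standby's alternate result is used and node B takes over the active role. Running both nodes on every input is a simplification chosen for the sketch.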
50 Claims
1. A computer system implementing an extended distributed recovery block fault tolerance scheme comprising a supervisory node, an active node and a standby node wherein each said active and standby node comprises:
a primary routine for executing a software function;
an alternate routine for executing said software function;
an acceptance test routine for testing the output of said primary routine and providing a control signal in response thereto;
a device driver for receiving said control signal;
a monitor for communicating state information with one or more active or standby nodes, and
a node manager for determining the operational configuration of said node, such that said primary routine is executed in response to a determination that said node is in an active state and said alternate routine is executed in response to a determination that said node is in a standby state, and wherein said supervisory node coordinates the operation of said active node and said standby node,
the improvement wherein the primary and alternate routines of one of said active or standby node are implemented with an application task comprising a plurality of agent objects each operating as a finite state machine operating in either a primary mode executing said primary routine or in an alternate mode executing said alternate routine.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 48, 49.
16. A computer system implementing an extended distributed recovery block fault tolerance scheme comprising a supervisory node, an active node, and a standby node,
the improvement wherein primary and alternate routines of said active and standby nodes are each implemented with a plurality of dedicated application tasks each comprising a plurality of agent objects each operating as a finite state machine operating in either a primary mode executing said primary routine or in an alternate mode executing said alternate routine, and wherein the determination of the mode of operation of the agents in a one of said plural dedicated application tasks is determined independently of the mode of operation of the agents in the other of said plural dedicated application tasks.
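The independence recited in claim 16 can be pictured with a short Python sketch in which each application task owns several agents, each agent is a two-state machine, and switching one task's mode leaves the other task untouched. The class names `Mode`, `Agent`, `ApplicationTask`, the helper `make_task`, and the placeholder routines are hypothetical and are not taken from the patent.

```python
from enum import Enum


class Mode(Enum):
    PRIMARY = "primary"
    ALTERNATE = "alternate"


class Agent:
    """Hypothetical agent: a two-state machine that selects which routine runs."""

    def __init__(self, name, primary, alternate):
        self.name = name
        self._routines = {Mode.PRIMARY: primary, Mode.ALTERNATE: alternate}
        self.mode = Mode.PRIMARY

    def switch(self, mode):
        # State transition of the finite state machine.
        self.mode = mode

    def step(self, value):
        # Execute the routine that belongs to the current state.
        return self._routines[self.mode](value)


class ApplicationTask:
    """Hypothetical task: owns its agents and sets their mode on its own."""

    def __init__(self, name, agents):
        self.name = name
        self.agents = list(agents)

    def set_mode(self, mode):
        # Only this task's agents change state; other tasks are not consulted.
        for agent in self.agents:
            agent.switch(mode)

    def step(self, value):
        return [agent.step(value) for agent in self.agents]


def make_task(name):
    # Two agents per task; the routines are placeholders for illustration.
    return ApplicationTask(name, [
        Agent(name + "-1", primary=lambda v: v + 1, alternate=lambda v: v + 100),
        Agent(name + "-2", primary=lambda v: v * 2, alternate=lambda v: v * 200),
    ])


task_x = make_task("x")
task_y = make_task("y")
task_x.set_mode(Mode.ALTERNATE)  # task_y remains in primary mode
```

After the last line, every agent of `task_x` executes its alternate routine while every agent of `task_y` still executes its primary routine.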
26. A computer system implementing an extended distributed recovery block fault tolerance scheme comprising a supervisory node, an active node, and a standby node,
the improvement wherein the primary and alternate routines of said active and standby nodes are each implemented with a plurality of dedicated application tasks each comprising a plurality of agent objects each operating as a finite state machine operating in either a primary mode executing said primary routine or in an alternate mode executing said alternate routine, and wherein each of said agents is implemented with an attachment list comprising data common to the attachment list of at least one other agent.
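One possible reading of the attachment list in claim 26 is a per-agent list of references to data items, some of which are shared with another agent. The Python sketch below illustrates only that reading; `Attachment`, `Agent.attach`, `Agent.lookup`, and `common_attachments` are hypothetical names, and the claim text shown here does not define the structure in this detail.

```python
class Attachment:
    """Hypothetical data item that can appear in several attachment lists."""

    def __init__(self, key, value):
        self.key = key
        self.value = value


class Agent:
    """Hypothetical agent carrying an attachment list of data references."""

    def __init__(self, name):
        self.name = name
        self.attachments = []

    def attach(self, item):
        # Store a reference, not a copy, so shared items stay in sync.
        self.attachments.append(item)

    def lookup(self, key):
        for item in self.attachments:
            if item.key == key:
                return item
        return None


def common_attachments(first, second):
    # Items present (by identity) in both agents' attachment lists.
    return [a for a in first.attachments
            if any(a is b for b in second.attachments)]


link_state = Attachment("link_state", "up")    # shared between both agents
private_count = Attachment("retry_count", 0)   # held by one agent only

agent_a = Agent("a")
agent_b = Agent("b")
agent_a.attach(link_state)
agent_a.attach(private_count)
agent_b.attach(link_state)

link_state.value = "down"  # an update is visible through either agent
shared = common_attachments(agent_a, agent_b)
```

Holding references rather than copies is a design choice of this sketch: it lets both agents observe the same update without any explicit synchronization step.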
Specification