Fault tolerant computer system
Abstract
A method and apparatus for providing a fault-tolerant backup system such that, if a primary processing system fails, a replicated system can take over without interruption. The invention provides a software solution for the backup system. Two servers, a primary and a secondary, are connected via a communications channel. Each server runs an operating system, which the present invention divides into two "engines." The I/O engine is responsible for handling and receiving all data and asynchronous events on the system; it controls and interfaces with physical devices and device drivers. The operating system (OS) engine operates on data received from the I/O engine. All events or data that can change the state of the operating system are channeled through the I/O engine and converted to a message format. The I/O engines on the two servers coordinate with each other and provide the same sequence of messages to the OS engines. The messages are placed in a message queue accessed by the OS engine. Therefore, regardless of the timing of the events (i.e., even when they are asynchronous), the OS engine receives all events sequentially through a continuous sequential stream of input data. As a result, the OS engine is a finite state automaton with a one-dimensional input "view" of the rest of the system, and the states of the OS engines on the primary and secondary servers will converge.
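The replication idea described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the class names `IOEngine` and `OSEngine`, the message tuple layout, and the toy "write"/"delete" events are assumptions made only for this example. It shows an I/O engine converting each event into a sequenced message, placing the same message in both OS engines' queues, and two deterministic OS engines arriving at the same state.

```python
from collections import deque


class OSEngine:
    """Hypothetical OS engine: a deterministic state machine fed only by its queue."""

    def __init__(self):
        self.queue = deque()  # the engine's only input: a sequential message stream
        self.state = {}       # toy stand-in for operating system state

    def run(self):
        # Execute queued messages strictly in order; no other input is consulted.
        while self.queue:
            _seq, op, key, value = self.queue.popleft()
            if op == "write":
                self.state[key] = value
            elif op == "delete":
                self.state.pop(key, None)


class IOEngine:
    """Hypothetical I/O engine: converts events to messages and mirrors them."""

    def __init__(self, primary_os, secondary_os):
        self.targets = (primary_os, secondary_os)
        self.next_seq = 0

    def handle_event(self, op, key, value=None):
        # Convert the event into a message tagged with a sequence number.
        message = (self.next_seq, op, key, value)
        self.next_seq += 1
        # Provide the identical message to both OS engines' queues.
        for engine in self.targets:
            engine.queue.append(message)


primary, secondary = OSEngine(), OSEngine()
io = IOEngine(primary, secondary)
for event in [("write", "a", 1), ("write", "b", 2), ("delete", "a"), ("write", "b", 3)]:
    io.handle_event(*event)

primary.run()
secondary.run()
print(primary.state == secondary.state)  # prints True
```

Because each engine's state depends only on the message sequence, the two engines may run at different speeds and still reach the same state once they have consumed the same messages.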
32 Claims
1. A method for providing a fault tolerant computer system comprising the steps of:
providing a first processing means for operation of said computer system, said first processing means comprising a first operating system (OS) engine and a first input/output (I/O) engine;
providing a second processing means, said second processing means comprising a second operating system (OS) engine and a second input/output (I/O) engine;
determining a state of said first processing means and providing said state to said second processing means;
defining an operation that can change said state of said first OS engine as an event;
providing a plurality of events to said first I/O engine and converting each of said events into a message;
providing said message to a first message queue in said first OS engine and to a second message queue in said second OS engine;
executing said message in said first OS engine and said second OS engine;
switching said computer system operation to said second processing means upon failure of said first processing means, such that no loss of operation of said computer system occurs during said switchover.
Dependent claims: 2, 3, 4, 5, 6, 7.
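The state-transfer and switchover steps of the method claim can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: `Engine`, `FaultTolerantSystem`, and the integer "messages" are hypothetical, and a failure is simulated by a flag rather than detected.

```python
import copy
from collections import deque


class Engine:
    """Toy OS engine (hypothetical): a running total advanced by queued messages."""

    def __init__(self, state=None):
        self.state = state if state is not None else {"total": 0, "applied": 0}
        self.queue = deque()
        self.alive = True

    def step(self):
        # Execute exactly one queued message, if any.
        if self.queue:
            amount = self.queue.popleft()
            self.state["total"] += amount
            self.state["applied"] += 1


class FaultTolerantSystem:
    def __init__(self):
        self.primary = Engine()
        self.secondary = None
        self.active = self.primary

    def attach_secondary(self):
        # Determine the primary's state and provide it to the secondary,
        # including any messages still pending in the primary's queue.
        self.secondary = Engine(copy.deepcopy(self.primary.state))
        self.secondary.queue = deque(self.primary.queue)

    def submit(self, amount):
        # Mirror each message to the queue of every live engine.
        for engine in (self.primary, self.secondary):
            if engine is not None and engine.alive:
                engine.queue.append(amount)

    def fail_primary(self):
        # Simulated failure: the secondary already holds the same state
        # and the same pending messages, so it simply becomes active.
        self.primary.alive = False
        self.active = self.secondary

    def drain(self):
        for engine in (self.primary, self.secondary):
            while engine is not None and engine.alive and engine.queue:
                engine.step()


system = FaultTolerantSystem()
system.submit(5)
system.primary.step()      # primary applies 5 before the secondary exists
system.attach_secondary()  # secondary starts from the primary's state
system.submit(7)
system.submit(11)
system.primary.step()      # primary applies 7, then fails before applying 11
system.fail_primary()
system.drain()             # secondary applies 7 and 11 from its own queue
```

In this sketch the message that the failed primary never executed is not lost, because it was already sitting in the secondary's queue when the switchover happened.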
8. A fault tolerant computer system comprising:
first processing means for operation of said computer system, said first processing means comprising a first operating system (OS) engine and a first input/output (I/O) engine;
second processing means comprising a second operating system (OS) engine and a second input/output (I/O) engine;
said first I/O engine coupled to said second I/O engine on a first bus;
said first I/O engine including a converting means for converting operations that can change said state of said first OS engine into a message;
said first I/O engine for providing said message to a first message queue in said first OS engine and to a second message queue in said second OS engine;
said first OS engine and said second OS engine including means for executing said message;
means for switching said computer system operation to said second OS engine upon failure of said first processing means such that no loss of operation of said computer system occurs during said switchover.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16.
17. A method for providing a fault tolerant computer system comprising the steps of:
providing a first processing means for operation of said computer system, said first processing means comprising a first operating system (OS) engine and a first input/output (I/O) engine;
providing a second processing means comprising a second operating system (OS) engine and a second input/output (I/O) engine;
determining a state of said first processing means and providing said state to said second processing means;
defining an operation that can change said state of said first OS engine as an event;
providing a plurality of events to said first I/O engine and serializing said events into an event sequence;
providing successive events in said event sequence to said first OS engine and to said second OS engine;
executing said successive events in said first OS engine and said second OS engine;
switching said computer system operation to said second processing means upon failure of said first processing means, such that no loss of operation to said computer system occurs during said switchover.
Dependent claims: 18, 19, 20, 21, 22, 23.
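The serializing step, in which events arriving at unpredictable times are turned into one event sequence delivered identically to both OS engines, can be sketched as below. This is an assumption-laden illustration rather than the claimed method: the `Serializer` class, the use of a mutex to fix the order, the source names "disk", "net", and "timer", and the plain lists standing in for the OS engines' inputs are all hypothetical.

```python
import threading


class Serializer:
    """Hypothetical I/O-engine front end: concurrent events in, one sequence out."""

    def __init__(self, consumers):
        self._lock = threading.Lock()
        self._consumers = consumers  # lists standing in for the two OS engines' inputs
        self._next = 0

    def post(self, source, payload):
        # The lock fixes a single global order; every consumer sees that same order.
        with self._lock:
            event = (self._next, source, payload)
            self._next += 1
            for consumer in self._consumers:
                consumer.append(event)


first_os_input, second_os_input = [], []
serializer = Serializer([first_os_input, second_os_input])


def device(source, count):
    # Each simulated device posts its events from its own thread.
    for i in range(count):
        serializer.post(source, i)


threads = [
    threading.Thread(target=device, args=(name, 50))
    for name in ("disk", "net", "timer")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(first_os_input == second_os_input)  # prints True
```

How the three sources interleave may differ from run to run, but within any one run both inputs hold exactly the same sequence, which is the property the OS engines rely on.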
24. A fault tolerant computer system comprising:
first processing means for operation of said computer system, said first processing means comprising a first operating system (OS) engine and a first input/output (I/O) engine;
second processing means comprising a second operating system (OS) engine and a second input/output (I/O) engine;
said first I/O engine coupled to said second I/O engine on a first bus;
said first I/O engine including a converting means for converting operations that can change said state of said first OS engine into an operation sequence;
said first I/O engine for providing said operations in sequence to said first OS engine and to said second OS engine;
said first OS engine and said second OS engine including means for executing said operations;
means for switching said computer system operation to said second OS engine upon failure of said first processing means such that no loss of operation of said computer system occurs during said switchover.
Dependent claims: 25, 26, 27, 28, 29, 30, 31, 32.
Specification