System and method for memory failure recovery using lockstep processes
First Claim
1. A method for memory failure recovery in a computer, comprising:
maintaining a predetermined number of duplicate and primary processes in the computer;
keeping the processes in synchronization;
managing the processes so that a single process image is presented to an external environment;
detecting a computer exception which affects one of the processes; and
terminating the affected process.
Abstract
A system and method for memory failure recovery are disclosed. The method includes the steps of maintaining a predetermined number of duplicate and primary processes; keeping the processes in synchronization; managing the processes so that a single process image is presented to an external environment; detecting a computer system exception which affects one of the processes; and terminating the affected process. The system includes a primary process memory space which hosts a primary process; a duplicate process memory space which hosts a duplicate process corresponding to the primary process; a synchronization buffer which keeps the duplicate process in synchronization with the primary process; a processor which generates an exception signal in response to detection of a memory failure condition which affects the primary process; and an operating system which receives the exception signal, terminates the affected primary process, and maintains a predetermined number of primary and duplicate processes.
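The method steps in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the patented implementation: the "processes" are plain Python objects rather than operating system processes, the memory failure is simulated with a flag, and the names `Replica`, `LockstepGroup`, and `MemoryFault` are hypothetical.

```python
import copy


class MemoryFault(Exception):
    """Hypothetical stand-in for the exception raised on a memory failure."""


class Replica:
    """One simulated process whose entire state is a running total."""

    def __init__(self):
        self.state = 0
        self.faulty = False  # set to True to simulate a memory failure

    def step(self, value):
        if self.faulty:
            raise MemoryFault("simulated memory failure")
        self.state += value
        return self.state


class LockstepGroup:
    """Keeps a fixed number of replicas and exposes one result per step."""

    def __init__(self, count):
        self.count = count  # the predetermined number of processes
        self.replicas = [Replica() for _ in range(count)]

    def step(self, value):
        survivors = []
        result = None
        for replica in self.replicas:
            try:
                # Feeding every replica the same input keeps them in sync.
                output = replica.step(value)
            except MemoryFault:
                # Terminate the affected replica by dropping it.
                continue
            survivors.append(replica)
            if result is None:
                # Only the first healthy replica's output is exposed,
                # so callers see a single process image.
                result = output
        if not survivors:
            raise RuntimeError("all replicas failed")
        # Restore the predetermined count by cloning a healthy replica.
        while len(survivors) < self.count:
            survivors.append(copy.deepcopy(survivors[0]))
        self.replicas = survivors
        return result


group = LockstepGroup(3)
group.step(5)
group.replicas[0].faulty = True  # inject a fault into the first replica
total = group.step(2)            # the caller still gets one consistent answer
```

In this sketch the caller never learns that a replica failed: the faulty one is dropped, a healthy one supplies the result, and the group is topped back up to its configured size.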
35 Citations
29 Claims
1. A method for memory failure recovery in a computer, comprising:
maintaining a predetermined number of duplicate and primary processes in the computer;
keeping the processes in synchronization;
managing the processes so that a single process image is presented to an external environment;
detecting a computer exception which affects one of the processes; and
terminating the affected process.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A method for memory failure recovery, comprising:
maintaining a predetermined number of duplicate and primary processes;
keeping the processes in synchronization;
managing the processes so that a single process image is presented to an external environment;
detecting a computer system exception which affects one of the processes; and
terminating the affected process;
wherein the maintaining element includes,
identifying a primary process;
monitoring a fault-tolerance value corresponding to the primary process; and
setting a number of duplicate processes equal to the fault-tolerance value; and
wherein the managing element includes,
permitting only one of the processes to perform a system call to an external environment.
Dependent claims: 17, 18
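The two "wherein" elements of claim 16 can be sketched as follows. This is an illustrative model only: `ProcessGroup`, `set_fault_tolerance`, and `write` are hypothetical names, processes are represented by string labels, and the choice to hand every caller the same return value is an assumption made here rather than something the claim states.

```python
class ProcessGroup:
    """Hypothetical model of one primary process and its duplicates."""

    def __init__(self, fault_tolerance):
        self.primary = "primary"
        self.duplicates = []
        self.external_writes = []  # what the external environment observes
        self._next_id = 0
        self.set_fault_tolerance(fault_tolerance)

    def set_fault_tolerance(self, value):
        """Make the number of duplicates equal to the fault-tolerance value."""
        if value < 0:
            raise ValueError("fault-tolerance value must be non-negative")
        while len(self.duplicates) < value:
            self.duplicates.append(f"duplicate-{self._next_id}")
            self._next_id += 1
        del self.duplicates[value:]  # drop any surplus duplicates

    def write(self, caller, data):
        """Simulated system call: only the primary reaches the outside."""
        if caller == self.primary:
            self.external_writes.append(data)
        # Assumption: all callers see the same return value, so the
        # duplicates continue exactly as the primary does.
        return len(data)


group = ProcessGroup(fault_tolerance=2)
for process in [group.primary] + group.duplicates:
    group.write(process, "hello")
# The external environment sees a single write despite three callers.
```

Suppressing the duplicates' calls while still returning a result is one way to keep the external environment from seeing repeated side effects.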
19. A computer-usable medium embodying computer program code for commanding a computer to perform memory failure recovery comprising:
maintaining a predetermined number of duplicate and primary processes in the computer;
keeping the processes in synchronization;
managing the processes so that a single process image is presented to an external environment;
detecting a computer system exception which affects one of the processes; and
terminating the affected process.
Dependent claims: 20, 21, 22, 23, 24
25. A system for memory failure recovery in a computer, comprising:
means for maintaining a predetermined number of duplicate and primary processes in the computer;
means for keeping the processes in synchronization;
means for managing the processes so that a single process image is presented to an external environment;
means for detecting a computer exception which affects one of the processes; and
means for terminating the affected process.
26. A system for memory failure recovery, comprising:
a primary process memory space hosting a primary process;
a duplicate process memory space hosting a duplicate process corresponding to the primary process;
a synchronization buffer for keeping the duplicate process in synchronization with the primary process;
a processor for generating an exception signal in response to detection of a memory failure condition which affects the primary process; and
an operating system for receiving the exception signal, terminating the affected primary process, and maintaining a predetermined number of primary and duplicate processes.
Dependent claims: 27, 28, 29
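One plausible reading of the synchronization buffer in claim 26 is a queue through which the primary passes nondeterministic inputs to the duplicate. The sketch below assumes that reading; the claim itself does not specify the buffer's contents, and `SyncBuffer`, `run_primary`, and `run_duplicate` are hypothetical names.

```python
import random
from collections import deque


class SyncBuffer:
    """Hypothetical FIFO buffer shared by a primary and one duplicate."""

    def __init__(self):
        self._entries = deque()

    def record(self, value):
        self._entries.append(value)

    def replay(self):
        return self._entries.popleft()


def run_primary(buffer, steps, rng):
    """Consume nondeterministic inputs and record each one in the buffer."""
    total = 0
    for _ in range(steps):
        value = rng.randint(1, 100)  # stands in for any nondeterministic input
        buffer.record(value)
        total += value
    return total


def run_duplicate(buffer, steps):
    """Replay the recorded inputs instead of sampling new ones."""
    total = 0
    for _ in range(steps):
        total += buffer.replay()
    return total


buffer = SyncBuffer()
primary_total = run_primary(buffer, 5, random.Random())
duplicate_total = run_duplicate(buffer, 5)
# Both processes end in the same state because they saw the same inputs.
```

Because the duplicate replays the recorded values rather than sampling its own, it reaches the same state as the primary and could take over if the primary were terminated.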
Specification