System and method for memory failure recovery using lockstep processes
Abstract
A system and method for memory failure recovery is disclosed. The method includes the steps of maintaining a predetermined number of duplicate and primary processes; keeping the processes in synchronization; managing the processes so that a single process image is presented to an external environment; detecting a computer system exception which affects one of the processes; and terminating the affected process. The system includes a primary process memory space which hosts a primary process; a duplicate process memory space which hosts a duplicate process corresponding to the primary process; a synchronization buffer which keeps the duplicate process in synchronization with the primary process; a processor which generates an exception signal in response to detection of a memory failure condition which affects the primary process; and an operating system which receives the exception signal, terminates the affected primary process, and maintains a predetermined number of primary and duplicate processes.
25 Claims
1. A method for memory failure recovery, comprising:
maintaining a predetermined number of duplicate and primary processes;
keeping the processes in synchronization;
managing the processes so that a single process image is presented to an external environment;
detecting a computer system exception which affects one of the processes; and
terminating the affected process.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 21.
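The five elements of claim 1 can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the patented implementation: the names `Replica`, `LockstepGroup`, `step`, and `on_exception` are hypothetical, an integer stands in for a process's memory image, and restoring the process count by cloning a healthy survivor is an assumption about how the "predetermined number" might be maintained.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """One process copy; `state` stands in for its memory image (assumption)."""
    name: str
    state: int = 0


class LockstepGroup:
    """Hypothetical manager for one primary process and its duplicates."""

    def __init__(self, copies: int) -> None:
        if copies < 2:
            raise ValueError("need a primary and at least one duplicate")
        self.target = copies  # the predetermined number of processes
        self.replicas = [Replica(f"p{i}") for i in range(copies)]
        self._next_id = copies

    @property
    def primary(self) -> Replica:
        # Convention in this sketch: the first live replica is the primary.
        return self.replicas[0]

    def step(self, delta: int) -> int:
        # Keep the processes in synchronization: all apply the same input.
        for replica in self.replicas:
            replica.state += delta
        # Single process image: only the primary's result leaves the group.
        return self.primary.state

    def on_exception(self, name: str) -> None:
        # Terminate the affected process; the next survivor becomes primary.
        self.replicas = [r for r in self.replicas if r.name != name]
        # Assumption: the count is restored by cloning a healthy survivor.
        while len(self.replicas) < self.target:
            clone = Replica(f"p{self._next_id}", self.primary.state)
            self.replicas.append(clone)
            self._next_id += 1


group = LockstepGroup(copies=3)
group.step(5)
group.on_exception("p0")  # simulated memory failure affecting the primary
```

After the simulated exception, `p1` takes over as primary and a fresh copy is created from its state, so the group again holds three synchronized processes.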
15. A method for memory failure recovery, comprising:
maintaining a predetermined number of duplicate and primary processes;
keeping the processes in synchronization;
managing the processes so that a single process image is presented to an external environment;
detecting a computer system exception which affects one of the processes; and
terminating the affected process;
wherein the maintaining element includes, identifying a primary process;
monitoring a fault-tolerance value corresponding to the primary process; and
setting a number of duplicate processes equal to the fault-tolerance value; and
wherein the managing element includes, permitting only one of the processes to perform a system call to an external environment.
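The final element of claim 15, permitting only one process to perform a system call, can be sketched as a gate that executes the external call once and hands the recorded result to every process. The function name `gated_syscall` and the idea of distributing the result as a list are illustrative assumptions; the claim itself does not say how duplicates obtain the result.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def gated_syscall(n_processes: int, syscall: Callable[[], T]) -> List[T]:
    """Run `syscall` once and give the same result to every process."""
    if n_processes < 1:
        raise ValueError("at least one process is required")
    result = syscall()  # only one process touches the external environment
    # Assumption: duplicates receive the recorded result instead of re-running.
    return [result] * n_processes


calls = []  # records how many times the external environment was reached


def read_clock() -> int:
    """Stand-in for a real system call with an externally visible effect."""
    calls.append(1)
    return 1234


results = gated_syscall(3, read_clock)
```

Because the call runs once, the external environment sees a single process, while all three processes continue from the same value and stay in lockstep.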
16. A data structure for memory failure recovery within a computer system, comprising the fields of:
a primary process field, for identifying primary processes within the computer system; and
a fault-tolerance variable field, for identifying a predetermined number of duplicate processes, corresponding to the primary processes, to be maintained within the computer system.
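One possible reading of the two fields in claim 16 is a per-primary record, shown here as a minimal sketch. The names `RecoveryEntry`, `primary_pid`, `fault_tolerance`, and `duplicates_to_spawn`, the use of integer process IDs, and the sample table values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RecoveryEntry:
    primary_pid: int      # primary process field (assumed to be a process ID)
    fault_tolerance: int  # number of duplicate processes to maintain


def duplicates_to_spawn(entry: RecoveryEntry, live_duplicates: int) -> int:
    """How many duplicates must be created to reach the configured number."""
    return max(0, entry.fault_tolerance - live_duplicates)


# Hypothetical table keyed by primary process ID.
table: Dict[int, RecoveryEntry] = {
    101: RecoveryEntry(primary_pid=101, fault_tolerance=2),
    202: RecoveryEntry(primary_pid=202, fault_tolerance=0),
}
```

A fault-tolerance value of 0 in this sketch means the primary runs without duplicates, so a memory failure affecting it could not be masked.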
17. A computer-usable medium embodying computer program code for commanding a computer to perform memory failure recovery comprising:
maintaining a predetermined number of duplicate and primary processes;
keeping the processes in synchronization;
managing the processes so that a single process image is presented to an external environment;
detecting a computer system exception which affects one of the processes; and
terminating the affected process.
Dependent claims: 18, 19, 20.
22. A system for memory failure recovery, comprising:
means for maintaining a predetermined number of duplicate and primary processes;
means for keeping the processes in synchronization;
means for managing the processes so that a single process image is presented to an external environment;
means for detecting a computer system exception which affects one of the processes; and
means for terminating the affected process.
23. A system for memory failure recovery, comprising:
a primary process memory space hosting a primary process;
a duplicate process memory space hosting a duplicate process corresponding to the primary process;
a synchronization buffer for keeping the duplicate process in synchronization with the primary process;
a processor for generating an exception signal in response to detection of a memory failure condition which affects the primary process; and
an operating system for receiving the exception signal, terminating the affected primary process, and maintaining a predetermined number of primary and duplicate processes.
Dependent claims: 24, 25.
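The synchronization buffer of claim 23 could work in several ways; the claim does not specify its contents. The sketch below assumes one plausible design in which the primary records each input in order and the duplicate replays them, using a hypothetical `SyncBuffer` class and plain lists as stand-ins for the two memory spaces.

```python
from collections import deque
from typing import Callable, Deque


class SyncBuffer:
    """Hypothetical FIFO of inputs the duplicate has not yet applied."""

    def __init__(self) -> None:
        self._pending: Deque[int] = deque()

    def record(self, value: int) -> None:
        # Called on behalf of the primary each time it consumes an input.
        self._pending.append(value)

    def drain(self, apply: Callable[[int], None]) -> int:
        # Replay pending inputs to the duplicate in their original order.
        count = 0
        while self._pending:
            apply(self._pending.popleft())
            count += 1
        return count


primary_state = []    # stand-in for the primary process memory space
duplicate_state = []  # stand-in for the duplicate process memory space
buf = SyncBuffer()

for value in (3, 1, 4):
    primary_state.append(value)
    buf.record(value)

replayed = buf.drain(duplicate_state.append)
```

Once the buffer is drained, the duplicate holds the same state as the primary and could take over if the primary were terminated after an exception signal.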
Specification