Checkpointing mechanism for fault-tolerant systems

  • US 5,235,700 A
  • Filed: 01/16/1991
  • Issued: 08/10/1993
  • Est. Priority Date: 02/08/1990
  • Status: Expired due to Term
First Claim

1. A checkpointing mechanism allowing the working process of an active processor to be resumed by a backup processor when the active processor is failing, said checkpointing mechanism being associated with at least one pair of information processing units comprising a first and a second information processing units (10-1 and 10-2), each information processing unit including a processor (12-1, 12-2) which runs a program stored in a memory (14-1,14-2) attached to the processor through a memory bus (16-1, 16-2) comprising data, address and control lines and which can be set in an active, backup or fail status under control of a configuration controller (22) responsive to failure detecting means (18-1, 18-2) associated to each processor and detecting whether the associated processor is failing or not, the checkpointing mechanism being characterized in that it comprises:

  • first memory change detecting means (28-1, 28-2) associated with at least the information processing unit (12-1, 12-2) whose processor is initially set in the active status by the configuration controller to receive the address and data on the memory bus causing the memory content to be changed and generate memory change records therefrom,
  • first signalling means in said information processing unit whose processor is initially set in the active status by the configuration controller, responsive to a signal provided by said processor at selected points of the program to generate an establish recovery point signal,
  • first storing means (32-1, 32-2) associated with at least the information processing unit whose processor is initially set in the backup status by the communication controller, said first storing means being coupled to said first memory change detecting means to store the memory change records received from said first memory change detecting means,
  • first control means (30-1, 30-2) associated with said first storing means and responsive to the establish recovery point signal received from the first signalling means to cause a separating record to be stored in the first storing means and the memory change records to be read from the first storing means and written in the memory of the information processing unit whose processor is initially set in the backup status, as long as separating records are stored in the first storing means, whereby when set in an active status, the backup processor can resume the working process of the active status processor when its status is switched from the active status to the fail status.
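
The data flow recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the names `BackupUnit`, `ActiveUnit`, `SEPARATOR` and `take_over` are hypothetical, the storing means is modelled as a FIFO queue, and discarding uncommitted records at takeover is an assumption that the claim itself does not state.

```python
from collections import deque

# Hypothetical marker standing in for the claim's "separating record".
SEPARATOR = object()


class BackupUnit:
    """Toy model of the storing means and control means on the backup side."""

    def __init__(self):
        self.memory = {}      # backup memory: address -> data
        self.fifo = deque()   # storing means, modelled here as a FIFO (assumption)
        self.separators = 0   # separating records currently held in the FIFO

    def store_change(self, address, data):
        # Queue a memory change record received from the active unit.
        self.fifo.append((address, data))
        self._drain()

    def establish_recovery_point(self):
        # Store a separating record when the recovery point signal arrives.
        self.fifo.append(SEPARATOR)
        self.separators += 1
        self._drain()

    def _drain(self):
        # Write records into backup memory only as long as a separating
        # record is still stored in the FIFO.
        while self.separators > 0:
            record = self.fifo.popleft()
            if record is SEPARATOR:
                self.separators -= 1
            else:
                address, data = record
                self.memory[address] = data

    def take_over(self):
        # Assumption: records after the last recovery point are discarded,
        # so the backup resumes from the last consistent state.
        self.fifo.clear()
        return dict(self.memory)


class ActiveUnit:
    """Toy model of the active side: change detection and signalling."""

    def __init__(self, backup):
        self.memory = {}
        self.backup = backup

    def write(self, address, data):
        # Every memory write also produces a memory change record.
        self.memory[address] = data
        self.backup.store_change(address, data)

    def recovery_point(self):
        # Issued by the program at selected points.
        self.backup.establish_recovery_point()


backup = BackupUnit()
active = ActiveUnit(backup)
active.write(0x10, 1)
active.write(0x14, 2)
active.recovery_point()   # the two writes above are now committed to the backup
active.write(0x10, 99)    # queued, but not yet covered by a recovery point
resumed = backup.take_over()
```

In this model the backup memory only ever reflects the state at a recovery point, so `resumed` holds the two committed writes and not the later write of 99.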
