Operations controller for a fault tolerant multiple node processing system

  • US 4,933,940 A
  • Filed: 05/12/1989
  • Issued: 06/12/1990
  • Est. Priority Date: 04/15/1987
  • Status: Expired due to Term
First Claim

1. In a multiple node fault tolerant processing system having a plurality of nodes wherein each node has an applications processor for executing a predetermined set of tasks and an operations controller for controlling its own node in coordination with all of the other nodes of said plurality of nodes through the exchange of inter-node messages and wherein said operations controller selects the tasks to be executed by the applications processor from said predetermined set of tasks, each operations controller having a plurality of subsystems including a message checker, a scheduler, a synchronizer and a voter, each of which is capable of detecting errors and generating internal error reports identifying each error detected, each operations controller further having at least two operating system states and operative to switch from one operating system state to another in response to the exclusion of a faulty node or the readmittance of a healthy node which changes the number of nodes operating in the processing system, a fault tolerator for said operations controller comprising:

  • a message memory storing the content of all inter-node messages received by said operations controller;

  • an error file storing the content of said internal error reports generated by said message checker, said scheduler, said synchronizer and said voter;

  • error handler means for storing said error reports in said error file and for generating a base penalty count for each node of said plurality of nodes from the content of said error file, said base penalty count being indicative of the operational status of the associated node, said error handler means further having means for determining which nodes are faulty and for excluding such faulty nodes from participating in the operation of said multiple node processing system, in coordination with all of the other nodes in the system, through the exchange of inter-node messages, said inter-node messages including error messages containing the content of said error file for a particular node and a base penalty count message containing said base penalty count of each node; and

  • interface means for storing all of the messages passed by the message checker in said message memory, for passing the identities of the faulty nodes to the scheduler and the synchronizer, and for passing all error reports to said error handler.
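
As an illustration only, the error-file and base-penalty-count elements of the claim might be sketched in Python as follows. The claim does not say how a penalty count is computed or when a node is deemed faulty, so the per-subsystem weights, the exclusion threshold, and every name here (`ErrorReport`, `FaultTolerator`, `PENALTY_WEIGHTS`, `EXCLUSION_THRESHOLD`) are assumptions made for this sketch. The inter-node message exchange through which the nodes coordinate an exclusion is left out.

```python
from dataclasses import dataclass, field

# Hypothetical weights per reporting subsystem; the claim gives no values.
PENALTY_WEIGHTS = {
    "message_checker": 1,
    "scheduler": 2,
    "synchronizer": 2,
    "voter": 3,
}

# Hypothetical threshold at which a node is treated as faulty.
EXCLUSION_THRESHOLD = 5


@dataclass(frozen=True)
class ErrorReport:
    """An internal error report naming the detecting subsystem and the accused node."""
    subsystem: str
    node_id: int


@dataclass
class FaultTolerator:
    """Local view only: stores error reports and derives base penalty counts."""
    node_ids: list
    error_file: list = field(default_factory=list)

    def record(self, report: ErrorReport) -> None:
        # Store the report in the error file.
        self.error_file.append(report)

    def base_penalty_counts(self) -> dict:
        # One count per node, derived from the content of the error file.
        counts = {node_id: 0 for node_id in self.node_ids}
        for report in self.error_file:
            counts[report.node_id] += PENALTY_WEIGHTS[report.subsystem]
        return counts

    def faulty_nodes(self) -> list:
        # Nodes whose count reaches the assumed threshold are candidates for exclusion.
        counts = self.base_penalty_counts()
        return sorted(n for n, c in counts.items() if c >= EXCLUSION_THRESHOLD)


if __name__ == "__main__":
    ft = FaultTolerator(node_ids=[0, 1, 2])
    ft.record(ErrorReport("voter", 1))
    ft.record(ErrorReport("voter", 1))
    ft.record(ErrorReport("message_checker", 2))
    print(ft.base_penalty_counts())
    print(ft.faulty_nodes())
```

In the claimed system, each node would also broadcast its error file content and base penalty counts to the other nodes, and the exclusion decision would be reached in coordination with them; this sketch models only a single node's bookkeeping.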
