Method of spare capacity use for fault detection in a multiprocessor system
First Claim
1. A method of detecting faults in a multiprocessor system in which processing tasks are executed by plural ones of the processors and the results from each of the processors compared, CHARACTERIZED BY the steps of
ascertaining if one or more of the processors are idle in response to an initiation of a computing task and, if so,
assigning the task to a primary idle one of the processors if an idle processor is available,
assigning the task to a secondary processor if a second idle processor is available,
if a secondary processor is assigned, comparing the task results of the primary and secondary processors,
setting an indication in a disagreement table that the primary and secondary processors disagree if the task results of the primary and secondary processors do not compare, and
periodically analyzing indications in the disagreement table according to a predetermined algorithm to determine faulty ones of the processors.
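The steps of claim 1 can be illustrated with a minimal, synchronous Python sketch. All names here (`DisagreementTable`, `run_task`, `analyze`, and the `execute` callable) are hypothetical and introduced only for illustration. The claim does not specify the "predetermined algorithm", so the threshold count used in `analyze` is an assumption, not the patented analysis.

```python
from collections import Counter


class DisagreementTable:
    """Records which processor pairs produced differing results."""

    def __init__(self):
        self.pairs = set()

    def record(self, a, b):
        # Order does not matter: (a, b) and (b, a) are the same disagreement.
        self.pairs.add(frozenset((a, b)))


def run_task(task, idle, execute, table):
    """Run `task` on a primary and, if one is idle, a secondary processor.

    `idle` is a sequence of idle processor ids and `execute(proc, task)`
    is a hypothetical callable returning that processor's result.
    Returns the primary's result, or None if no processor is idle.
    """
    if not idle:
        return None
    primary = idle[0]
    secondary = idle[1] if len(idle) > 1 else None
    result = execute(primary, task)
    if secondary is not None and execute(secondary, task) != result:
        table.record(primary, secondary)
    return result


def analyze(table, threshold):
    """Assumed heuristic: flag processors named in >= threshold disagreements."""
    counts = Counter(p for pair in table.pairs for p in pair)
    return {p for p, n in counts.items() if n >= threshold}
```

A single disagreement cannot tell which of the two processors is at fault, which is why the sketch only draws conclusions after several pairings have been recorded.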
Abstract
A method of detecting and identifying faulty processors in a multiprocessor system. A processing task is assigned, when possible, to two processors: a primary processor and, if a second idle processor is available, a secondary processor. In one embodiment, the operations of a secondary processor are preempted and that processor is reassigned as primary for another task if no idle processor is available when the task is initiated. In a second embodiment, the operations of a secondary processor are not preempted, and a new task is queued until an idle processor becomes available. If a secondary processor completes a task, its results are compared with the results of the primary processor. Disagreement messages are broadcast to a central controller in a nondistributed embodiment and to all the processors in the system in a distributed embodiment. The disagreement messages are periodically analyzed by the controller or by each processor. In the distributed embodiment, the analysis algorithm is such that each nonfaulty processor identifies the same subset of faulty processors.
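The difference between the two assignment embodiments can be sketched as follows. This is an illustrative model only: the `Scheduler` class, its method names, and the `preempt_secondaries` flag are hypothetical. The abstract does not say what happens in the preemptive embodiment when no secondary is running, so queuing the task in that case is an assumption.

```python
from collections import deque


class Scheduler:
    """Toy model of primary/secondary task assignment."""

    def __init__(self, processors, preempt_secondaries):
        self.idle = deque(processors)
        self.roles = {}          # processor -> (task, "primary" | "secondary")
        self.pending = deque()   # tasks waiting for a processor
        self.preempt = preempt_secondaries

    def submit(self, task):
        """Return (primary, secondary) for the task, or None if it was queued."""
        if self.idle:
            primary = self.idle.popleft()
            secondary = self.idle.popleft() if self.idle else None
        else:
            victim = None
            if self.preempt:
                # First embodiment: take over a processor doing duplicate work.
                victim = next(
                    (p for p, (_, role) in self.roles.items()
                     if role == "secondary"),
                    None,
                )
            if victim is None:
                # Second embodiment (or nothing to preempt): wait in the queue.
                self.pending.append(task)
                return None
            primary, secondary = victim, None
        self.roles[primary] = (task, "primary")
        if secondary is not None:
            self.roles[secondary] = (task, "secondary")
        return primary, secondary

    def release(self, processor):
        """Mark a processor idle, then start the oldest queued task, if any."""
        del self.roles[processor]
        self.idle.append(processor)
        if self.pending:
            return self.submit(self.pending.popleft())
        return None
```

In this model, preemption trades fault-detection coverage (the preempted task loses its comparison) for lower latency on new tasks, while the queuing variant keeps every started comparison at the cost of making new tasks wait.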
38 Citations
21 Claims
Specification