Methods and apparatus using commutative error detection values for fault isolation in multiple node computers

  • US 7,383,490 B2
  • Filed: 04/14/2005
  • Issued: 06/03/2008
  • Est. Priority Date: 04/14/2005
  • Status: Expired due to Fees
First Claim

1. A signal-bearing computer memory medium tangibly embodying a program of machine-readable instructions executable by a digital processing apparatus to perform node fault detection operations in a computing system using commutative error detection values, where the computing system comprises a plurality of nodes, where each of the nodes comprises at least a node processor, a node memory, a network interface and a commutative error detection apparatus;

  • the computing system further comprising a network connecting the plurality of nodes through the network interfaces of the nodes, and wherein node fault detection occurs when the computing system executes at least a portion of an application program at least twice, wherein during each execution of the portion of the application program at least one commutative error detection value is generated and saved to the commutative error detection apparatus associated with at least one node of the plurality when data generated during execution of a reproducible segment of the portion of the application program is injected into the network by the at least one node, the node fault detection operations comprising:

    retrieving the at least one commutative error detection value generated during a first execution of the portion of the application program from the commutative error detection apparatus of the at least one node;

    saving the at least one commutative error detection value associated with the first execution of the portion of the application program to a computer memory medium;

    retrieving the at least one commutative error detection value generated during a second execution of the portion of the application program from the commutative error detection apparatus of the at least one node; and

    comparing the at least one commutative error detection value from the first execution of the portion of the application program to the at least one commutative error detection value from the second execution of the portion of the application program.
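
The retrieve, save, retrieve, and compare operations recited above can be illustrated with a minimal sketch. The claim does not say how a commutative error detection value is computed, so the sketch assumes one simple possibility: the sum, modulo 2^32, of the CRC-32 of every packet a node injects. Addition is commutative, so the value does not depend on injection order. The names `CommutativeChecksum`, `run_segment`, and `find_suspect_nodes` are hypothetical and do not come from the patent.

```python
import zlib

MASK = 0xFFFFFFFF  # keep the running value within 32 bits


class CommutativeChecksum:
    """Hypothetical stand-in for a node's commutative error detection apparatus."""

    def __init__(self):
        self.value = 0

    def inject(self, packet: bytes) -> None:
        # Assumption: sum of per-packet CRC-32 values modulo 2**32.
        # Addition is commutative, so packet order does not affect the result.
        self.value = (self.value + zlib.crc32(packet)) & MASK


def run_segment(packets_by_node):
    """Simulate one execution of a reproducible segment.

    Returns {node_id: checksum value} as it would be retrieved from each node.
    """
    values = {}
    for node_id, packets in packets_by_node.items():
        unit = CommutativeChecksum()
        for packet in packets:
            unit.inject(packet)
        values[node_id] = unit.value
    return values


def find_suspect_nodes(first, second):
    """Compare saved first-run values with second-run values, node by node."""
    return sorted(node for node in first if first[node] != second[node])


# First execution: retrieve the values and save them.
first_run = run_segment({0: [b"a", b"b", b"c"], 1: [b"x", b"y"]})
saved = dict(first_run)

# Second execution: node 0 injects the same packets in a different order,
# node 1 injects one corrupted packet (b"Y" instead of b"y").
second_run = run_segment({0: [b"c", b"a", b"b"], 1: [b"x", b"Y"]})

suspects = find_suspect_nodes(saved, second_run)
print(suspects)
```

In this sketch node 0 is not flagged even though its packets were injected in a different order, while node 1 is flagged because one packet's content changed between executions. A real apparatus would presumably compute the value in hardware at the network interface; the software loop here only models the arithmetic.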
