
Method for detecting and resolving a partition condition in a cluster

  • US 8,239,518 B2
  • Filed: 07/21/2005
  • Issued: 08/07/2012
  • Est. Priority Date: 12/06/2000
  • Status: Expired due to Term
First Claim

1. A computer program product for detecting and resolving a partition condition in a cluster of computers in a networked environment, the computer program product stored on a non-transitory computer-readable medium and comprising:

  • computer-executable instructions for creating a scratch pad area accessible by the cluster of computers;

    computer-executable instructions for dividing the scratch pad into a plurality of slots, each slot associated with one of a plurality of nodes within the cluster of computers, wherein each slot includes at least a heartbeat field indicating that cluster software is loaded on the node and a node state field indicating a current state of the node, wherein the current state identifies the node as being dead, alive, or preparing to shut down;

    computer-executable instructions for recording in the plurality of slots, a generation number and a list of known nodes by each one of the plurality of nodes, wherein an identifier is written in the list for each node that is known to a writing node and wherein the generation number and the list of known nodes is recorded when a change of membership occurs in the cluster of computers;

    computer-executable instructions for comparing each slot of the plurality of slots to ensure the generation number and the list of known nodes matches in each slot of the plurality of slots;

    computer-executable instructions for resolving the partition condition by creating a list of surviving nodes and re-allocating appropriate resources to each of the surviving nodes, computer-executable instructions requiring each node not on the list of surviving nodes to re-register with the cluster of computers; and

    wherein the computer-executable instructions for comparing each slot include computer-executable instructions for finding a list with a master node to create the list of surviving nodes and shutting down each node not on the list with the master node.
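The elements of the claim can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the names `Slot`, `NodeState`, `create_scratch_pad`, `record_membership`, `is_partitioned`, and `resolve_partition` are hypothetical, the scratch pad is modeled as a plain dictionary rather than shared storage, and the sketch assumes the "list with a master node" is the list recorded in the master's own slot. Resource re-allocation is left out.

```python
from dataclasses import dataclass
from enum import Enum


class NodeState(Enum):
    DEAD = "dead"
    ALIVE = "alive"
    LEAVING = "preparing to shut down"


@dataclass
class Slot:
    """One slot of the scratch pad, owned by a single node (hypothetical layout)."""
    heartbeat: int = 0                    # incremented while cluster software is loaded
    state: NodeState = NodeState.DEAD     # node state field
    generation: int = 0                   # generation number of the membership view
    known_nodes: frozenset = frozenset()  # identifiers of nodes known to the writer


def create_scratch_pad(node_ids):
    """Create the scratch pad and divide it into one slot per node."""
    return {node_id: Slot() for node_id in node_ids}


def record_membership(pad, writer, generation, known_nodes):
    """Called by `writer` when cluster membership changes."""
    slot = pad[writer]
    slot.heartbeat += 1
    slot.state = NodeState.ALIVE
    slot.generation = generation
    slot.known_nodes = frozenset(known_nodes)


def is_partitioned(pad):
    """Compare the slots of live nodes: any mismatch in generation number
    or known-node list is treated as a partition condition."""
    views = {
        (slot.generation, slot.known_nodes)
        for slot in pad.values()
        if slot.state is NodeState.ALIVE
    }
    return len(views) > 1


def resolve_partition(pad, master):
    """Build the survivor list from the master's slot (an assumption of this
    sketch) and shut down every other node, which must then re-register."""
    survivors = pad[master].known_nodes
    must_reregister = []
    for node_id, slot in pad.items():
        if node_id not in survivors and slot.state is not NodeState.DEAD:
            slot.state = NodeState.DEAD   # stands in for shutting the node down
            must_reregister.append(node_id)
    return sorted(survivors), sorted(must_reregister)
```

In a four-node cluster where nodes `a` and `b` record the list `{a, b}` while `c` and `d` record `{c, d}`, `is_partitioned` reports a mismatch, and resolving with `a` as master keeps `a` and `b` and marks `c` and `d` for re-registration.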

  • 12 Assignments