Inter-node communication scheme for sharing node operating status

  • US 9,553,789 B2
  • Filed: 06/25/2014
  • Issued: 01/24/2017
  • Est. Priority Date: 12/03/2010
  • Status: Active Grant
First Claim

1. A method for determining node operating status among a cluster of nodes of a computer system, the method comprising:

    first transmitting gossip messages directly between node pairs in the cluster of nodes, wherein the gossip messages contain an indication of operating status of other nodes in the cluster of nodes, wherein the other nodes are nodes other than the nodes in the node pairs;

    receiving the gossip messages at individual nodes of the node pairs;

    responsive to the receiving the gossip messages at the individual nodes, at the other nodes, locally updating a local database of operating status according to the received gossip messages, wherein the updating sets a value of a local operating status kept by the individual nodes for a particular one of the other nodes to a non-operational status if the receiving by the individual nodes has not received a gossip message from the particular one of the other nodes during a predetermined time period;

    responsive to the locally updating setting the local operating status of the particular one of the other nodes to a non-operational status, second transmitting a node down message separate from the gossip messages that indicates the non-operational status of the particular node to the other nodes in the cluster;

    at a first node other than the particular node, receiving the node down message;

    responsive to receiving the node down message, determining whether or not the first node has received a gossip message from the particular node during the predetermined time period;

    responsive to determining that the first node has received the gossip message from the particular node during the predetermined time period, transmitting a node alive message from the first node indicating that the status of the particular node is operational and setting the local operating status of the particular node at the first node to an operational status; and

    repeating the first transmitting, receiving, updating and second transmitting at each of the nodes in the node pairs, so that the local status kept by each of the nodes reflects the status of each of the other nodes in the cluster, wherein the first transmitting selectively transmits gossip messages containing an indication of operating status of other nodes depending on whether the operating status of the other nodes in the local database is set to non-operational, whereby the first transmitting halts gossip messaging for nodes marked non-operational.
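The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `Node`, `TIMEOUT`, `outbox`, and every method name are hypothetical, the five-second timeout is an arbitrary stand-in for the "predetermined time period", and network transport is replaced by an in-memory `outbox` list. Where the claim is silent (how a receiver treats peers it first learns about through gossip, what a receiver does with a node down message it cannot refute, and how a node alive message is handled on receipt), the sketch makes assumptions that are marked in the comments.

```python
from dataclasses import dataclass, field

UP, DOWN = "operational", "non-operational"
TIMEOUT = 5.0  # stand-in for the "predetermined time period" (assumed value, seconds)


@dataclass
class Node:
    name: str
    status: dict = field(default_factory=dict)      # local database: peer -> UP/DOWN
    last_heard: dict = field(default_factory=dict)  # peer -> time of last gossip from it
    outbox: list = field(default_factory=list)      # (kind, subject) messages to broadcast

    def add_peer(self, peer, now):
        self.status[peer] = UP
        self.last_heard[peer] = now

    def build_gossip(self, to):
        # Selective gossip: report only third-party nodes (not sender or
        # recipient) and omit any node locally marked non-operational.
        return {p: s for p, s in self.status.items() if s == UP and p != to}

    def receive_gossip(self, sender, payload, now):
        # Hearing from the sender directly is evidence that it is alive.
        self.last_heard[sender] = now
        self.status[sender] = UP
        # Assumption: the payload only introduces peers not yet known locally.
        for peer, state in payload.items():
            if peer != self.name and peer not in self.status:
                self.status[peer] = state
                self.last_heard[peer] = now

    def check_timeouts(self, now):
        # Mark silent peers non-operational and queue a separate node down message.
        for peer, heard in self.last_heard.items():
            if self.status[peer] == UP and now - heard > TIMEOUT:
                self.status[peer] = DOWN
                self.outbox.append(("node_down", peer))

    def receive_node_down(self, subject, now):
        if subject == self.name:
            return  # the claim handles this at a node other than the subject
        heard = self.last_heard.get(subject)
        if heard is not None and now - heard <= TIMEOUT:
            # Recent gossip contradicts the report: answer with node alive.
            self.status[subject] = UP
            self.outbox.append(("node_alive", subject))
        else:
            self.status[subject] = DOWN  # assumption: adopt the reporter's view

    def receive_node_alive(self, subject):
        if subject != self.name:
            self.status[subject] = UP  # assumption: trust the correction
```

In this sketch a caller would invoke `check_timeouts` periodically and deliver each queued `outbox` entry to the other nodes. Passing `now` explicitly, rather than reading a clock inside the methods, keeps the behavior deterministic and easy to step through by hand.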

  • 2 Assignments