Distributed database management system with node failure detection
First Claim
1. A method for processing information obtained by a node failure detection system, the node failure detection system included in a distributed database, the distributed database comprising a plurality of nodes, the plurality of nodes comprising a leader node and a plurality of informer nodes, the method comprising:
at each informer node in the plurality of informer nodes:
transmitting a ping message to each other node in the plurality of nodes;
monitoring responses to the ping message from each other node in the plurality of nodes; and
responding to an invalid response from a responding node in the plurality of nodes by designating the responding node as a suspicious node; and
transmitting a message to the leader node, the message comprising an identification of the informer node and the suspicious node; and
at the leader node:
receiving the message comprising the identification of the informer node and the suspicious node;
determining a number of the plurality of informer nodes that received invalid responses from the suspicious node;
sending an acknowledgement message to the plurality of informer nodes if the number is fewer than a majority of the plurality of informer nodes; and
designating the suspicious node as failed if the majority of the plurality of informer nodes identify the suspicious node in a message or the majority of the plurality of informer nodes identify the suspicious node in response to the acknowledgement message.
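The leader-side steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the class name `Leader`, the string return values, and the reading of "majority" as strictly more than half of the informer nodes are all assumptions made for illustration. Replies to the acknowledgement message are modeled as further calls to `receive`.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set


@dataclass
class Leader:
    """Hypothetical leader-side tally; names are illustrative, not from the claim."""

    informers: FrozenSet[str]
    # suspicious node id -> ids of the informers that reported it
    reports: Dict[str, Set[str]] = field(default_factory=dict)

    def majority(self) -> int:
        # ASSUMPTION: "majority" means strictly more than half of the informers.
        return len(self.informers) // 2 + 1

    def receive(self, informer: str, suspicious: str) -> str:
        """Record one report and return the leader's next action."""
        if informer not in self.informers:
            raise ValueError(f"unknown informer: {informer}")
        reporters = self.reports.setdefault(suspicious, set())
        reporters.add(informer)  # a set, so repeat reports are counted once
        if len(reporters) >= self.majority():
            return "failed"
        # Fewer than a majority so far: acknowledge and wait for more reports.
        return "acknowledge"
```

With five informers, the third distinct report against the same node crosses the assumed majority threshold; repeated reports from one informer do not.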
Abstract
A node failure detector for use in a distributed database that is accessed through a plurality of interconnected transactional and archival nodes. Each node is selected as an informer node that tests communications with each other node. Each informer node generates a list of suspicious nodes, and that list is resident in one node designated as a leader node. The leader node analyzes the data from all of the informer nodes to identify each node that should be removed with appropriate failover procedures.
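The informer-side behavior described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the function name `find_suspicious` and the injected `ping` callable are hypothetical, and an "invalid response" is modeled as either a falsy return value or an `OSError` (which includes timeouts).

```python
from typing import Callable, Iterable, List, Tuple


def find_suspicious(
    self_id: str,
    nodes: Iterable[str],
    ping: Callable[[str], bool],
) -> List[Tuple[str, str]]:
    """Ping every other node; return (informer, suspicious) pairs for the leader."""
    reports = []
    for node in nodes:
        if node == self_id:
            continue  # an informer does not test itself
        try:
            ok = ping(node)  # hypothetical transport call supplied by the caller
        except OSError:
            ok = False  # ASSUMPTION: a transport error counts as an invalid response
        if not ok:
            reports.append((self_id, node))
    return reports
```

Each returned pair carries the informer's own identification together with the suspicious node, which is the information the leader needs to tally reports per suspicious node.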
96 Citations
11 Claims
8. A method for choosing a node in a distributed database to fail, the distributed database comprising a plurality of nodes, the plurality of nodes comprising a leader node and a plurality of informer nodes, the method comprising:
selecting a first informer node from the plurality of informer nodes;
designating, by the first informer node, a first node in the plurality of nodes as a first suspicious node in response to an invalid response from the first node;
determining if the first suspicious node is suspicious to only the first informer node;
if the first suspicious node is suspicious to only the first informer node, designating the first suspicious node or the first informer node as disabled based on a higher node identification; and
if the first suspicious node is suspicious to at least one other informer node in the plurality of informer nodes, designating all suspicious nodes identified by the first informer node as failed.
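The decision in claim 8 can be sketched as a small function. The claim says only that the choice is "based on a higher node identification"; it does not say which of the two nodes is disabled. The sketch below assumes the node with the higher identifier is the one disabled, and that assumption may not match the specification. The function name `resolve`, the integer node identifiers, and the `suspicions` mapping are likewise hypothetical.

```python
from typing import Dict, Set


def resolve(informer: int, suspect: int, suspicions: Dict[int, Set[int]]) -> Dict[int, str]:
    """Return a mapping of node id -> designation for one informer's suspect.

    `suspicions` maps each informer id to the set of node ids it finds suspicious.
    """
    corroborated = any(
        other != informer and suspect in suspected
        for other, suspected in suspicions.items()
    )
    if not corroborated:
        # Only this informer suspects the node, so one of the pair is disabled.
        # ASSUMPTION: the node with the higher identifier is the one disabled.
        return {max(informer, suspect): "disabled"}
    # At least one other informer agrees: every node this informer suspects fails.
    return {node: "failed" for node in suspicions.get(informer, set())}
```

For example, if informer 1 alone suspects node 4, the sketch disables node 4; if informer 2 also suspects node 4, every node on informer 1's list is designated failed.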
Specification