Systems and methods for providing a quiescing protocol
First Claim
1. A distributed system configured to quiesce a set of messages, the distributed system comprising:
a plurality of nodes, each node comprising one or more physical processors;
a first subset of two or more of the plurality of nodes, each node of the first subset further comprising a participant process;
a second subset of one or more of the plurality of nodes, each node of the second subset further comprising a coordinator process; and
a set of messages sent and received by the plurality of nodes, the set of messages comprising:
a relevant message which changes a state of the distributed system;
a probe message which requests a probe-response message;
the probe-response message which indicates that the sender has processed all received relevant messages from the recipient;
a checkpoint message which indicates that the sender has received a probe-response message from each of the plurality of nodes;
a continue message requesting a continue-response message; and
the continue-response message which indicates whether the sender has received a relevant message from one or more of the plurality of nodes;
wherein each participant process is configured to, when executed by at least one node of the first subset:
suspend generation of relevant messages;
maintain received-message information which indicates whether a relevant message has been received from the plurality of nodes;
send probe messages to each of the plurality of nodes;
receive probe-response messages from each of the plurality of nodes;
receive probe messages from each of the plurality of nodes; and
for each probe message received, send the probe-response message to the node which sent the probe message; and
wherein each coordinator process is configured to, when executed by at least one node of the second subset:
receive checkpoint messages from each of the plurality of nodes;
send continue messages to each of the plurality of nodes;
receive continue-response messages from each of the plurality of nodes; and
based on the received continue-response messages, determine whether the distributed system has been quiesced.
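The message vocabulary and participant behavior recited in claim 1 can be sketched roughly as follows. This is a minimal single-process illustration, not the patented implementation. The names MsgType, Message, Participant, and the in-memory outbox list are assumptions introduced here. The sketch also assumes that a relevant message counts as processed the moment it is received, that probes go only to the other nodes (peers), and that the received-message flag is cleared once it has been reported; none of these details is stated in the claim.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class MsgType(Enum):
    RELEVANT = auto()           # changes a state of the distributed system
    PROBE = auto()              # requests a probe-response
    PROBE_RESPONSE = auto()     # sender has processed all relevant messages from recipient
    CHECKPOINT = auto()         # sender has a probe-response from every probed node
    CONTINUE = auto()           # requests a continue-response
    CONTINUE_RESPONSE = auto()  # reports whether a relevant message was received


@dataclass(frozen=True)
class Message:
    kind: MsgType
    sender: str
    recipient: str
    received_relevant: bool = False  # meaningful only for CONTINUE_RESPONSE


@dataclass
class Participant:
    node_id: str
    peers: list        # ids of the other nodes (hypothetical representation)
    coordinator: str   # id of the coordinator node (hypothetical)
    suspended: bool = False
    received_relevant: bool = False
    pending: set = field(default_factory=set)   # peers we still await a probe-response from
    outbox: list = field(default_factory=list)  # stand-in for the network

    def start_quiesce(self):
        # Suspend generation of relevant messages, then probe every peer.
        self.suspended = True
        self.pending = set(self.peers)
        for peer in self.peers:
            self.outbox.append(Message(MsgType.PROBE, self.node_id, peer))

    def handle(self, msg):
        if msg.kind is MsgType.RELEVANT:
            # Maintain received-message information.
            self.received_relevant = True
        elif msg.kind is MsgType.PROBE:
            # Processing is synchronous in this sketch, so we can answer at once.
            self.outbox.append(
                Message(MsgType.PROBE_RESPONSE, self.node_id, msg.sender))
        elif msg.kind is MsgType.PROBE_RESPONSE:
            if msg.sender in self.pending:
                self.pending.remove(msg.sender)
                if not self.pending:
                    # Every probed peer has answered: tell the coordinator.
                    self.outbox.append(
                        Message(MsgType.CHECKPOINT, self.node_id, self.coordinator))
        elif msg.kind is MsgType.CONTINUE:
            self.outbox.append(
                Message(MsgType.CONTINUE_RESPONSE, self.node_id, msg.sender,
                        self.received_relevant))
            self.received_relevant = False  # assumption: reset after reporting
```

In a real system the outbox would be replaced by a transport, and a probe-response would be deferred until queued relevant messages from the probing node had actually been processed.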
Abstract
The systems and methods of the present invention provide a quiescing protocol. In one embodiment, nodes of a system utilize the protocol to complete processing until they reach a consistent state. In one embodiment, a coordinator initiates the quiescing process, and the nodes communicate with each other to determine whether their messages have been processed and communicate with the coordinator to determine when all of the messages have been processed.
18 Claims
7. A distributed system configured to quiesce a set of messages, the distributed system comprising:
a plurality of nodes, each node comprising at least one physical processor; and
one or more executable coordinator processes, each coordinator process configured to, when executed by one or more of the plurality of nodes:
receive one or more first messages from one or more of the plurality of nodes, each first message indicating that the node, which has sent that first message, has sent a second message to each of the plurality of nodes and has received a third message from each of the plurality of nodes, wherein the second message is a message requesting the third message, and wherein the third message indicates that all messages that change a state of the distributed system previously received by the node, which has received that second message, from the node, which has sent that second message, have been processed;
in response to receiving the one or more first messages, send fourth messages to the plurality of nodes, wherein each fourth message is a message requesting a fifth message;
receive one or more fifth messages from one or more of the plurality of nodes in response to the fourth messages, each fifth message indicating whether the node, which has sent that fifth message, has received a message that changes a state of the distributed system; and
based on one or more received fifth messages, determine whether the distributed system has been quiesced by determining whether any of the plurality of nodes have received any new messages that change a state of the distributed system; and
when it is determined that any of the plurality of nodes received a new message that changes a state of the distributed system, sending and receiving additional messages until it is determined from one or more received messages that none of the plurality of nodes have received any new messages that change a state of the distributed system.
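The coordinator's repeat-until-quiet decision in claim 7 can be sketched as a loop over rounds of replies. This is a minimal sketch under stated assumptions: run_round is a hypothetical callback standing in for one full exchange (collecting the first messages, sending the fourth messages, gathering the fifth messages) and returns, for each node, whether that node reported a new state-changing message. The max_rounds bound is a safeguard added here, not part of the claim.

```python
from typing import Callable, Dict


def quiesce(run_round: Callable[[], Dict[str, bool]], max_rounds: int = 100) -> int:
    """Repeat rounds until no node reports a new state-changing message.

    Returns the number of rounds that were needed. `run_round` is a
    hypothetical stand-in for one checkpoint/continue exchange with every node.
    """
    for round_no in range(1, max_rounds + 1):
        replies = run_round()
        # Quiesced only when every node reports "nothing new received".
        if not any(replies.values()):
            return round_no
    raise RuntimeError("not quiesced within max_rounds")


# Example: node "b" still sees in-flight traffic during the first round only.
scripted = iter([
    {"a": False, "b": True},
    {"a": False, "b": False},
])
rounds_needed = quiesce(lambda: next(scripted))
```

A single positive reply forces another round because a state-changing message received during quiescing may itself have triggered further messages that are still in flight.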
14. A distributed system configured to quiesce a set of messages, the distributed system comprising:
a plurality of nodes, each node comprising at least one physical processor and at least one executable software module;
wherein the at least one executable software module of each of the plurality of nodes is configured to, when executed by the at least one physical processor:
suspend generation of new messages that change a state of the distributed system;
maintain received-message information which indicates whether a message that changes a state of the distributed system has been received from the plurality of nodes;
send first messages to the plurality of nodes, each first message requesting a response;
receive one or more second messages from one or more of the plurality of nodes, each second message indicating that all messages which change a state of the distributed system sent by the node, which received the second message, to the node, which sent the second message, have been processed;
receive one or more third messages from one or more of the plurality of nodes, each third message requesting a response;
for each third message received, send a fourth message to the node, which sent the third message, each fourth message indicating that all messages which change a state of the distributed system sent by the node, which sent the third message, to the node, which received the third message, have been processed;
determine whether the distributed system has been quiesced at least in part by determining whether any of the plurality of nodes have received any new messages that change a state of the distributed system; and
when it is determined that any of the plurality of nodes received a new message that changes a state of the distributed system, sending and receiving additional messages until it is determined that none of the plurality of nodes have received any new messages that change a state of the distributed system.
Specification