SYSTEMS AND METHODS FOR PROVIDING A QUIESCING PROTOCOL
First Claim
Patent Images
1. A distributed system configured to quiesce a set of messages, the distributed system comprising:
- a plurality of nodes, each node comprising one or more processors;
a first subset of two or more of the plurality of nodes, each node of the first subset further comprising a participant process;
a second subset of one or more of the plurality of nodes, each node of the second subset further comprising a coordinator process; and
a set of messages sent and received by the plurality of nodes, the set of messages comprising;
a relevant message which changes a state of the distributed system;
a probe message which requests a probe-response message;
the probe-response message which indicates that the sender has processed all received relevant messages from the recipient;
a checkpoint message which indicates that the sender has received a probe-response message from each of the plurality of nodes;
a continue message requesting a continue-response message; and
the continue-response message which indicates whether the sender has received a relevant message from one or more of the plurality of nodes;
wherein each participant process is configured to, when executed;
suspend generation of relevant messages;
maintain received-message information which indicates whether a relevant message has been received from the plurality of nodes;
send probe messages to each of the plurality of nodes;
receive probe-response messages from each of the plurality of nodes;
receive probe messages from each of the plurality of nodes; and
for each probe message received, send the probe-response message to the node which sent the probe message; and
wherein each coordinator process is configured to, when executed;
receive checkpoint messages from each of the plurality of nodes;
send continue messages to each of the plurality of nodes;
receive continue-response messages from each of the plurality of nodes; and
based on the received continue-response messages, determine whether the distributed system has been quiesced.
12 Assignments
0 Petitions
Accused Products
Abstract
The systems and methods of the present invention provide a quiescing protocol. In one embodiment, nodes of a system utilize the protocol to complete processing until they reach a consistent state. In one embodiment, a coordinator initiates the quiescing process and the nodes communicate with each other to determine whether their messages have been processed and communicate with the coordinator to determine when all of the messages have been processed.
150 Citations
22 Claims
-
1. A distributed system configured to quiesce a set of messages, the distributed system comprising:
-
a plurality of nodes, each node comprising one or more processors; a first subset of two or more of the plurality of nodes, each node of the first subset further comprising a participant process; a second subset of one or more of the plurality of nodes, each node of the second subset further comprising a coordinator process; and a set of messages sent and received by the plurality of nodes, the set of messages comprising; a relevant message which changes a state of the distributed system; a probe message which requests a probe-response message; the probe-response message which indicates that the sender has processed all received relevant messages from the recipient; a checkpoint message which indicates that the sender has received a probe-response message from each of the plurality of nodes; a continue message requesting a continue-response message; and the continue-response message which indicates whether the sender has received a relevant message from one or more of the plurality of nodes; wherein each participant process is configured to, when executed; suspend generation of relevant messages; maintain received-message information which indicates whether a relevant message has been received from the plurality of nodes; send probe messages to each of the plurality of nodes; receive probe-response messages from each of the plurality of nodes; receive probe messages from each of the plurality of nodes; and for each probe message received, send the probe-response message to the node which sent the probe message; and wherein each coordinator process is configured to, when executed; receive checkpoint messages from each of the plurality of nodes; send continue messages to each of the plurality of nodes; receive continue-response messages from each of the plurality of nodes; and based on the received continue-response messages, determine whether the distributed system has been quiesced. - View Dependent Claims (2, 3, 4, 5, 6)
-
-
7. A distributed system configured to quiesce a set of messages, the distributed system comprising:
-
a plurality of nodes, each node comprising a processor; and one or more executable coordinator processes, each coordinator process configured to, when executed; receive one or more first messages from one or more of the plurality of nodes, each first message indicating that the node, which has sent that first message, has sent a second message to each of the plurality of nodes and has received a third message from each of the plurality of nodes, wherein the second message is a message requesting the third message, and wherein the third message indicates that all messages that change a state of the distributed system previously received by the node, which has received that second message, from the node, which has sent that second message, have been processed; in response to receiving the one or more first messages, send fourth messages to the plurality of nodes, wherein each fourth message is a message requesting a fifth message; receive one or more fifth messages from one or more of the plurality of nodes in response to the fourth messages, each fifth message indicating whether the node, which has sent that fifth message, has received a message that changes a state of the distributed system; and based on one or more received fifth messages, determine whether the distributed system has been quiesced. - View Dependent Claims (8, 9, 10, 11, 12, 13, 14)
-
-
15. A distributed system configured to quiesce a set of messages, the distributed system comprising:
-
a plurality of nodes, each node comprising a processor and at least one executable software module; wherein the at least one executable software module of each of the plurality of nodes is configured to, when executed; suspend generation of new messages that change a state of the distributed system; maintain received-message information which indicates whether a message that changes a state of the distributed system has been received from the plurality of nodes; send first messages to the plurality of nodes, each first message requesting a response; receive one or more second messages from one or more of the plurality of nodes, each second message indicating that all messages which change a state of the distributed system sent by the node, which received the second message, to the node, which sent the second message, have been processed; receive one or more third messages from one or more of the plurality of nodes, each third message requesting a response; and for each third message received, send a fourth message to the node, which sent the third message, each fourth message indicating that all messages which change a state of the distributed system sent by the node, which sent the third message, to the node, which received the third message, have been processed. - View Dependent Claims (16, 17, 18, 19)
-
-
20. A distributed system comprising:
-
a plurality of three or more nodes, each node comprising a processor and configured to act as a coordinator node and a participant node in a quiescing protocol; wherein, during operation of the quiescing protocol, the plurality of three or more nodes comprises at least one coordinator node and a plurality of two or more participant nodes; wherein the at least one coordinator node is configured to send and receive messages to and from each of the plurality of participant nodes to collect information about messages which the plurality of participant nodes have received; and wherein each participant node is configured to; send and receive messages to and from the plurality of participant nodes to determine whether messages which the participant sent to the plurality of participant nodes and which change a state of the distributed system have been processed; and send a message to the at least one coordinator node to communicate whether a message that changes a state of the distributed system has been received from one or more of the plurality of participant nodes during operation of the quiescing protocol. - View Dependent Claims (21, 22)
-
Specification