Scalable transport with client-consensus rendezvous
Abstract
Embodiments disclosed herein provide advantageous methods and systems that use multicast communications via unreliable datagrams sent on a protected traffic class. These methods and systems provide effectively reliable multicast delivery while avoiding the overhead associated with point-to-point protocols. Rather than an exponential scaling of point-to-point connections (with expensive setup and teardown of the connections), the traffic from one server is bounded by linear scaling of multicast groups. In addition, the multicast rendezvous disclosed herein creates an edge-managed flow control that accounts for the dynamic state of the storage servers in the cluster, without needing centralized control, management or maintenance of state. This traffic shaping avoids the loss of data due to congestion during sustained oversubscription. Other embodiments, aspects and features are also disclosed.
21 Claims
1. A method of putting a chunk of payload data in a cluster of storage servers using unreliable datagrams, the method comprising:
- performing a cryptographic hash of the chunk to generate a content hash identifier for the chunk;
- selecting a negotiating group for the chunk by mapping the content hash identifier to a distributed hash allocation table;
- multicasting a put proposal from an initiating client to the storage servers in the cluster that are in the negotiating group for the chunk;
- in response to the put proposal, unicasting a put accept response from each of the storage servers in the negotiating group to the initiating client;
- evaluating the put accept responses by the initiating client to determine the storage servers in the negotiating group that are members of a rendezvous group; and
- multicasting the payload data of the chunk from the initiating client to the storage servers that are members of the rendezvous group so as to perform a rendezvous transfer.
Dependent claims: 2, 3, 4, 18.
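The client-side steps of claim 1 can be illustrated with a small sketch that simulates the decision logic without any networking. It is not taken from the patent: SHA-256 as the hash, the row-per-group table layout, the `earliest_start` field in the put accept response, and the "earliest servers win" selection rule are all assumptions made for illustration, and every function and class name is hypothetical.

```python
import hashlib
from dataclasses import dataclass


def content_hash_id(chunk: bytes) -> str:
    """Cryptographic hash of the chunk payload (SHA-256 is an assumption)."""
    return hashlib.sha256(chunk).hexdigest()


def negotiating_group(chid: str, table: list[list[str]]) -> list[str]:
    """Map the content hash identifier to one row of a hash allocation table.

    Assumed layout: each row lists the servers of one negotiating group,
    and the row index comes from the leading 32 bits of the hash.
    """
    row = int(chid[:8], 16) % len(table)
    return table[row]


@dataclass
class PutAccept:
    """Unicast reply from one storage server (fields are hypothetical)."""
    server: str
    earliest_start: float  # when this server could take the transfer


def choose_rendezvous_group(accepts: list[PutAccept], replicas: int) -> list[str]:
    """Client-side evaluation: keep the servers that can start soonest."""
    ranked = sorted(accepts, key=lambda a: (a.earliest_start, a.server))
    return [a.server for a in ranked[:replicas]]
```

In this sketch the client would multicast the put proposal to the servers returned by `negotiating_group`, collect one `PutAccept` per server, and then multicast the payload only to the servers returned by `choose_rendezvous_group`.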
5. A method of getting a chunk of payload data from a cluster of storage servers using unreliable datagrams, the method comprising:
- performing a cryptographic hash of the chunk to generate a content hash identifier for the chunk;
- selecting a negotiating group for the chunk by mapping the content hash identifier to a distributed hash allocation table;
- multicasting a get request from an initiating client to the storage servers in the cluster that are in the negotiating group for the chunk;
- in response to the get request, unicasting a get response from each of the storage servers in the negotiating group to the initiating client;
- evaluating the get responses by the initiating client to determine a designated storage server that is to perform a rendezvous transfer to a rendezvous group;
- multicasting a get accept from the initiating client to the negotiating group, wherein the get accept indicates the designated storage server; and
- performing the rendezvous transfer by sending the payload data of the chunk from the designated storage server to the rendezvous group.
Dependent claims: 6, 7, 19, 20, 21.
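The get flow of claim 5 differs from the put flow in that the client picks a single sender rather than a set of receivers. A minimal sketch of that evaluation step follows. The claim does not say what a get response contains or how the client ranks responses, so the `has_chunk` and `earliest_start` fields and the "soonest holder wins" rule are assumptions, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GetResponse:
    """Unicast reply from one storage server (fields are hypothetical)."""
    server: str
    has_chunk: bool        # whether this server holds the requested chunk
    earliest_start: float  # when this server could begin sending


def choose_designated_server(responses: list[GetResponse]) -> Optional[str]:
    """Client-side evaluation: pick the holder that can start soonest."""
    candidates = [r for r in responses if r.has_chunk]
    if not candidates:
        return None  # no server in the negotiating group offered the chunk
    best = min(candidates, key=lambda r: (r.earliest_start, r.server))
    return best.server


def servers_that_transmit(group: list[str], designated: Optional[str]) -> list[str]:
    """Every server in the negotiating group sees the multicast get accept,
    but only the one it names goes on to send the payload."""
    return [s for s in group if s == designated]
```

Because the get accept is multicast to the whole negotiating group, the servers that were not chosen also learn the outcome and, in this sketch, simply stay silent.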
8. A system that stores a chunk of payload data using unreliable datagrams, the system comprising:
- a cluster of storage servers;
- a network of non-blocking switches communicatively interconnecting the storage servers of the cluster; and
- an initiating client comprising a client system that communicatively interconnects to the network,
  wherein the initiating client multicasts a put proposal to the storage servers of the cluster that are in a negotiating group for the chunk, the negotiating group being selected by mapping a content hash identifier to a distributed hash allocation table,
  wherein the storage servers in the negotiating group unicast put accept responses to the initiating client,
  wherein the initiating client evaluates the put accept responses to determine the storage servers in the negotiating group that are members of a rendezvous group, and
  wherein the initiating client performs a rendezvous transfer by multicasting the payload data of the chunk to the storage servers that are members of the rendezvous group.
Dependent claims: 9, 10, 11, 17.
12. A system that retrieves a chunk of payload data using unreliable datagrams, the system comprising:
- a cluster of storage servers;
- a network of non-blocking switches communicatively interconnecting the storage servers of the cluster; and
- an initiating client comprising a client system that communicatively interconnects to the network,
  wherein the initiating client multicasts a get request to the storage servers in the cluster that are in a negotiating group for the chunk, the negotiating group being selected by mapping a content hash identifier to a distributed hash allocation table,
  wherein, in response to the get request, a get response is unicast from each of the storage servers in the negotiating group to the initiating client,
  wherein the get responses are evaluated by the initiating client to determine a designated storage server that is to perform a rendezvous transfer to a rendezvous group,
  wherein a get accept is multicast from the initiating client to the negotiating group, wherein the get accept indicates the designated storage server, and
  wherein the rendezvous transfer is performed by the designated server sending the payload data of the chunk to the rendezvous group.
Dependent claims: 13, 14.
15. A method of storing a chunk within a cluster of storage servers using unreliable datagrams, the method comprising:
- multicasting a put proposal by an initiating client to storage servers of a negotiating group, wherein the negotiating group is determined by mapping a content hash identifier to a distributed hash allocation table;
- receiving unicast proposal responses from the storage servers in the negotiating group by the initiating client;
- determining a rendezvous group of storage servers by the initiating client using the unicast proposal responses; and
- multicasting a payload of the chunk by the initiating client to the rendezvous group.
16. A method of storing a chunk within a cluster of storage servers using unreliable datagrams, the method comprising:
- receiving a multicast put proposal from an initiating client by a storage server of a negotiating group, wherein the negotiating group is determined by mapping a content hash identifier to a distributed hash allocation table;
- unicasting a proposal response from the storage server in the negotiating group to the initiating client in response to the multicast put proposal; and
- receiving a multicast payload of the chunk that is multicast from the initiating client to a rendezvous group that includes the storage server, wherein the rendezvous group is determined by the initiating client.
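Claim 16 describes the same put exchange from the storage server's side. The sketch below models one server as a small in-memory object, assuming a dictionary-shaped proposal response and a `busy_until` field that records prior commitments. Neither comes from the claims, and all names are hypothetical. The explicit membership check stands in for what would, on a real network, be membership of the rendezvous multicast group.

```python
class StorageServer:
    """In-memory stand-in for one storage server in a negotiating group."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.busy_until = 0.0  # hypothetical: end of already-promised transfers
        self.chunks: dict[str, bytes] = {}

    def on_put_proposal(self, chid: str, now: float) -> dict:
        """Build the unicast proposal response (layout is an assumption)."""
        return {
            "server": self.name,
            "chid": chid,
            "earliest_start": max(now, self.busy_until),
        }

    def on_payload(self, chid: str, payload: bytes, rendezvous_group: list[str]) -> bool:
        """Store the payload only if the client put this server in the group."""
        if self.name not in rendezvous_group:
            return False
        self.chunks[chid] = payload
        return True
```

A server that responded to the proposal but was left out of the rendezvous group stores nothing, which is how the client's choice, rather than any central scheduler, decides where the chunk lands.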
Specification