Methods and systems for key sharding of objects stored in distributed storage system
First Claim
1. A method of performing a delta edit of a named object stored in a distributed storage system, the method comprising:
storing, in the distributed storage system, a payload of the named object in key shards that are defined by key-shard chunk references, wherein the payload for the named object comprises a collection of key-value records, and wherein referenced chunks identified by the key shards each stores a subset of the collection of the key-value records, where the key-value records in the subset have key hashes that have a range of matching bits in common;
receiving, by a gateway server, a request for a set of delta edits to be applied to the named object, wherein each delta edit specifies addition or deletion of at least one key-value record;
determining, by the gateway server, relevant key shards to which the delta edits apply; and
updating the relevant key shards, while not updating other key shards for the named object, wherein updating the relevant key shards comprises:
obtaining a name hash identifying token for the named object;
generating a plurality of key hash identifying tokens corresponding to the plurality of key-value records;
determining negotiating groups for the relevant key shards using the name hash identifying token and the plurality of key hash identifying tokens;
multicasting, by the gateway server, put requests to the negotiating groups for the relevant key shards, wherein the multicast put request specifies a cryptographic hash of a referenced chunk that is to be used as a base for a new chunk to be created;
collecting, by the gateway server, responses from the negotiating groups;
selecting rendezvous groups based on the responses from the negotiating groups;
initiating rendezvous transfers to the rendezvous groups; and
each addressed target server in a rendezvous group applying a delta edit to a copy of the referenced chunk to create a new chunk and calculating a content hash identifying token for the new chunk.
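The gateway-side steps of the claim (determining the relevant key shards and updating only those) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: SHA-256 as the key hash, a fixed `PREFIX_BITS` of matching leading bits, and the names `DeltaEdit`, `shard_of`, `group_edits_by_shard`, `apply_edits` and `update_object` are all hypothetical. Shards are held in local dictionaries here, so the multicast and rendezvous steps are left out.

```python
import hashlib
from typing import Dict, List, NamedTuple, Optional

PREFIX_BITS = 2  # assumption: shards are identified by the top 2 bits of the key hash


class DeltaEdit(NamedTuple):
    op: str                      # "add" or "delete"
    key: str
    value: Optional[str] = None  # unused for "delete"


def shard_of(key: str) -> int:
    """Shard index = leading PREFIX_BITS bits of the key's SHA-256 hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (256 - PREFIX_BITS)


def group_edits_by_shard(edits: List[DeltaEdit]) -> Dict[int, List[DeltaEdit]]:
    """Determine the relevant key shards: only shards that receive an edit appear."""
    relevant: Dict[int, List[DeltaEdit]] = {}
    for edit in edits:
        relevant.setdefault(shard_of(edit.key), []).append(edit)
    return relevant


def apply_edits(records: Dict[str, str], edits: List[DeltaEdit]) -> Dict[str, str]:
    """Apply edits to a copy of one shard's records and return the new record set."""
    new_records = dict(records)
    for edit in edits:
        if edit.op == "add":
            new_records[edit.key] = edit.value
        elif edit.op == "delete":
            new_records.pop(edit.key, None)
        else:
            raise ValueError(f"unknown delta edit operation: {edit.op!r}")
    return new_records


def update_object(shards: Dict[int, Dict[str, str]],
                  edits: List[DeltaEdit]) -> Dict[int, Dict[str, str]]:
    """Rewrite only the relevant shards; other shards are carried over untouched."""
    updated = dict(shards)
    for index, shard_edits in group_edits_by_shard(edits).items():
        updated[index] = apply_edits(shards.get(index, {}), shard_edits)
    return updated
```

In this sketch a shard that receives no edit is carried into the new version as the very same object, which mirrors the claim's "while not updating other key shards".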
Abstract
The present disclosure also provides systems and methods for sharding objects stored in a distributed storage system. In accordance with one embodiment disclosed herein, a key sharding technique is used. Key sharding is an advantageously efficient technique when dealing with an object containing a collection of key-value records. In accordance with an embodiment of the invention, referenced chunks identified by the key shards may each store a subset of the collection of the key-value records, and the key-value records in the subset have key hashes that have a range of matching bits in common. One embodiment disclosed herein provides a method of performing a delta edit of a named object stored in a distributed storage system in which a payload of the named object is stored in key shards. Other embodiments, aspects and features are also disclosed.
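To make the "range of matching bits in common" concrete, the sketch below assigns each key-value record to a shard by the leading bits of its key hash. It is only an illustration: the choice of SHA-256, the use of leading (rather than some other range of) bits, and the names `key_hash`, `shard_index` and `build_shards` are assumptions, not taken from the disclosure.

```python
import hashlib
from typing import Dict


def key_hash(key: str) -> int:
    """256-bit hash of a record key as an integer (SHA-256 is an assumption)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest(), "big")


def shard_index(key: str, prefix_bits: int) -> int:
    """Shard number = the top `prefix_bits` bits of the key hash."""
    return key_hash(key) >> (256 - prefix_bits)


def build_shards(records: Dict[str, str], prefix_bits: int) -> Dict[int, Dict[str, str]]:
    """Partition records so every record in a shard shares the same hash prefix."""
    shards: Dict[int, Dict[str, str]] = {}
    for key, value in records.items():
        shards.setdefault(shard_index(key, prefix_bits), {})[key] = value
    return shards
```

With `prefix_bits = 3` there are at most eight shards, and all keys in one shard agree on the first three bits of their hash.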
9 Claims
1. A method of performing a delta edit of a named object stored in a distributed storage system, the method comprising:
storing, in the distributed storage system, a payload of the named object in key shards that are defined by key-shard chunk references, wherein the payload for the named object comprises a collection of key-value records, and wherein referenced chunks identified by the key shards each stores a subset of the collection of the key-value records, where the key-value records in the subset have key hashes that have a range of matching bits in common;
receiving, by a gateway server, a request for a set of delta edits to be applied to the named object, wherein each delta edit specifies addition or deletion of at least one key-value record;
determining, by the gateway server, relevant key shards to which the delta edits apply; and
updating the relevant key shards, while not updating other key shards for the named object, wherein updating the relevant key shards comprises:
obtaining a name hash identifying token for the named object;
generating a plurality of key hash identifying tokens corresponding to the plurality of key-value records;
determining negotiating groups for the relevant key shards using the name hash identifying token and the plurality of key hash identifying tokens;
multicasting, by the gateway server, put requests to the negotiating groups for the relevant key shards, wherein the multicast put request specifies a cryptographic hash of a referenced chunk that is to be used as a base for a new chunk to be created;
collecting, by the gateway server, responses from the negotiating groups;
selecting rendezvous groups based on the responses from the negotiating groups;
initiating rendezvous transfers to the rendezvous groups; and
each addressed target server in a rendezvous group applying a delta edit to a copy of the referenced chunk to create a new chunk and calculating a content hash identifying token for the new chunk.
Dependent claims: 2, 3
4. A distributed storage system, the system comprising:
a network;
a plurality of storage servers interconnected by the network; and
a plurality of storage servers interconnected by the network, wherein the plurality of storage servers includes a gateway server;
wherein the plurality of storage servers store a payload of the named object in key shards that are defined by key-shard chunk references, wherein the payload for any version of the named object comprises a collection of key-value records, wherein each key shard stores a subset of the collection of the key-value records in a referenced chunk, where the key-value records in the subset have key hashes that have a range of matching bits in common, and
wherein the system is configured to perform steps including:
receiving a delta-edit request for the named object, wherein the delta-edit request specifies changes to a plurality of key-value records of the payload of the named object;
determining relevant key shards to which the changes apply; and
updating the relevant key shards, while not updating other key shards for the named object, wherein updating the relevant key shards comprises:
obtaining a name hash identifying token for the named object;
generating a plurality of key hash identifying tokens corresponding to the plurality of key-value records;
determining negotiating groups for the relevant key shards using the name hash identifying token and the plurality of key hash identifying tokens;
multicasting put requests to the negotiating groups for the relevant key shards, wherein the multicast put request specifies a cryptographic hash of a referenced chunk that is to be used as a base for a new chunk to be created;
collecting responses from the negotiating groups;
selecting rendezvous groups based on the responses from the negotiating groups;
initiating rendezvous transfers to the rendezvous groups; and
each addressed target server in a rendezvous group applying a delta edit to a copy of the referenced chunk to create a new chunk and calculating a content hash identifying token for the new chunk.
Dependent claims: 5, 6
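The claims do not spell out how a negotiating group is derived or how a rendezvous group is chosen from the responses, so the following is only one plausible reading. It assumes that a group is picked by hashing the name hash identifying token together with the shard's key-hash prefix, and that each response carries an earliest start time and a flag saying whether the server already holds the base chunk. `NUM_GROUPS`, `token`, `negotiating_group`, `Bid` and `select_rendezvous_group` are all hypothetical names.

```python
import hashlib
from typing import List, NamedTuple

NUM_GROUPS = 8  # assumption: the cluster is divided into 8 negotiating groups


def token(data: bytes) -> bytes:
    """Hash identifying token of some bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def negotiating_group(name_hash: bytes, shard_prefix: int) -> int:
    """Map (name hash token, key-hash prefix of the shard) to a group number."""
    combined = token(name_hash + shard_prefix.to_bytes(4, "big"))
    return int.from_bytes(combined[:8], "big") % NUM_GROUPS


class Bid(NamedTuple):
    """Hypothetical response from one server in a negotiating group."""
    server: str
    earliest_start: float  # when the server could accept the transfer
    has_base_chunk: bool   # whether it already stores the referenced base chunk


def select_rendezvous_group(bids: List[Bid], replicas: int) -> List[str]:
    """Prefer servers that already hold the base chunk, then the earliest bids."""
    ranked = sorted(bids, key=lambda b: (not b.has_base_chunk, b.earliest_start, b.server))
    return [b.server for b in ranked[:replicas]]
```

Ranking holders of the base chunk first is a design choice of this sketch: a server that already has the referenced chunk can apply the delta edit locally without fetching the base.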
7. A non-transitory computer-readable medium comprising instructions stored thereon, that when executed by one or more processors at a plurality of servers within a distributed storage system, perform the steps of:
storing, in the distributed storage system, a payload of the named object in key shards that are defined by key-shard chunk references, wherein the payload for the named object comprises a collection of key-value records, and wherein referenced chunks identified by the key shards each stores a subset of the collection of the key-value records, where the key-value records in the subset have key hashes that have a range of matching bits in common;
receiving, by a gateway server, a request for a set of delta edits to be applied to the named object, wherein each delta edit specifies addition or deletion of at least one key-value record;
determining, by the gateway server, relevant key shards to which the delta edits apply; and
updating the relevant key shards, while not updating other key shards for the named object, wherein updating the relevant key shards comprises:
obtaining a name hash identifying token for the named object;
generating a plurality of key hash identifying tokens corresponding to the plurality of key-value records;
determining negotiating groups for the relevant key shards using the name hash identifying token and the plurality of key hash identifying tokens;
multicasting, by the gateway server, put requests to the negotiating groups for the relevant key shards, wherein the multicast put request specifies a cryptographic hash of a referenced chunk that is to be used as a base for a new chunk to be created;
collecting, by the gateway server, responses from the negotiating groups;
selecting rendezvous groups based on the responses from the negotiating groups;
initiating rendezvous transfers to the rendezvous groups; and
each addressed target server in a rendezvous group applying a delta edit to a copy of the referenced chunk to create a new chunk and calculating a content hash identifying token for the new chunk.
Dependent claims: 8, 9
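The final step, in which each target server applies the delta edit to its own copy of the base chunk and then computes a content hash for the result, can be illustrated as below. This is a sketch under assumptions: chunks are serialized as canonical JSON, SHA-256 stands in for the content hash identifying token, and `encode_chunk`, `content_hash` and `TargetServer` are hypothetical names.

```python
import hashlib
import json
from typing import Dict, List, Optional, Tuple

Edit = Tuple[str, str, Optional[str]]  # (op, key, value); op is "add" or "delete"


def encode_chunk(records: Dict[str, str]) -> bytes:
    """Canonical serialization (sorted keys) so equal record sets hash equally."""
    return json.dumps(records, sort_keys=True, separators=(",", ":")).encode("utf-8")


def content_hash(chunk: bytes) -> str:
    """Content hash identifying token of a chunk (SHA-256 is an assumption)."""
    return hashlib.sha256(chunk).hexdigest()


class TargetServer:
    """Toy content-addressed chunk store held by one storage server."""

    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}

    def store(self, chunk: bytes) -> str:
        digest = content_hash(chunk)
        self.chunks[digest] = chunk
        return digest

    def apply_delta(self, base_hash: str, edits: List[Edit]) -> str:
        """Build a new chunk from a copy of the base chunk; the base is kept."""
        records = json.loads(self.chunks[base_hash])  # KeyError if base is absent
        for op, key, value in edits:
            if op == "add":
                records[key] = value
            elif op == "delete":
                records.pop(key, None)
            else:
                raise ValueError(f"unknown delta edit operation: {op!r}")
        return self.store(encode_chunk(records))
```

Because the serialization is canonical, every server in the rendezvous group that starts from the same base chunk and applies the same edits arrives at the same content hash, which is what lets the put request name the base by its cryptographic hash instead of shipping the whole chunk.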
Specification