Cross-datacenter extension of grid encoded data storage systems
First Claim
1. A computer-implemented method, comprising:
- generating a grid of shards, the grid of shards forming a partitioned data set, the grid of shards being indexed by row and column, the grid of shards comprising a set of data shards and a set of derived shards, the set of derived shards comprising a set of horizontally derived shards and a set of vertically-derived shards, wherein;
the partitioned data set includes one or more shards containing one or more portions of the partitioned data set and one or more shards generated by applying an erasure code to the one or more shards containing the one or more portions of the partitioned data set; and
each shard of the grid of shards has a corresponding row and corresponding column and is configured such that;
the shard is reproducible from other shards associated with the row and reproducible from other shards associated with the column;
if the shard is a horizontally-derived shard of the set of horizontally-derived shards, the shard is derived based at least in part on a set of data shards associated with the row; and
if the shard is a vertically-derived shard of the set of vertically derived shards, the shard is derived based at least in part on a set of shards associated with the column;
adding a new datacenter location to a set of datacenter locations by adding a set of null shards to the grid of shards, each null shard of the set of null shards having a corresponding row and a corresponding column, the corresponding column based at least in part on the new datacenter location; and
storing a data object at a storage location in the new datacenter location, the storage location selected based at least in part on a null shard of the set of null shards, the null shard having a corresponding first row and first column by at least;
converting the null shard to a data shard by allocating a storage device corresponding to the storage location;
storing the data object in the data shard using the allocated storage device;
updating a first set of derived shards associated with the first row, the first set of derived shards associated with the first row having a shard that is associated with a second column;
converting each null shard of a first set of null shards associated with the first column to a corresponding derived shard by allocating a corresponding set of storage devices;
updating a second set of derived shards associated with the first column; and
updating a third set of derived shards associated with the second column.
1 Assignment
0 Petitions
Accused Products
Abstract
Techniques for extending a grid encoded data storage system to additional datacenters are described herein. A grid of shards with a first index and a second index is created and a set of null shards is added to the grid of shards. When a data object is received for storage in the grid of shards, a set of shards with the same first index is selected for the storage location with at least one null shard and one or more other shards. The null shard is enabled for data storage by allocating a storage device for the null shard. The grid is then updated by storing at least a portion of the data object in the set of shards, updating derived shards in the set of shards, and updating derived shards with the same second index as the updated shards.
-
Citations
20 Claims
-
1. A computer-implemented method, comprising:
-
generating a grid of shards, the grid of shards forming a partitioned data set, the grid of shards being indexed by row and column, the grid of shards comprising a set of data shards and a set of derived shards, the set of derived shards comprising a set of horizontally derived shards and a set of vertically-derived shards, wherein; the partitioned data set includes one or more shards containing one or more portions of the partitioned data set and one or more shards generated by applying an erasure code to the one or more shards containing the one or more portions of the partitioned data set; and each shard of the grid of shards has a corresponding row and corresponding column and is configured such that; the shard is reproducible from other shards associated with the row and reproducible from other shards associated with the column; if the shard is a horizontally-derived shard of the set of horizontally-derived shards, the shard is derived based at least in part on a set of data shards associated with the row; and if the shard is a vertically-derived shard of the set of vertically derived shards, the shard is derived based at least in part on a set of shards associated with the column; adding a new datacenter location to a set of datacenter locations by adding a set of null shards to the grid of shards, each null shard of the set of null shards having a corresponding row and a corresponding column, the corresponding column based at least in part on the new datacenter location; and storing a data object at a storage location in the new datacenter location, the storage location selected based at least in part on a null shard of the set of null shards, the null shard having a corresponding first row and first column by at least; converting the null shard to a data shard by allocating a storage device corresponding to the storage location; storing the data object in the data shard using the allocated storage device; updating a first set of derived shards associated with the first row, the first set of derived shards associated with the first row having a shard that is associated with a second column; converting each null shard of a first set of null shards associated with the first column to a corresponding derived shard by allocating a corresponding set of storage devices; updating a second set of derived shards associated with the first column; and updating a third set of derived shards associated with the second column. - View Dependent Claims (2, 3, 4)
-
-
5. A system, comprising at least one computing device configured to implement one or more services, wherein the one or more services are configured to:
-
generate a grid of shards, the grid of shards indexed at least by a first index and a second index, the grid of shards comprising a set of data shards and a set of derived shards, wherein each shard of the grid of shards has a corresponding first index, a corresponding second index, and a storage device that stores the shard, the storage device located in a datacenter location, each shard configured such that the shard is reproducible from other shards associated with the first index and the shard is reproducible from other shards associated with the second index; add a set of null shards to the grid of shards, each null shard of the set of null shards having a corresponding first index and a corresponding second index, the corresponding second index based at least in part on a new datacenter location; convert a first null shard of the set of null shards to a data shard by at least allocating a storage device selected based at least in part on the first null shard; store at least a part of a data object in the storage device; convert a subset of the set of null shards to a corresponding first subset of the set of derived shards by at least allocating a corresponding set of storage devices, the subset of the set of null shards selected based at least in part on the second index of the first null shard; and update a second subset of the set of derived shards, the second subset of the set of derived shards selected based at least in part on the first index of the first null shard and based at least in part on the second index of the first null shard. - View Dependent Claims (6, 7, 8, 9, 10, 11, 12)
-
-
13. A non-transitory computer-readable storage medium having stored thereon executable instructions that, when executed by one or more processors of a computer system, cause the computer system to at least:
-
using a shard of a grid of shards, the grid of shards indexed by at least a first index and a second index, wherein each shard of the grid of shards has at least a corresponding first index and a corresponding second index, the grid of shards configured such that each shard of the grid of shards is reproducible from other shards associated with the first index and each shard of the grid of shards is reproducible from other shards associated with the second index, the grid of shards at least including a set of null shards, receive a data object for storage in the grid of shards; select a first set of shards, each shard of the first set of shard having the same first index, the first set of shards at least including a first null shard of the set of null shards and one or more other shards of the grid of shards; convert the first null shard to a non-null shard by at least allocating a storage device selected based at least in part on the first null shard; update a plurality of shards of the first set of shards based at least in part on storing at least a part of the data object in one or more data shards of the plurality of shards and updating one or more derived shards of the plurality of shards; and update a subset of a set of derived shards, the set of the set of derived shards selected based at least in part on having a second index corresponding to a second index of one or more of the plurality of shards. - View Dependent Claims (14, 15, 16, 17, 18, 19, 20)
-
Specification