Incremental media size extension for grid encoded data storage systems
First Claim
1. A computer-implemented method, comprising:
- under the control of one or more computer systems configured with executable instructions,generating a grid of shards, the grid of shards forming a partitioned data set, the grid of shards being indexed by row and column, the grid of shards comprising a set of data shards and a set of derived shards, the set of derived shards comprising a set of horizontally-derived shards and a set of vertically-derived shards, wherein;
the partitioned data set includes one or more shards containing one or more portions of the data set and one or more shards generated by applying an erasure code to the one or more shards containing the one or more portions of the data set;
each shard of the grid of shards has a corresponding row and corresponding column and is configured such that;
the shard is reproducible from other shards associated with the row and reproducible from other shards associated with the column;
if the shard is a horizontally-derived shard of the set of horizontally-derived shards, the shard is derived based at least in part on a set of data shards associated with the row; and
if the shard is a vertically-derived shard of the set of vertically-derived shards, the shard is derived based at least in part on a set of shards associated with the column;
each shard of the grid of shards having a corresponding first data size; and
each shard of the grid of shards has a corresponding storage device of a set of storage devices that stores the shard, each storage device of the set of storage devices having a first storage capacity sufficient for a maximum corresponding first data size of the grid of shards;
determining a second data size for the shards of the grid of shards;
padding each shard of the grid of shards with a corresponding number of zero values, the corresponding number of zero values determined based at least in part on subtracting the corresponding first data size from the second data size;
replacing a first storage device of the set of storage devices that stores a first data shard of the grid of shards with a first replacement storage device that stores the first data shard, the first replacement storage device having a second storage capacity sufficient for the second data size, the first data shard having a corresponding first row and first column;
replacing a first subset of the set of storage devices with a first set of replacement storage devices having a storage capacity sufficient for the second data size, each replacement storage device of the first set of replacement storage devices configured to store a corresponding derived shard of a first set of derived shards associated with the first row, the first set of derived shards associated with the first row having a shard that is associated with a second column;
replacing a second subset of the set of storage devices with a second set of replacement storage devices having a storage capacity sufficient for the second data size, each replacement storage device of the second set of replacement storage devices configured to store a corresponding derived shard of a second set of derived shards associated with the first column; and
replacing a third subset of the set of storage devices with a third set of replacement storage devices having a storage capacity sufficient for the second data size, each replacement storage device of the third set of replacement storage devices configured to store a corresponding derived shard of a third set of derived shards associated with the second column.
1 Assignment
0 Petitions
Accused Products
Abstract
Techniques for incrementally increasing media size in data storage systems using grid encoded data storage techniques are described herein. A grid of shards is created where each shard of the grid of shards has a first index, a second index and each shard also has an associated storage device configured with a storage capacity that is large enough to store the largest set of data on a shard. Upon determining to replace the storage devices of the grid with storage devices that have a different storage capacity, the storage devices can be incrementally replaced within the grid by first padding each shard of the grid of shards with a set of data values, replacing a data shard storage device with a device of the different storage capacity, and replacing a set of derived shard storage devices with devices of the different storage capacity.
223 Citations
20 Claims
-
1. A computer-implemented method, comprising:
under the control of one or more computer systems configured with executable instructions, generating a grid of shards, the grid of shards forming a partitioned data set, the grid of shards being indexed by row and column, the grid of shards comprising a set of data shards and a set of derived shards, the set of derived shards comprising a set of horizontally-derived shards and a set of vertically-derived shards, wherein; the partitioned data set includes one or more shards containing one or more portions of the data set and one or more shards generated by applying an erasure code to the one or more shards containing the one or more portions of the data set; each shard of the grid of shards has a corresponding row and corresponding column and is configured such that; the shard is reproducible from other shards associated with the row and reproducible from other shards associated with the column; if the shard is a horizontally-derived shard of the set of horizontally-derived shards, the shard is derived based at least in part on a set of data shards associated with the row; and if the shard is a vertically-derived shard of the set of vertically-derived shards, the shard is derived based at least in part on a set of shards associated with the column; each shard of the grid of shards having a corresponding first data size; and each shard of the grid of shards has a corresponding storage device of a set of storage devices that stores the shard, each storage device of the set of storage devices having a first storage capacity sufficient for a maximum corresponding first data size of the grid of shards; determining a second data size for the shards of the grid of shards; padding each shard of the grid of shards with a corresponding number of zero values, the corresponding number of zero values determined based at least in part on subtracting the corresponding first data size from the second data size; replacing a first storage device of the set of storage devices that stores a first data shard of the grid of shards with a first replacement storage device that stores the first data shard, the first replacement storage device having a second storage capacity sufficient for the second data size, the first data shard having a corresponding first row and first column; replacing a first subset of the set of storage devices with a first set of replacement storage devices having a storage capacity sufficient for the second data size, each replacement storage device of the first set of replacement storage devices configured to store a corresponding derived shard of a first set of derived shards associated with the first row, the first set of derived shards associated with the first row having a shard that is associated with a second column; replacing a second subset of the set of storage devices with a second set of replacement storage devices having a storage capacity sufficient for the second data size, each replacement storage device of the second set of replacement storage devices configured to store a corresponding derived shard of a second set of derived shards associated with the first column; and replacing a third subset of the set of storage devices with a third set of replacement storage devices having a storage capacity sufficient for the second data size, each replacement storage device of the third set of replacement storage devices configured to store a corresponding derived shard of a third set of derived shards associated with the second column. - View Dependent Claims (2, 3, 4)
-
5. A system, comprising at least one computing device configured to implement one or more services, wherein the one or more services are configured to:
-
generate a grid of shards, the grid of shards indexed at least by a first index and a second index, the grid of shards comprising a set of data shards and a set of derived shards, wherein each shard of the grid of shards has a corresponding first index, a corresponding second index, and a corresponding storage device having a first storage capacity sufficient to store a first maximum data amount corresponding to a first shard size associated with the grid of shards and configured to store the shard, each shard configured such that the shard is reproducible from other shards associated with the first index and the shard is reproducible from other shards associated with the second index; determine a second maximum data amount corresponding to a second shard size associated with the grid of shards; pad each shard of the grid of shards with an amount of data that is based at least in part on a difference between the first storage capacity and a second storage capacity, the second storage capacity sufficient to store the second maximum data amount, the amount of data including a set of data values; and if a selected storage device corresponding to a first shard of the grid of shards is replaced with a replacement storage device having a storage capacity sufficient for the second storage capacity, replace a first set of storage devices with a second set of storage devices having a storage capacity sufficient for the second storage capacity, the storage devices in the first set of storage devices selected based at least in part on the first index of the first shard and based at least in part on the second index of the first shard. - View Dependent Claims (6, 7, 8, 9, 10, 11, 12)
-
-
13. A non-transitory computer-readable storage medium having stored thereon executable instructions that, when executed by one or more processors of a computer system, cause the computer system to at least:
-
use a grid of shards, the grid of shards indexed by at least a first index and a second index, the grid of shards comprising a set of data shards and a set of derived shards, wherein each shard of the grid of shards has a corresponding first index, a corresponding second index, and a corresponding storage device, the grid of shards configured such that each shard of the grid of shards is reproducible from other shards associated with the first index and each shard of the grid of shards is reproducible from other shards associated with the second index, each corresponding storage device having a first storage capacity sufficient to store a first maximum data amount corresponding to a first shard size associated with the grid of shards, determine a second maximum data amount corresponding to a second shard size associated with the grid of shards and pad each shard with an amount of data that is based at least in part on a difference between the first maximum data amount and the second maximum data amount, the amount of data including a set of data values; replace a first storage device configured to store a first shard with a first replacement storage device configured to store the first shard, the first shard having a corresponding first index and a corresponding second index; and replace each storage device of a set of storage devices with a corresponding replacement storage device, each storage device of the set of storage devices corresponding to a derived shard in a set of derived shards, the set of derived shards including; a first subset of derived shards associated with the corresponding first index, the first subset having a second shard that is associated with a second index that differs from the corresponding second index; a second subset of derived shards associated with the corresponding second index; and a third subset of derived shards associated with the second index that differs from the corresponding second index. - View Dependent Claims (14, 15, 16, 17, 18, 19, 20)
-
Specification