Cycled clustering for redundancy coded data storage systems
First Claim
1. A computer-implemented method, comprising:
under the control of one or more computer systems configured with executable instructions, configuring a data storage system to at least:
apportion at least a first bundle of redundancy coded shards and a second bundle of redundancy coded shards between a plurality of data transfer devices provisioned by the data storage system to be capable of processing data storage requests and data retrieval requests without a network connection between the plurality of data transfer devices and the data storage system, the first bundle including at least a first identity shard, a second identity shard, and a first derived shard, the first bundle being configured such that a first quorum quantity of shards of the first bundle is sufficient to reconstruct, using a redundancy code, original data associated with the first identity shard, the second bundle including the second identity shard, a second derived shard, and a third identity shard, the second bundle being configured such that a second quorum quantity of shards of the second bundle is sufficient to reconstruct, using the redundancy code, the second identity shard, the first bundle and second bundle overlapping by virtue of both including the second identity shard; and
configure a fill pattern such that the first identity shard, the second identity shard, and the third identity shard are subject to receiving data for storage in a specified order comprising, sequentially, the first identity shard, the second identity shard, and the third identity shard;
monitoring the plurality of data transfer devices to detect an event associated with the first identity shard that indicates an inability to accept additional data; and
if the event is detected, at least:
configuring any data storage requests to store associated data in the second identity shard;
initiating an ingestion process of the data storage system to transfer, by a data transfer device of the plurality of data transfer devices, data associated with the first identity shard to durable storage of the data storage system;
verifying that the data associated with the first identity shard is durably stored in the data storage system; and
if verified that the data associated with the first identity shard is durably stored, at least:
deleting the first identity shard and the first derived shard;
generating a third bundle comprising a fourth identity shard, the third identity shard, and a third derived shard, the third bundle overlapping with the second bundle by virtue of sharing the third identity shard; and
adding the fourth identity shard to the specified order of the fill pattern after the third identity shard.
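The cycle recited in claim 1 can be sketched as a small simulation. This is an illustrative toy under stated assumptions, not the patented implementation: the class name `CycledCluster`, the shard names (`I1`, `D1`, and so on), the item-count capacity used as the "unable to accept additional data" event, and the in-memory `durable` dictionary standing in for the data storage system are all hypothetical.

```python
from collections import deque


class CycledCluster:
    """Toy model of overlapping bundles with a sequential fill pattern."""

    def __init__(self, shard_capacity: int = 2):
        # Hypothetical: a shard is "full" after this many stored items.
        self.shard_capacity = shard_capacity
        self.shards = {"I1": [], "I2": [], "I3": []}
        # Fill pattern: identity shards receive data in this order.
        self.fill_order = deque(["I1", "I2", "I3"])
        # Each bundle: (identity shard, identity shard, derived shard).
        # Adjacent bundles overlap on one identity shard.
        self.bundles = [("I1", "I2", "D1"), ("I2", "I3", "D2")]
        self.durable = {}  # stands in for the data storage system
        self._next_index = 4

    def store(self, item) -> str:
        """Store an item in the current fill target; return the shard used."""
        target = self.fill_order[0]
        self.shards[target].append(item)
        if len(self.shards[target]) >= self.shard_capacity:
            self._cycle(target)  # the event: shard cannot accept more data
        return target

    def _cycle(self, full: str) -> None:
        # Later storage requests go to the next shard in the fill pattern.
        self.fill_order.popleft()
        # Ingestion: copy the full shard's data to durable storage.
        self.durable[full] = list(self.shards[full])
        # Verification before anything is deleted.
        if self.durable[full] != self.shards[full]:
            raise RuntimeError(f"ingestion of {full} could not be verified")
        # Delete the identity shard and, with its bundle, the derived shard.
        del self.shards[full]
        self.bundles.pop(0)
        # Generate a new bundle that overlaps the newest remaining bundle.
        newest = self.fill_order[-1]
        fresh = f"I{self._next_index}"
        derived = f"D{self._next_index - 1}"
        self._next_index += 1
        self.shards[fresh] = []
        self.bundles.append((newest, fresh, derived))
        # Append the new identity shard to the end of the fill pattern.
        self.fill_order.append(fresh)


cluster = CycledCluster(shard_capacity=2)
used = [cluster.store(n) for n in range(5)]
```

In this sketch the cluster always holds two overlapping bundles, so the set of shards advances like a sliding window as older shards are ingested and retired.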
Abstract
A cluster of data transfer devices is used to augment the capabilities of a data storage system. For example, the cluster of data transfer devices may be configured to store a portion of a bundle of redundancy coded shards in a similar fashion as a data storage system. As another example, the cluster may be configured to provide other capabilities incident to the devices used, such as computational capabilities. Data stored on the cluster may be read from and written directly to the cluster without transfer of data to the data storage system. In some embodiments, a connecting entity (such as a customer entity) may interchangeably interface with the data storage system and the cluster, and the requested capabilities may be directed to either in a fashion that is transparent to the requestor.
20 Claims
1. A computer-implemented method, comprising:
under the control of one or more computer systems configured with executable instructions, configuring a data storage system to at least:
apportion at least a first bundle of redundancy coded shards and a second bundle of redundancy coded shards between a plurality of data transfer devices provisioned by the data storage system to be capable of processing data storage requests and data retrieval requests without a network connection between the plurality of data transfer devices and the data storage system, the first bundle including at least a first identity shard, a second identity shard, and a first derived shard, the first bundle being configured such that a first quorum quantity of shards of the first bundle is sufficient to reconstruct, using a redundancy code, original data associated with the first identity shard, the second bundle including the second identity shard, a second derived shard, and a third identity shard, the second bundle being configured such that a second quorum quantity of shards of the second bundle is sufficient to reconstruct, using the redundancy code, the second identity shard, the first bundle and second bundle overlapping by virtue of both including the second identity shard; and
configure a fill pattern such that the first identity shard, the second identity shard, and the third identity shard are subject to receiving data for storage in a specified order comprising, sequentially, the first identity shard, the second identity shard, and the third identity shard;
monitoring the plurality of data transfer devices to detect an event associated with the first identity shard that indicates an inability to accept additional data; and
if the event is detected, at least:
configuring any data storage requests to store associated data in the second identity shard;
initiating an ingestion process of the data storage system to transfer, by a data transfer device of the plurality of data transfer devices, data associated with the first identity shard to durable storage of the data storage system;
verifying that the data associated with the first identity shard is durably stored in the data storage system; and
if verified that the data associated with the first identity shard is durably stored, at least:
deleting the first identity shard and the first derived shard;
generating a third bundle comprising a fourth identity shard, the third identity shard, and a third derived shard, the third bundle overlapping with the second bundle by virtue of sharing the third identity shard; and
adding the fourth identity shard to the specified order of the fill pattern after the third identity shard.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system, comprising at least one computing device configured to implement one or more services, wherein the one or more services are configured to:
provision a first plurality of data transfer devices to store at least a first bundle of redundancy coded shards and a second bundle of redundancy coded shards, the first bundle including at least a first identity shard, a second identity shard, and a first derived shard, the first bundle being configured such that a first quorum quantity of shards of the first bundle is sufficient to reconstruct, using a redundancy code, original data associated with the first identity shard, the second bundle including the second identity shard, a second derived shard, and a third identity shard, the second bundle being configured such that a second quorum quantity of shards of the second bundle is sufficient to reconstruct, using the redundancy code, the second identity shard, the first bundle and second bundle overlapping by virtue of both including the second identity shard;
monitor data transfer devices associated with the first bundle for an event associated with the first identity shard; and
if the event is detected, at least:
store data associated with data storage requests received after the event on a data transfer device associated with the second identity shard;
transfer data associated with the first identity shard from the data transfer device to a data storage system;
delete the first identity shard and the first derived shard; and
provision a second plurality of data transfer devices to store a third bundle comprising a fourth identity shard, the third identity shard, and a third derived shard, the third bundle overlapping with the second bundle by virtue of sharing the third identity shard.
Dependent claims: 11, 12, 13, 14, 15
16. A non-transitory computer-readable storage medium having stored thereon executable instructions that, as a result of being executed by one or more processors of a computer system, cause the computer system to at least:
generate at least a first bundle of redundancy coded shards and a second bundle of redundancy coded shards, the first bundle including at least a first identity shard, a second identity shard, and a first derived shard, the first bundle being configured such that a first quorum quantity of shards of the first bundle is sufficient to reconstruct, using a first redundancy code, original data associated with the first identity shard, the second bundle including the second identity shard, a second derived shard, and a third identity shard, the second bundle being configured such that a second quorum quantity of shards of the second bundle is sufficient to reconstruct, using a second redundancy code, the second identity shard, the first bundle and second bundle overlapping by virtue of both including the second identity shard;
monitor the first bundle for an event associated with the first identity shard; and
if the event is detected, at least:
store data associated with data storage requests received after the event in the second identity shard;
transfer data associated with the first identity shard to a data storage system;
delete the first identity shard and the first derived shard; and
generate a third bundle comprising a fourth identity shard, the third identity shard, and a third derived shard, the third bundle overlapping with the second bundle by virtue of sharing the third identity shard.
Dependent claims: 17, 18, 19, 20
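To make the quorum language concrete, the sketch below uses simple XOR parity as a stand-in redundancy code, so that any two of a bundle's three shards suffice to rebuild the third. This is an assumption for illustration only: the claims do not name a particular redundancy code, and claim 16 allows the two bundles to use different codes. The payloads and variable names are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("shards must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Identity shards hold original data unchanged (hypothetical payloads).
i1 = b"alpha---"
i2 = b"bravo---"
i3 = b"charlie-"

# Derived shards are computed from the identity shards of their bundle.
d1 = xor_bytes(i1, i2)  # first bundle:  (i1, i2, d1)
d2 = xor_bytes(i2, i3)  # second bundle: (i2, i3, d2)

# With a quorum of two shards, a lost shard in a bundle can be rebuilt.
rebuilt_i1 = xor_bytes(i2, d1)               # first bundle without i1
rebuilt_i2_via_first = xor_bytes(i1, d1)     # first bundle without i2
rebuilt_i2_via_second = xor_bytes(i3, d2)    # second bundle without i2
```

Because the second identity shard belongs to both bundles, it can be rebuilt from either one, which is what lets the first bundle be retired without reducing the protection of that shard.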
Specification