Hybridized storage operation for redundancy coded data storage systems
- US 10,678,664 B1
- Filed: 03/28/2016
- Issued: 06/09/2020
- Est. Priority Date: 03/28/2016
- Status: Expired due to Fees
Abstract
A cluster of data transfer devices is used to augment the capabilities of a data storage system. For example, the cluster of data transfer devices may be configured to store a portion of a bundle of redundancy coded shards in a similar fashion as a data storage system. As another example, the cluster may be configured to provide other capabilities incident to the devices used, such as computational capabilities. Data stored on the cluster may be read from and written directly to the cluster without transfer of data to the data storage system. In some embodiments, a connecting entity (such as a customer entity) may interchangeably interface with the data storage system and the cluster, and the requested capabilities may be directed to either in a fashion that is transparent to the requestor.
23 Claims
1. A computer-implemented method, comprising:
- configuring a data storage system to apportion a bundle of redundancy coded shards between at least durable storage of the data storage system and a data transfer device, the bundle including at least a plurality of identity shards, a first identity shard of the plurality of identity shards containing an original form of data stored in the bundle, and an encoded shard containing a redundancy coded form of the data, the bundle being configured such that a quorum quantity of shards of the bundle is sufficient to reconstruct, using a redundancy code, original data associated with the bundle, the bundle being apportioned such that a first subset of the bundle is apportioned to the data transfer device and second subset of the bundle is apportioned to the durable storage, the first subset of the bundle comprising a subset of the plurality of identity shards and excluding the encoded shard, the second subset of the bundle comprising a remainder of the bundle outside of the first subset of the bundle, the number of shards in the second subset being greater than the quorum quantity;
- providing the data transfer device to a different physical location than that of the data storage system, the different physical location being associated with a customer of the data storage system, and the data transfer device is in operable communication with the data storage system;
- receiving, by the data transfer device at the different physical location, a first set of customer data;
- storing, by the data transfer device at the different physical location, the first set of customer data in the first subset of the bundle;
- storing, by the durable storage of the data storage system, a second set of customer data in identity shards of the second subset of the bundle;
- processing, using the redundancy code, the first set of customer data and the second set of customer data to generate the encoded shard;
- storing, by the durable storage, the encoded shard as part of the second subset of the bundle to enable the durable storage to regenerate, from only the second subset of the bundle, the first set of customer data; and
- in response to a retrieval request for at least a portion of the first set of customer data;
  - if the data transfer device is available, service the retrieval request by at least retrieving the portion of the first subset of customer data from the first subset of the bundle; and
  - if the data transfer device is unavailable, service the retrieval request by at least regenerating the portion of the first subset of customer data, using the redundancy code, from the second subset of the bundle.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
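The apportionment constraint and the two retrieval branches of claim 1 can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names `apportion` and `retrieve`, the shard identifiers, and the use of `None` to model an unavailable data transfer device are all assumptions made for illustration, and the regeneration step is passed in as a callable because the claim does not name a specific redundancy code.

```python
from typing import Callable, Dict, List, Optional, Tuple


def apportion(identity_ids: List[str], encoded_ids: List[str],
              device_count: int, quorum: int) -> Tuple[List[str], List[str]]:
    """Split a bundle into (device_subset, durable_subset).

    Hypothetical helper: the device receives identity shards only, and the
    durable side keeps the remainder, which must exceed the quorum quantity.
    """
    device = identity_ids[:device_count]                 # no encoded shards here
    durable = identity_ids[device_count:] + encoded_ids  # remainder of the bundle
    if len(durable) <= quorum:
        raise ValueError("durable subset must exceed the quorum quantity")
    return device, durable


def retrieve(shard_id: str,
             device_store: Optional[Dict[str, bytes]],
             regenerate: Callable[[str], bytes]) -> bytes:
    """Serve a read from the device if available, else regenerate it.

    `device_store` is None when the device is unavailable (an assumption of
    this sketch); `regenerate` stands in for decoding from the durable subset.
    """
    if device_store is not None:
        return device_store[shard_id]
    return regenerate(shard_id)


# Example: a 3-of-5 bundle with one identity shard placed on the device.
device_subset, durable_subset = apportion(["i0", "i1", "i2"], ["e0", "e1"], 1, 3)
```

With these example values the durable side holds four shards against a quorum of three, so it could rebuild the device's shard without contacting the device.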
12. A system, comprising at least one computing device that implements one or more services, wherein the one or more services at least:
- apportion a bundle of redundancy coded shards between at least durable storage of a first data storage system and a second data storage system, the bundle being configured such that a quorum quantity of shards of the bundle is sufficient to reconstruct, using a redundancy code, original data associated with the bundle, the bundle being apportioned such that a first subset of the bundle is apportioned to the durable storage and a second subset of the bundle is apportioned to the second data storage system, the first subset including a number of shards equal to or greater than the quorum quantity of shards;
- cause storage of a first set of customer data in the first subset of the bundle apportioned to the durable storage of the first data storage system;
- cause storage of a second set of customer data in the second subset of the bundle apportioned to the second data storage system, wherein the second set of customer data is received by the second data storage system while the second data storage system is at a different physical location than the first data storage system, and wherein the second data storage system may, while at the different physical location, communicate with the first data storage system;
- process, using the redundancy code, the first set of customer data and the second set of customer data to generate an encoded shard;
- store the encoded shard in the first subset of the bundle, wherein the first subset of the bundle is sufficient to enable the first storage system to reconstruct, from only the first subset of the bundle, the second set of customer data;
- service a retrieval request for a portion of the second set of customer data, using the redundancy code, from the first subset of the bundle; and
- as a result of the second data storage system being moved from the different physical location to the physical location of the data storage system, cause the second data storage system to transfer at least the second set of customer data to the first data storage system for storage in the durable storage of the first data storage system.
Dependent claims: 13, 14, 15, 16, 17, 18
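To make the regeneration step of claim 12 concrete, the following Python sketch uses single XOR parity as a stand-in redundancy code. The claim does not specify which code is used, so XOR is an assumption, as are the names `xor_bytes`, `regenerate_second`, `ingest_returned`, and the sample payloads. Two identity shards plus one parity shard give a quorum of two, and the durable side holds exactly two shards, which satisfies the "equal to or greater than" condition.

```python
from typing import Dict


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a, b = a.ljust(n, b"\0"), b.ljust(n, b"\0")
    return bytes(x ^ y for x, y in zip(a, b))


first_data = b"stored in durable storage"  # first set of customer data
second_data = b"stored on remote system"   # second set, held off-site

# The encoded shard is generated from both sets and kept on the durable side.
encoded = xor_bytes(first_data, second_data)
durable_subset: Dict[str, bytes] = {"identity_1": first_data, "encoded": encoded}
remote_subset: Dict[str, bytes] = {"identity_2": second_data}


def regenerate_second(durable: Dict[str, bytes], length: int) -> bytes:
    """Rebuild the remote shard from the durable subset alone.

    `length` is assumed to be tracked as metadata so padding can be trimmed.
    """
    return xor_bytes(durable["identity_1"], durable["encoded"])[:length]


def ingest_returned(durable: Dict[str, bytes], remote: Dict[str, bytes]) -> None:
    """Copy the returned system's shards into durable storage, then empty it."""
    durable.update(remote)
    remote.clear()


recovered = regenerate_second(durable_subset, len(second_data))
```

A production system would more plausibly use a code that tolerates several lost shards, such as Reed-Solomon; XOR is chosen here only because it keeps the arithmetic visible.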
19. A non-transitory computer-readable storage medium comprising executable instructions that, as a result of being executed by one or more processors of a computer system, cause the computer system to at least:
- generate, from a first set of data and a second set of data received by the computer system and using a redundancy code, a plurality of erasure coded shards as a bundle of bundle-encoded shards, the plurality of erasure coded shards being configured such that a quorum quantity of the shards of the plurality of erasure coded shards is sufficient to reconstruct, using the redundancy code, the first set of data and the second set of data;
- store the shards of the plurality of erasure coded shards on durable storage of a first data storage system and a second data storage system, such that:
  - a second subset of the shards includes original data of the second set of data and is apportioned to the durable storage of the second data storage system; and
  - a first subset of the shards includes original data of the first data, a first encoded shard generated from the original data of the first set of data, and a second encoded shard generated from the original data of the second set of data, and is apportioned to the first data storage system to enable the first data storage system to regenerate the second set of data using only the first subset of the shards, wherein the second set of data is received at the second data storage system while the second data storage system is at a different physical location than the first data storage system, and wherein the second data storage system is capable of operable communication with the first data storage system at the different physical location;
- service a retrieval request for a portion of the second set of data, using the redundancy code, from the first subset of the shards; and
- add a third set of data, received by the first data storage system, to the bundle, by at least adding the third set of data to identity shards in the second subset of the shards and updating the encoded shard in the second subset of the shards using the redundancy code.
Dependent claims: 20, 21, 22, 23
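The last element of claim 19, adding data to an identity shard and updating the encoded shard, can be sketched as an incremental parity update. The example below again assumes single XOR parity, which the claim does not mandate, and the function name `add_to_shard`, the fixed shard size, and the offsets are hypothetical. The parity is patched with the XOR of the old and new bytes, so the other shards in the bundle never need to be re-read.

```python
def add_to_shard(identity: bytearray, parity: bytearray,
                 offset: int, new_data: bytes) -> None:
    """Write `new_data` into an identity shard and patch the parity in place.

    Assumes `identity` and `parity` have the same fixed length and that the
    parity is the bytewise XOR of all identity shards in the bundle.
    """
    end = offset + len(new_data)
    if end > len(identity) or end > len(parity):
        raise ValueError("new data does not fit in the shard")
    old = bytes(identity[offset:end])
    identity[offset:end] = new_data
    for i, (o, n) in enumerate(zip(old, new_data)):
        # XOR out the old byte and XOR in the new one.
        parity[offset + i] ^= o ^ n


# Two 8-byte identity shards, each half full, and their XOR parity.
shard_a = bytearray(b"abcd" + bytes(4))
shard_b = bytearray(b"wxyz" + bytes(4))
parity = bytearray(x ^ y for x, y in zip(shard_a, shard_b))

# A third set of data arrives and fills the free space in shard_a.
add_to_shard(shard_a, parity, 4, b"1234")
```

After the update, XOR-ing `shard_a` with `parity` still yields `shard_b`, so the bundle remains decodable without a full re-encode.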
Specification