Localized failure mode decorrelation in redundancy encoded data storage systems
Abstract
A data storage system, such as an archival storage system, implements failure decorrelation methods. In some embodiments, a selector is employed to select one or more data storage devices of a host for storage of incoming data. In some such embodiments, the selector selects from among the storage devices in a random, pseudorandom, stochastic, or deterministic fashion so as to prevent correlation of one or more failure modes associated with storage of the data.
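The selector described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `place_shards`, the host and device identifiers, and the policy of sending each shard to a distinct host are assumptions made for illustration. The per-host selection here is uniform random, which is only one of the selection fashions the abstract lists.

```python
import random
from typing import Dict, List, Tuple


def place_shards(
    shard_ids: List[str],
    devices_by_host: Dict[str, List[str]],
    rng: random.Random,
) -> Dict[str, Tuple[str, str]]:
    """Map each shard to a (host, device) pair.

    Assumption: shards go to distinct hosts, so a single host failure
    costs at most one shard. Each host then picks one of its own
    devices uniformly at random.
    """
    if len(shard_ids) > len(devices_by_host):
        raise ValueError("need at least one host per shard")
    # Sample distinct hosts, one per shard (sorted for reproducibility).
    hosts = rng.sample(sorted(devices_by_host), k=len(shard_ids))
    placement: Dict[str, Tuple[str, str]] = {}
    for shard_id, host in zip(shard_ids, hosts):
        # Hypothetical per-host selector: uniform choice among the host's devices.
        placement[shard_id] = (host, rng.choice(devices_by_host[host]))
    return placement
```

Passing a seeded `random.Random` makes a placement reproducible for testing; a production selector would presumably draw from a non-deterministic source.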
20 Claims
1. A computer-implemented method, comprising:
processing data to be stored via a plurality of hosts, each host of the plurality of hosts having a plurality of data storage devices, by at least:
applying a redundancy code to the data so as to generate a plurality of shards, the plurality of shards having a quorum quantity which is less than a quantity of shards in the plurality of shards, the quorum quantity sufficient to regenerate any other shard of the plurality of shards; and
allocating the plurality of shards to respective hosts of the plurality of hosts so as to decorrelate a first failure mode associated with the plurality of shards by decorrelating a first failure event of a first shard of the plurality of shards from a second failure event of a second shard of the plurality of shards;
causing each host of the plurality of hosts to randomly select, by a selector, a selected data storage device of the plurality of data storage devices for storage of shards allocated to the host, so as to decorrelate a second failure mode by decorrelating a third failure event of the selected storage device of the plurality of data storage devices from a fourth failure event of a second data storage device of the plurality of data storage devices; and
causing storage, on the plurality of hosts, of the plurality of shards on the hosts in accordance with the random selections of the selector.
Dependent claims: 2, 3, 4.
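The quorum property recited in claim 1 (fewer than all shards suffice to regenerate any other shard) can be seen in the simplest redundancy code, a single XOR parity shard. The sketch below is an illustrative stand-in, not the code the patent uses: `encode` and `regenerate` are hypothetical names, and the scheme produces three shards with a quorum of two.

```python
from typing import List


def encode(data: bytes) -> List[bytes]:
    """Split data into two equal halves and add an XOR parity shard.

    The result has three shards and a quorum quantity of two.
    """
    half = (len(data) + 1) // 2
    first = data[:half]
    # Zero-pad the second half so both data shards have the same length.
    second = data[half:].ljust(half, b"\x00")
    parity = bytes(x ^ y for x, y in zip(first, second))
    return [first, second, parity]


def regenerate(shard_a: bytes, shard_b: bytes) -> bytes:
    """XOR any two of the three shards to rebuild the missing third."""
    return bytes(x ^ y for x, y in zip(shard_a, shard_b))
```

Because any two shards rebuild the third, losing one host or one device leaves the data recoverable. Practical systems typically use erasure codes that tolerate more than one loss.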
5. A system, comprising at least one computing device that implements one or more services, wherein the one or more services at least:
apply a redundancy code to data to be stored via a plurality of hosts, each host of the plurality of hosts having a plurality of data storage devices, so as to generate a plurality of shards having a quorum quantity of shards, less than a quantity of shards in the plurality of shards, being sufficient to regenerate any other shard of the plurality of shards;
cause, by a selector implemented by the one or more services, each host of the plurality of hosts to select a selected data storage device of the plurality of data storage devices for storage of shards allocated to the host, so as to decorrelate a set of related events associated with the plurality of data storage devices such that a respective event of the set of related events affects a smaller subset of the plurality of data storage devices selected for storage of a shard relative to not decorrelating the set of related events, each event of the set of related events being associated with failure of one or more data storage devices of the plurality of data storage devices; and
store, on the plurality of hosts, the plurality of shards via the hosts on a plurality of respective selected data storage devices.
Dependent claims: 6, 7, 8, 9, 10, 11, 12.
13. A set of one or more non-transitory computer-readable storage media having stored thereon executable instructions that, as a result of being executed by one or more processors of a computer system, cause the computer system to:
generate, by a redundancy code, a plurality of shards from received data, the plurality of shards to be stored via a plurality of hosts, each host of the plurality of hosts having a plurality of data storage devices, the plurality of shards having a quorum quantity of shards that is less than a quantity of shards in the plurality of shards, the quorum quantity of shards being sufficient, via application of the redundancy code, to regenerate any other shard of the plurality;
select, by a selector implemented by the computer system, for each host of the plurality of hosts, a selected data storage device of the plurality of data storage devices of the host for storage of a subset of the plurality of shards allocated to the host, so as to, in response to a storage failure event capable of occurring to a plural subset of the plurality of data storage devices, increase a first probability that at least a quorum of shards stored across the plurality of shards remains available relative to a second probability of availability of at least a quorum of shards prior to causing the computer system to select the selected data storage device; and
store, on the plurality of hosts, the plurality of shards via the hosts on a plurality of respective selected data storage devices.
Dependent claims: 14, 15, 16, 17, 18, 19, 20.
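The probability comparison in claim 13 can be made concrete with a simplified model that is an assumption of this sketch, not part of the claim. Suppose each of `n` shards is lost independently with probability `p_loss`, and a quorum of `k` shards is needed. The hypothetical function `quorum_availability` computes the chance that at least `k` shards survive. The fully correlated case, in which one event destroys every shard at once, has availability `1 - p_loss`.

```python
from math import comb


def quorum_availability(n: int, k: int, p_loss: float) -> float:
    """Probability that at least k of n shards survive.

    Assumes each shard is lost independently with probability p_loss
    (an idealized, fully decorrelated failure model).
    """
    p_ok = 1.0 - p_loss
    # Binomial tail: sum over every survivor count i that meets the quorum.
    return sum(
        comb(n, i) * p_ok**i * p_loss ** (n - i)
        for i in range(k, n + 1)
    )
```

With three shards, a quorum of two, and a 10% loss probability, the independent model gives higher availability than the 90% of the fully correlated case. Under this toy model, that is the direction of improvement the claim describes.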
Specification