PRECOMPUTED REDUNDANCY CODE MATRICES FOR HIGH-AVAILABILITY DATA STORAGE
First Claim
1. A computer-implemented method, comprising:
- under the control of one or more computer systems configured with executable instructions,receiving a first request to store an archive;
encoding, using a redundancy code, original data of the archive into a plurality of shards;
determining, from the plurality of shards and the redundancy code, at least one regeneration set, the regeneration set consisting of a subset of the plurality of shards and having a number of members equal to or greater than a minimum quorum of the plurality of shards sufficient to regenerate the original data;
computing one or more matrices for the regeneration set, the one or more matrices capable of being used with the corresponding subset of the plurality of shards to regenerate the original data;
storing the computed matrices and the plurality of shards;
in response to a second request, retrieving a matrix of the stored matrices that corresponds to a regeneration set associated with the second request; and
regenerating one or more shards associated with the second request using the retrieved matrix and the corresponding regeneration set.
1 Assignment
0 Petitions
Accused Products
Abstract
Techniques described and suggested herein include systems and methods for precomputing regeneration information for data archives (“archives”) that have been processed and stored using redundancy coding techniques. For example, regeneration information, such as redundancy code-related matrices (such as inverted matrices based on, e.g., a generator matrix for the selected redundancy code) corresponding to subsets of the shards, is computed for each subset and, in some embodiments, stored for use in the event that one or more shards becomes unavailable, e.g., so as to more efficiently and/or quickly regenerate a replacement shard.
-
Citations
20 Claims
-
1. A computer-implemented method, comprising:
under the control of one or more computer systems configured with executable instructions, receiving a first request to store an archive; encoding, using a redundancy code, original data of the archive into a plurality of shards; determining, from the plurality of shards and the redundancy code, at least one regeneration set, the regeneration set consisting of a subset of the plurality of shards and having a number of members equal to or greater than a minimum quorum of the plurality of shards sufficient to regenerate the original data; computing one or more matrices for the regeneration set, the one or more matrices capable of being used with the corresponding subset of the plurality of shards to regenerate the original data; storing the computed matrices and the plurality of shards; in response to a second request, retrieving a matrix of the stored matrices that corresponds to a regeneration set associated with the second request; and regenerating one or more shards associated with the second request using the retrieved matrix and the corresponding regeneration set. - View Dependent Claims (2, 3, 4, 5)
-
6. A system, comprising:
-
one or more processors; and memory comprising one or more instructions that, as a result of being executed by the one or more processors, cause the system to at least; receive a request to store an archive; generate, using a redundancy code, a plurality of shards from original data of the archive; determine one or more regeneration sets from the plurality of shards, the regeneration sets including a subset of the plurality of shards capable of regenerating the original data; compute information from the redundancy code and corresponding to the regeneration sets, the computed information being usable to regenerate a replacement shard for at least a subset of the shards in the plurality; and in response to a second request, use the computed information to regenerate data associated with the second request. - View Dependent Claims (7, 8, 9, 10, 11, 12)
-
-
13. A non-transitory computer-readable storage medium having stored thereon executable instructions that, as a result of being executed by one or more processors of a computer system, cause the computer system to at least:
-
receive a first request associated with storing an archive; determine one or more regeneration sets from a plurality of redundancy coded shards stored for the archive, each of the regeneration sets having a number of member shards equal to a quorum sufficient to regenerate the archive; compute information for each of the regeneration sets that is usable by the computer system to regenerate any of the plurality of redundancy coded shards using the member shards of a respective regeneration set; receive a second request associated with one or more unavailable shards of the plurality of redundancy coded shards; and use at least a subset of the computed information to generate data to replace the unavailable shards. - View Dependent Claims (14, 15, 16, 17, 18, 19, 20)
-
Specification