Precomputed redundancy code matrices for high-availability data storage
First Claim
Patent Images
1. A computer-implemented method, comprising:
- under the control of one or more computer systems configured with executable instructions,determining, based on parameters of a redundancy code used to encode original data of an archive into a plurality of shards, at least one regeneration set, the regeneration set consisting of a subset of the plurality of shards and having a number of members equal to or greater than a minimum quorum of the plurality of shards necessary to regenerate the original data;
computing one or more matrices specific to the at least one regeneration set and corresponding to the redundancy code, the one or more matrices capable of being used with the corresponding subset of the plurality of shards to regenerate the original data;
storing the computed matrices;
at a time after detecting that one or more shards of the plurality of shards is unavailable, retrieving a matrix of the stored matrices that corresponds to a regeneration set that does not include the unavailable shards as members;
regenerating the original data using the retrieved matrix and the corresponding regeneration set; and
regenerating, using the redundancy code, a regenerated shard to replace the unavailable shard from the regenerated original data.
1 Assignment
0 Petitions
Accused Products
Abstract
Techniques described and suggested herein include systems and methods for precomputing regeneration information for data archives (“archives”) that have been processed and stored using redundancy coding techniques. For example, regeneration information, such as redundancy code-related matrices (such as inverted matrices based on, e.g., a generator matrix for the selected redundancy code) corresponding to subsets of the shards, is computed for each subset and, in some embodiments, stored for use in the event that one or more shards becomes unavailable, e.g., so as to more efficiently and/or quickly regenerate a replacement shard.
58 Citations
20 Claims
-
1. A computer-implemented method, comprising:
under the control of one or more computer systems configured with executable instructions, determining, based on parameters of a redundancy code used to encode original data of an archive into a plurality of shards, at least one regeneration set, the regeneration set consisting of a subset of the plurality of shards and having a number of members equal to or greater than a minimum quorum of the plurality of shards necessary to regenerate the original data; computing one or more matrices specific to the at least one regeneration set and corresponding to the redundancy code, the one or more matrices capable of being used with the corresponding subset of the plurality of shards to regenerate the original data; storing the computed matrices; at a time after detecting that one or more shards of the plurality of shards is unavailable, retrieving a matrix of the stored matrices that corresponds to a regeneration set that does not include the unavailable shards as members; regenerating the original data using the retrieved matrix and the corresponding regeneration set; and regenerating, using the redundancy code, a regenerated shard to replace the unavailable shard from the regenerated original data. - View Dependent Claims (2, 3, 4, 5)
-
6. A system, comprising:
at least one computing device configured to implement one or more services, wherein the one or more services are configured to; determine one or more regeneration sets from a plurality of shards, the plurality of shards having been generated by applying a redundancy code to original data of an archive, the regeneration sets including a subset of the plurality of shards capable of regenerating the original data; compute information from the redundancy code and corresponding to the regeneration sets, the computed information being usable to regenerate a replacement shard for at least a subset of the shards in the plurality; and at a time after detecting that one or more shards of the plurality of shards is unavailable, use the computed information to regenerate data associated with the unavailable shards. - View Dependent Claims (7, 8, 9, 10, 11, 12)
-
13. A non-transitory computer-readable storage medium having stored thereon executable instructions that, when executed by one or more processors of a computer system, cause the computer system to at least:
-
determine one or more regeneration sets from a plurality of redundancy coded shards stored for an archive, each of the regeneration sets having a number of member shards equal to a quorum necessary to regenerate the archive; compute information for each of the regeneration sets that is usable by the computer system to regenerate any of the plurality of redundancy coded shards using the member shards of a respective regeneration set; and at a time after detecting that one or more shards of the plurality of shards is unavailable, use at least a subset of the computed information to generate data to replace the unavailable shards. - View Dependent Claims (14, 15, 16, 17, 18, 19, 20)
-
Specification