Single quorum verification of erasure coded data
First Claim
1. A computer-implemented method, comprising:
for stored data shards of an original data element, identifying a subset of the stored data shards sufficient for reconstructing the original data element, the stored data shards collectively representing an erasure coded version of the original data element;
verifying integrity of the identified subset of the stored data shards by at least:
generating a version of the original data element from the identified subset of the stored data shards; and
verifying the version of the original data element by comparing at least one first hash value associated with the original data element with at least one second hash value associated with the version of the original data element;
reconstructing, using the version of the original data element, data shards outside of the identified subset of the stored data shards, thereby generating reconstructed data shards;
calculating one or more third hash values for the reconstructed data shards; and
verifying integrity of the stored data shards by comparing the calculated third hash value with one or more fourth hash values associated with one or more of the stored data shards outside of the identified subset of the stored data shards; and
initiating a mitigation workflow if, when verifying the integrity of the stored data shards, at least one of the stored data shards is identified as invalid.
1 Assignment
0 Petitions
Abstract
Techniques described and suggested herein include various methods and systems for verifying integrity of redundancy coded data, such as erasure coded data shards. In some embodiments, a quantity of redundancy coded data elements, hereafter referred to as data shards (e.g., erasure coded data shards), sufficient to reconstruct the original data element from which the redundancy coded data elements are derived, is used to generate reconstructed data shards to be used for checking the validity of analogous data shards stored for the original data element.
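The verification flow described above can be sketched in a few lines. The following is only an illustrative sketch, not the patented implementation: it swaps in a toy erasure code (k data shards plus one XOR parity shard, so any k of the k+1 shards form a quorum) in place of whatever code a real system would use. SHA-256 is an assumed hash choice, and the names `encode`, `decode`, and `verify_single_quorum` are hypothetical. The "fourth hash value" is assumed here to be computed from the stored shard bytes.

```python
import hashlib
from functools import reduce
from typing import Dict, List, Optional, Sequence


def sha256(b: bytes) -> str:
    return hashlib.sha256(b).hexdigest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Toy erasure code (assumption): k equal-size data shards + 1 XOR parity."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    shards = [padded[i * size:(i + 1) * size] for i in range(k)]
    shards.append(reduce(xor_bytes, shards))  # parity shard at index k
    return shards


def decode(subset: Dict[int, bytes], k: int, length: int) -> bytes:
    """Rebuild the original element from any k of the k+1 shards."""
    shards = dict(subset)
    missing = [i for i in range(k) if i not in shards]
    if missing:
        # One data shard is absent: XOR of the other k shards recovers it.
        shards[missing[0]] = reduce(xor_bytes, subset.values())
    return b"".join(shards[i] for i in range(k))[:length]


def verify_single_quorum(stored: Sequence[bytes], data_hash: str, length: int,
                         k: int, quorum: Sequence[int]) -> Optional[List[int]]:
    """Return indices of invalid shards outside the quorum, or None if the
    quorum itself fails the check against the original element's hash."""
    # Generate a version of the original element from the quorum only.
    candidate = decode({i: stored[i] for i in quorum}, k, length)
    # First hash (original) vs. second hash (reconstructed version).
    if sha256(candidate) != data_hash:
        return None
    # Reconstruct the shards outside the quorum from the verified version.
    rebuilt = encode(candidate, k)
    # Third hash (rebuilt shard) vs. fourth hash (stored shard).
    return [i for i in range(k + 1)
            if i not in quorum and sha256(rebuilt[i]) != sha256(stored[i])]


if __name__ == "__main__":
    data = b"single quorum verification"
    stored = encode(data, k=2)
    # Corrupt the parity shard, which lies outside the chosen quorum.
    stored[2] = bytes([stored[2][0] ^ 0xFF]) + stored[2][1:]
    invalid = verify_single_quorum(stored, sha256(data), len(data), 2, (0, 1))
    print(invalid)  # prints [2]; a caller would start a mitigation workflow
```

Only the quorum shards are read to rebuild the element. Every remaining shard is then checked against a freshly regenerated counterpart, so one verified quorum is enough to validate the whole shard set.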
16 Citations
16 Claims
1. A computer-implemented method, comprising:
for stored data shards of an original data element, identifying a subset of the stored data shards sufficient for reconstructing the original data element, the stored data shards collectively representing an erasure coded version of the original data element;
verifying integrity of the identified subset of the stored data shards by at least:
generating a version of the original data element from the identified subset of the stored data shards; and
verifying the version of the original data element by comparing at least one first hash value associated with the original data element with at least one second hash value associated with the version of the original data element;
reconstructing, using the version of the original data element, data shards outside of the identified subset of the stored data shards, thereby generating reconstructed data shards;
calculating one or more third hash values for the reconstructed data shards; and
verifying integrity of the stored data shards by comparing the calculated third hash value with one or more fourth hash values associated with one or more of the stored data shards outside of the identified subset of the stored data shards; and
initiating a mitigation workflow if, when verifying the integrity of the stored data shards, at least one of the stored data shards is identified as invalid.
Dependent claims: 2, 3
4. A system for storing data shards, comprising:
one or more processors; and
memory storing instructions executed by the one or more processors to cause the system to:
receive at least one first hash value associated with an original data element, the original data element stored in a subset of the stored data shards, the stored data shards collectively representing an erasure coded version of the original data element, the integrity of the identified subset of the stored data shards having been verified by at least:
generating a version of the original data element from the identified subset of the stored data shards; and
verifying the version of the original data element by comparing at least one first hash value associated with the original data element with at least one second hash value associated with the version of the original data element;
reconstruct, using the version of the original data element, data shards outside of the identified subset of the stored data shards, thereby generating reconstructed data shards;
calculate one or more third hash values for the reconstructed data shards; and
verifying integrity of the stored data shards by comparing the calculated third hash value with one or more fourth hash values associated with one or more of the stored data shards outside of the identified subset of the stored data shards; and
initiate a mitigation workflow if the integrity of at least one of the stored data shards is identified as invalid.
Dependent claims: 5, 6, 7, 8, 9, 10
11. A non-transitory computer readable storage medium having stored thereon executable instructions that, when executed by one or more processors of a computer system, cause the computer system to at least:
receive at least one first hash value associated with an original data element, the original data element stored in a subset of the stored data shards, the stored data shards collectively representing an erasure coded version of the original data element, the integrity of the identified subset of the stored data shards having been verified by at least:
generating a version of the original data element from the identified subset of the stored data shards; and
verifying the version of the original data element by comparing at least one first hash value associated with the original data element with at least one second hash value associated with the version of the original data element;
reconstruct, using the version of the original data element, data shards outside of the identified subset of the stored data shards, thereby generating reconstructed data shards;
calculate one or more third hash values for the reconstructed data shards; and
verifying integrity of the stored data shards by comparing the calculated third hash value with one or more fourth hash values associated with one or more of the stored data shards outside of the identified subset of the stored data shards; and
initiate a mitigation workflow if the integrity of at least one of the stored data shards is identified as invalid.
Dependent claims: 12, 13, 14, 15, 16
Specification