Method for erasure coding data across a plurality of data stores in a network
First Claim
1. An apparatus, comprising:
one or more processors;
a set of data stores in which data blocks and their associated recovery blocks are stored using an erasure encoding and decoding scheme, the set of data stores interoperably coupled to the one or more processors, the one or more processors programmed with code;
the code executable in the one or more processors during erasure encoding to build and maintain a data structure in a memory, the data structure storing information associated with the data blocks and their associated recovery blocks that identify a sequence of changes to the data blocks, and the code executable in the one or more processors following a given failure event to use the information in the data structure to facilitate a recovery operation, the one or more processors programmed with the executable code to perform at least one of steps (a)-(c);
(a) storing a sequence number in a data structure having positions corresponding to the data blocks and their associated recovery blocks;
(b) as a given data block is changed, (i) incrementing the sequence number, (ii) associating the incremental sequence number with the given data block, and (iii) associating the incremental sequence number with recovery blocks in the data structure; and
(c) upon a given failure event, using the sequence numbers in the data structure to recover a data set associated with the sequence numbers in the data structure.
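Steps (a)-(c) can be illustrated with a minimal sketch. The claim does not specify a concrete layout, so everything here is an assumption made for illustration: the class name `SequenceTable`, its fields, and the rule that a recovery block is "stale" when its sequence number lags the newest data-block sequence number are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceTable:
    """Hypothetical in-memory structure with one position per data block
    and one per recovery block (names are illustrative, not from the patent)."""
    num_data: int
    num_recovery: int
    current: int = 0
    data_seq: list = field(default_factory=list)
    recovery_seq: list = field(default_factory=list)

    def __post_init__(self):
        # Step (a): store a sequence number at every position.
        self.data_seq = [self.current] * self.num_data
        self.recovery_seq = [self.current] * self.num_recovery

    def record_change(self, block_index):
        # Step (b)(i): increment the sequence number.
        self.current += 1
        # Step (b)(ii): associate it with the changed data block.
        self.data_seq[block_index] = self.current
        # Step (b)(iii): associate it with the recovery blocks.
        self.recovery_seq = [self.current] * self.num_recovery
        return self.current

    def stale_recovery_blocks(self):
        # Step (c) helper (assumed rule): a recovery block is stale if its
        # sequence number is older than the newest data-block sequence number.
        newest = max(self.data_seq)
        return [i for i, s in enumerate(self.recovery_seq) if s < newest]


table = SequenceTable(num_data=4, num_recovery=2)
table.record_change(1)
table.record_change(3)

# Simulate a failure in which recovery store 1 missed the last update.
table.recovery_seq[1] = 1
stale = table.stale_recovery_blocks()  # [1]
```

Comparing per-position sequence numbers is what lets a recovery routine decide which blocks belong to the same consistent data set without reading the block contents themselves.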
Abstract
An efficient method applies an erasure encoding and decoding scheme across dispersed data stores that receive constant updates. A data store is a persistent memory for storing a data block. Such data stores include, without limitation, a group of disks, a group of disk arrays, or the like. An encoding process applies a sequencing method to assign a sequence number to each data block and checksum block as it is modified and written to its data store. The method preferably uses the sequence number to identify data set consistency. The sequencing method allows each individual data store to self-heal, and it maintains data consistency and correctness within a data block and among a group of data blocks. The inventive technique can be applied to many forms of distributed persistent data stores to provide failure resiliency and to maintain data consistency and correctness.
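The consistency check and self-healing described in the abstract can be sketched as follows. The abstract does not name a particular erasure code, so this sketch substitutes single XOR parity as a stand-in; the `Store` class, the helper names, and the rule "the checksum block is consistent when its sequence number equals the newest data-block sequence number" are assumptions for illustration only.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class Store:
    """Hypothetical data store holding one block and its sequence number."""
    def __init__(self, payload, seq=0):
        self.payload = payload
        self.seq = seq


def write_block(data_stores, parity_store, index, payload, seq, parity_reachable=True):
    # Update the data block and tag it with the new sequence number.
    data_stores[index].payload = payload
    data_stores[index].seq = seq
    # The checksum store may miss the update, e.g. during a network failure.
    if parity_reachable:
        parity_store.payload = xor_blocks([s.payload for s in data_stores])
        parity_store.seq = seq


def is_consistent(data_stores, parity_store):
    # Assumed rule: checksum is current if it carries the newest sequence number.
    return parity_store.seq == max(s.seq for s in data_stores)


def heal_parity(data_stores, parity_store):
    # Self-heal a stale checksum block by re-encoding from the data blocks.
    parity_store.payload = xor_blocks([s.payload for s in data_stores])
    parity_store.seq = max(s.seq for s in data_stores)


data = [Store(b"\x01\x02"), Store(b"\x10\x20"), Store(b"\x00\xff")]
parity = Store(xor_blocks([s.payload for s in data]))

# The checksum store misses update number 1, leaving the set inconsistent.
write_block(data, parity, 0, b"\xaa\xbb", seq=1, parity_reachable=False)
needs_healing = not is_consistent(data, parity)
if needs_healing:
    heal_parity(data, parity)

# With a consistent checksum, a lost data block can be rebuilt from the rest.
rebuilt = xor_blocks([parity.payload, data[0].payload, data[2].payload])
```

A production system would use a stronger code than single parity (for example, one tolerating multiple failures); the point of the sketch is only that sequence numbers reveal which block is out of date before any decoding is attempted.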
Specification