Data recovery from multiple failed data blocks and storage units
First Claim
1. A storage system with data recovery from M failed blocks per stripe or J failed storage units comprising N, N>
- 0 data blocks stored on N storage units and a first error correction code that generates M, M>
0 redundant blocks from the N data blocks where the N data blocks and M redundant blocks form a stripe, wherein the M redundant blocks are stored on J, J less than or equal to M but at least one additional storage units such that K failed data blocks, K less than or equal to M blocks are regenerated from the remaining N+M−
K blocks of the stripe;
or P storage units fail, P less than or equal to J containing data blocks wherein the data blocks in the failed storage units are regenerated from the remaining data blocks and remaining redundant blocks of the stripe.
0 Assignments
0 Petitions
Accused Products
Abstract
In the past, storage unit (disk drive) failures were the primary cause of data loss in a storage system. With higher unit reliability and higher bit density, random bit errors have become the primary cause of data loss. Most data recovery mechanisms treat reconstruction of redundant information on the same level as data reconstruction. In reality, data reconstruction is more important and asymmetry between data protection and redundant information protection provides trade-offs of data recoverability against performance. The present invention provides data recovery from both a first number of data block failures due to random bit failures and a second number of storage unit failures while providing update write performance equivalent to data protection mechanisms with lower data recovery capabilities. The level protection from number of data block failures, the number of unit failures, and update write performance are parameterized to select a desired combination.
-
Citations
18 Claims
-
1. A storage system with data recovery from M failed blocks per stripe or J failed storage units comprising N, N>
- 0 data blocks stored on N storage units and a first error correction code that generates M, M>
0 redundant blocks from the N data blocks where the N data blocks and M redundant blocks form a stripe, wherein the M redundant blocks are stored on J, J less than or equal to M but at least one additional storage units such that K failed data blocks, K less than or equal to M blocks are regenerated from the remaining N+M−
K blocks of the stripe;
or P storage units fail, P less than or equal to J containing data blocks wherein the data blocks in the failed storage units are regenerated from the remaining data blocks and remaining redundant blocks of the stripe. - View Dependent Claims (2, 3, 4, 5, 6, 7)
- 0 data blocks stored on N storage units and a first error correction code that generates M, M>
-
8. A storage system with data recovery from L failed blocks per stripe comprising N, N>
- 0 data blocks stored on H, H greater than or equal to N+1 storage units, one data block per storage unit, and a first error correction code that generates M, M>
0 redundant blocks from the N data wherein;the N data blocks and the M redundant blocks form a stripe such that K failed data blocks K less than or equal to M blocks are regenerated from the remaining N+M−
K blocks of the stripe; andL, L less than or equal to M redundant blocks are stored on the storage unit with the most recent data block update; such that when one or more of the M redundant blocks are not accessible, T, T less than or equal to L data blocks are regenerated from the remaining N+L−
T blocks of the stripe. - View Dependent Claims (9, 10, 11, 12, 13)
- 0 data blocks stored on H, H greater than or equal to N+1 storage units, one data block per storage unit, and a first error correction code that generates M, M>
-
14. A storage system with data recovery from R failed blocks per second stripe within a storage unit and J failed storage units or M failed blocks per first stripe across storage units and R failed blocks per second stripe within a storage unit comprising:
-
N, N>
0 data blocks stored on N storage units; anda first error correction code that generates M, M>
0 redundant blocks from the N data blocks wherein the N data blocks and M redundant blocks form a first stripe across storage units such that K failed data blocks, K less than or equal to M are regenerated from the remaining N+M−
K blocks of the first stripe; andthe M redundant blocks are stored on J, J less than or equal to M but at least one additional storage units; and S, S>
0 blocks stored on a storage unit including one block from the first stripe and a second error correction code that generates R, R>
0 blocks from the S data blocks wherein the S blocks and R redundant blocks form a second stripe stored within the storage unit such that V failed data blocks, V less than or equal to R blocks are regenerated from the remaining S+R−
V blocks of the second stripe;such that the storage system can regenerate R blocks of failed data blocks for an S data blocks stripe within a storage unit and regenerate either data blocks for J failed storage units or M failed data blocks per N data block stripe across the storage units. - View Dependent Claims (15, 16, 17, 18)
-
Specification