Risk based rebuild of data objects in an erasure coded storage system
Abstract
A rebuild node of a storage system can assess the risk that the storage system will be unable to provide a data object. The rebuild node(s) use information about data object fragments to determine the health of a data object, which informs the risk assessment. The rebuild node obtains object fragment information from nodes throughout the storage system. With the object fragment information, the rebuild node(s) can assess object risk based, at least in part, on the object fragments that the nodes indicate as existing. To assess object risk, the rebuild node(s) treat absent object fragments (i.e., those for which no indication was received) as lost. When too many object fragments are lost, an object cannot be rebuilt. The erasure coding technique dictates the threshold number of fragments needed to rebuild an object. The per-object risk assessment influences the rebuild of the objects.
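The health check described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the specification: the names `ErasureScheme`, `assess_object`, and the shape of `node_reports` are hypothetical, and the sketch assumes each storage node reports the set of fragment identifiers it holds. Fragments that no node reports are treated as lost, and the object counts as rebuildable only while the available count is at least the minimum set by the erasure code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErasureScheme:
    """Hypothetical description of an erasure code (not from the patent text)."""
    total_fragments: int     # fragments written per object
    required_fragments: int  # minimum fragments needed to rebuild the object


def assess_object(scheme, expected_ids, node_reports):
    """Summarize object health from per-node fragment reports.

    node_reports maps a node name to the fragment ids that node indicated
    as available. A node that never answered is simply missing from the
    mapping, so its fragments end up counted as lost.
    """
    expected = set(expected_ids)
    reported = set()
    for fragment_ids in node_reports.values():
        reported |= set(fragment_ids)

    available = reported & expected   # ignore ids that do not belong to the object
    lost = expected - available       # absent fragments are treated as lost
    return {
        "available": len(available),
        "lost": len(lost),
        "rebuildable": len(available) >= scheme.required_fragments,
        "margin": len(available) - scheme.required_fragments,
    }


# Example: 9 fragments per object, any 6 suffice; the node holding 7 and 8 is silent.
scheme = ErasureScheme(total_fragments=9, required_fragments=6)
reports = {"node-a": {0, 1, 2}, "node-b": {3, 4}, "node-c": {5, 6}}
health = assess_object(scheme, range(9), reports)
```

In this example the object is still rebuildable, but only one further fragment loss separates it from the minimum, which is the kind of signal a risk assessment would act on.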
20 Claims
1. A computer implemented method comprising:
transmitting requests to a plurality of storage nodes of a storage system to obtain from the storage nodes indications of availability of constituent fragments of a first data object, wherein the first data object is divided into the constituent fragments stored in the storage system according to an erasure coding technique;
quantifying a first risk of losing capability to rebuild the first data object based on the indications of the availability of the constituent fragments of the first data object for rebuilding the first data object according to the erasure coding technique; and
in response to the quantified first risk exceeding a first threshold, prioritizing rebuild of the first data object by the storage system according to the quantified first risk.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. One or more non-transitory machine-readable media including program code for data object rebuild according to an erasure code of a storage system, the program code configured to:
in response to receipt of indications of availability of data object fragments for data object rebuild from a plurality of storage nodes of the storage system, quantify risks of losing a capability to rebuild each of one or more of data objects corresponding to the data object fragments based on a count of data object fragments indicated as available for rebuild of a corresponding data object relative to a minimum number of data object fragments according to the erasure code;
prioritize rebuild of the data objects based on the quantified risks of the data objects; and
initiate rebuild, within a background execution space, of the data objects according to the prioritization.
Dependent claims: 14, 15, 16
17. An apparatus comprising:
a hardware processor; and
a machine-readable medium including program code executable by the processor to cause the apparatus to,
obtain, from a plurality of storage nodes of a storage system, indications of availability of constituent fragments for data objects, wherein each of the data objects is divided into constituent fragments stored in the storage system according to an erasure coding technique;
for each of the data objects, quantify risk of losing capability to rebuild the respective data object based on the indications of availability of the constituent fragments for the respective data object;
compute a risk value as a function of a count of available constituent fragments and a minimum number of constituent fragments to rebuild each data object according to the erasure coding technique; and
request rebuild of those of the data objects with a corresponding risk value that exceeds a threshold such that the rebuild of the data objects is prioritized according to the corresponding risk.
Dependent claims: 18, 19, 20
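As an illustration of the risk value and threshold recited in the claims, the following Python sketch shows one possible scoring function and ordering. The claims do not fix a formula, so everything here is an assumption: `risk_value` is a hypothetical score equal to the fraction of spare fragments already lost, and `rebuild_order` is a hypothetical helper that keeps only objects whose score exceeds a threshold and sorts them with the highest risk first.

```python
def risk_value(available, required, total):
    """Hypothetical risk score in [0, 1].

    0.0 means every fragment is available; 1.0 means no spare fragments
    remain (or the object is already below the rebuild minimum).
    """
    spare = total - required
    if spare <= 0:
        raise ValueError("the erasure code must provide at least one spare fragment")
    return min(1.0, (total - available) / spare)


def rebuild_order(objects, threshold):
    """Return names of objects whose risk exceeds the threshold, riskiest first.

    objects maps an object name to (available, required, total) fragment counts.
    Ties are broken by name so the ordering is deterministic.
    """
    scored = [(risk_value(*counts), name) for name, counts in objects.items()]
    at_risk = [(risk, name) for risk, name in scored if risk > threshold]
    at_risk.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in at_risk]


# Example with a code that stores 9 fragments and needs any 6 of them.
objects = {
    "obj-a": (9, 6, 9),  # nothing lost
    "obj-b": (7, 6, 9),  # two of three spares lost
    "obj-c": (6, 6, 9),  # no spares left
    "obj-d": (8, 6, 9),  # one spare lost
}
order = rebuild_order(objects, threshold=0.5)
```

With a threshold of 0.5, only `obj-c` and `obj-b` qualify, and `obj-c` is requested first because it has no remaining margin. A real system would pick its own function and threshold; the sketch only shows how a count of available fragments and a minimum fragment count can drive prioritization.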
Specification