DEFERRED, BULK MAINTENANCE IN A DISTRIBUTED STORAGE SYSTEM
Abstract
Failed capacity of a distributed storage system is determined. The distributed storage system includes a plurality of storage nodes, wherein the plurality of storage nodes include at least one storage device to store data objects, wherein the data objects have been divided into constituent fragments in the distributed storage system. Protection capacity of the distributed storage system is determined. Protection capacity includes the data fragments generated to allow the data objects to be rebuilt in response to at least a part of the data objects being either lost or corrupted. A probability is determined that the failed capacity overlaps with the used capacity of the distributed storage system prior to a next periodically scheduled maintenance of the distributed storage system. In response to the probability exceeding a risk threshold, a next maintenance of the distributed storage system is scheduled that comprises reducing the failed capacity.
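The decision rule summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method the patent specifies: the source does not say how the probability is computed, so the sketch assumes independent device failures at a constant rate (a Poisson model) and treats "overlap" as the failed capacity growing past a spare allowance before the next scheduled maintenance. The names `ClusterState`, `spare_tb`, `daily_failure_rate`, `overlap_probability`, and `should_schedule_early` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusterState:
    # All fields are illustrative assumptions, not terms defined by the source.
    healthy_devices: int       # devices still in service
    device_tb: float           # capacity of one storage device, in TB
    failed_tb: float           # capacity already lost to failed devices
    spare_tb: float            # capacity that may fail before protection capacity is affected
    daily_failure_rate: float  # assumed per-device probability of failing on a given day


def poisson_tail(k: int, mean: float) -> float:
    """Return P(X >= k) for X ~ Poisson(mean)."""
    if k <= 0:
        return 1.0
    below = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - below)


def overlap_probability(state: ClusterState, days_until_maintenance: float) -> float:
    """Estimate the chance that failed capacity exceeds the spare allowance
    before the next periodically scheduled maintenance (Poisson assumption)."""
    remaining_tb = state.spare_tb - state.failed_tb
    if remaining_tb < 0:
        return 1.0  # failed capacity already exceeds the allowance
    # Number of further device failures needed to push failed capacity past the allowance.
    failures_needed = math.floor(remaining_tb / state.device_tb) + 1
    expected_failures = (
        state.healthy_devices * state.daily_failure_rate * days_until_maintenance
    )
    return poisson_tail(failures_needed, expected_failures)


def should_schedule_early(
    state: ClusterState, days_until_maintenance: float, risk_threshold: float
) -> bool:
    """Apply the rule: schedule maintenance when the probability exceeds the threshold."""
    return overlap_probability(state, days_until_maintenance) > risk_threshold
```

With 1000 healthy 10 TB devices, 40 TB already failed, a 60 TB allowance, and an assumed daily failure rate of 0.0001, the estimated probability over a 30-day window is about 0.58, so a 1% risk threshold would trigger maintenance that reduces the failed capacity. With only one day left until scheduled maintenance, the same cluster stays well under that threshold.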
20 Claims
1. A method comprising:
determining a failed capacity of a distributed storage system, wherein the distributed storage system includes a plurality of storage nodes, wherein the plurality of storage nodes include at least one storage device to store data objects, wherein the data objects are divided into data fragments in the distributed storage system;
determining a protection capacity of the distributed storage system, wherein the protection capacity comprises storage configured to store at least a portion of the data fragments generated to allow the data objects to be rebuilt in response to at least a part of the data objects being either lost or corrupted;
determining a first probability that the failed capacity overlaps with the protection capacity of the distributed storage system prior to a next periodically scheduled maintenance of the distributed storage system;
determining whether the first probability exceeds a first risk threshold; and
in response to the first probability exceeding the first risk threshold, scheduling a next maintenance of the distributed storage system that comprises reducing the failed capacity.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A non-transitory machine readable medium having stored thereon instructions for performing a method comprising machine executable code which when executed by at least one machine, causes the at least one machine to:
determine a failed capacity of the distributed storage system, wherein the distributed storage system includes a plurality of storage nodes, wherein the plurality of storage nodes include at least one storage device to store data objects, wherein the data objects are divided into data fragments in the distributed storage system;
determine a protection capacity of the distributed storage system, wherein the protection capacity comprises storage configured to store at least a portion of the data fragments generated to allow the data objects to be rebuilt in response to at least a part of the data objects being either lost or corrupted;
determine a first probability that the failed capacity overlaps with the protection capacity of the distributed storage system prior to a next periodically scheduled maintenance of the distributed storage system;
determine whether the first probability exceeds a first risk threshold; and
in response to the probability exceeding the risk threshold, schedule the next maintenance of the distributed storage system that comprises a reduction of the failed capacity.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A computing device comprising:
a processor; and
a machine readable medium comprising machine executable code having stored thereon instructions executable by the processor to cause the computing device to:
determine a used capacity of a distributed storage system, wherein the distributed storage system includes a plurality of storage nodes, wherein the plurality of storage nodes include at least one storage device to store data objects, wherein the data objects are divided into constituent fragments in the distributed storage system;
determine a failed capacity of the distributed storage system;
determine a probability that the failed capacity overlaps with the used capacity of the distributed storage system prior to a next periodically scheduled maintenance of the distributed storage system;
determine whether the probability exceeds a risk threshold; and
in response to the probability exceeding the risk threshold, schedule, prior to the next periodically scheduled maintenance, an intermittent bulk maintenance of the distributed storage system that comprises a reduction of the failed capacity.
Dependent claims: 18, 19, 20
Specification