Early RAID rebuild to improve reliability
First Claim
1. A method of minimizing rebuild times within a large-scale data storage system comprising:
maintaining a spare disk for a plurality of disks within a disk array;
monitoring the plurality of disks for occurrence of one or more pre-failure indicators;
maintaining, for each disk, a count of the occurrences of the one or more pre-failure indicators;
comparing the count for each disk to a defined threshold value;
copying a first disk and mirroring write operations to the first disk to the spare disk if the count for the first disk exceeds the threshold; and
switching from copying to the spare disk from the first disk to copying to the spare disk from a second disk if the count for the second disk exceeds the count for the first disk.
Abstract
A method of minimizing rebuild times within a large-scale data storage system, such as a RAID array, by: maintaining a spare disk for a plurality of disks within a disk array; monitoring the plurality of disks for occurrence of one or more pre-failure indicators; maintaining, for each disk, a count of the occurrences of the pre-failure indicators; comparing the count for each disk to a defined threshold value; and copying a first disk, and mirroring write operations directed to the first disk, to the spare disk if the count for the first disk exceeds the threshold. The method switches the copy source for the spare disk from the first disk to a second disk if the count for the second disk exceeds the count for the first disk. In this manner, predictive failure information allows the spare disk to be populated before a disk actually fails, reducing RAID rebuild times to near-instantaneous periods.
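The threshold comparison and source-switching rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `select_copy_source`, the dictionary of per-disk counts, and the disk labels are all hypothetical, and "exceeds" is read as a strict comparison.

```python
from typing import Dict, Optional


def select_copy_source(
    counts: Dict[str, int],
    threshold: int,
    current: Optional[str] = None,
) -> Optional[str]:
    """Return the disk that should be copied to the spare, or None.

    counts    -- hypothetical map of disk label -> pre-failure indicator count
    threshold -- the defined threshold value
    current   -- the disk currently being copied to the spare, if any
    """
    if not counts:
        return current

    # Disk with the highest indicator count (first one wins on ties).
    worst = max(counts, key=counts.get)

    if current is None:
        # Start copying only once a count strictly exceeds the threshold.
        return worst if counts[worst] > threshold else None

    # Already copying: switch only if another disk's count strictly
    # exceeds the count of the disk currently being copied.
    if counts[worst] > counts.get(current, 0):
        return worst
    return current
```

With a threshold of 2, counts of `{"d0": 3, "d1": 1}` select `d0`; if `d1` later reaches a count of 5 while `d0` stays at 3, the copy source switches to `d1`. A tie leaves the current source unchanged, which avoids restarting the copy needlessly.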
16 Claims
1. A method of minimizing rebuild times within a large-scale data storage system comprising:
maintaining a spare disk for a plurality of disks within a disk array;
monitoring the plurality of disks for occurrence of one or more pre-failure indicators;
maintaining, for each disk, a count of the occurrences of the one or more pre-failure indicators;
comparing the count for each disk to a defined threshold value;
copying a first disk and mirroring write operations to the first disk to the spare disk if the count for the first disk exceeds the threshold; and
switching from copying to the spare disk from the first disk to copying to the spare disk from a second disk if the count for the second disk exceeds the count for the first disk.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system for minimizing rebuild times within a RAID system, comprising:
a spare disk maintained for a plurality of disks within a disk array of the RAID system;
a monitor component monitoring the plurality of disks for occurrence of one or more pre-failure indicators;
a counter maintaining, for each disk, a count of the occurrences of the one or more pre-failure indicators;
a comparator comparing the count for each disk to a defined threshold value; and
a backup component copying a first disk and mirroring write operations to the first disk to the spare disk if the count for the first disk exceeds the threshold and switching from copying to the spare disk from the first disk to copying to the spare disk from a second disk if the count for the second disk exceeds the count for the first disk.
Dependent claims: 10, 11, 12, 13, 14, 15
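As an illustration of how the monitor, counter, comparator, and backup components recited in claim 9 might fit together, the following Python sketch models them as methods of a single hypothetical class, `EarlyRebuildArray`. Everything here is an assumption made for the example: disks are in-memory lists of blocks, the copy to the spare is synchronous rather than a background task, and pre-failure indicators are reported by calling `record_indicator` instead of being read from real drive telemetry.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EarlyRebuildArray:
    """Toy model of a disk array with one spare (hypothetical names)."""

    disks: Dict[str, List[bytes]]   # disk label -> list of blocks
    spare: List[bytes]              # blocks of the spare disk
    threshold: int                  # defined threshold value
    counts: Dict[str, int] = field(init=False)
    source: Optional[str] = field(default=None, init=False)

    def __post_init__(self) -> None:
        # Counter: one pre-failure indicator count per disk.
        self.counts = {name: 0 for name in self.disks}

    def record_indicator(self, disk: str) -> None:
        """Monitor + counter: note one pre-failure indicator on `disk`."""
        self.counts[disk] += 1
        self._compare(disk)

    def _compare(self, disk: str) -> None:
        """Comparator: decide whether to start or switch the copy."""
        count = self.counts[disk]
        if self.source is None:
            if count > self.threshold:
                self._start_copy(disk)
        elif disk != self.source and count > self.counts[self.source]:
            self._start_copy(disk)

    def _start_copy(self, disk: str) -> None:
        """Backup component: make the spare a copy of `disk`."""
        self.source = disk
        self.spare[:] = self.disks[disk]

    def write(self, disk: str, block: int, data: bytes) -> None:
        """Apply a write; mirror it to the spare if `disk` is the source."""
        self.disks[disk][block] = data
        if disk == self.source:
            self.spare[block] = data
```

In this sketch, once a disk's count exceeds the threshold its contents are copied to the spare and later writes to it are mirrored, so the spare already holds current data if that disk fails. If a second disk's count overtakes the first, the spare is recopied from the second disk and writes to the first are no longer mirrored.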
16. A non-transitory computer-readable medium having stored thereon a program containing executable instructions causing a processor-based computer to perform, within a disk storage system having disk arrays with at least one associated spare disk, a method comprising:
monitoring the plurality of disks for occurrence of one or more pre-failure indicators;
maintaining, for each disk, a count of the occurrences of the one or more pre-failure indicators;
comparing the count for each disk to a defined threshold value;
copying a first disk and mirroring write operations to the first disk to the spare disk if the count for the first disk exceeds the threshold; and
switching from copying to the spare disk from the first disk to copying to the spare disk from a second disk if the count for the second disk exceeds the count for the first disk.
Specification