Partial disk failures and improved storage resiliency
Abstract
A mass data storage system including a hard disk drive comprising heads and platter surfaces determines when a head of the disk is faulty, and the disk continues to operate as a partially failed disk with respect to the remaining heads, which are not faulty. A striped parity disk array comprising disks capable of operating as partially failed disks allows data to be copied from the platter surfaces not associated with a faulty head of a partially failed disk to a spare disk. This reduces the amount of data that must be rebuilt in the rebuild process, thereby reducing the time the array spends in degraded mode, exposed to a total loss of data caused by a subsequent disk failure.
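As a rough illustration of the saving described in the abstract, the sketch below estimates what share of a partially failed disk still has to be rebuilt rather than copied. It assumes, purely for illustration, that every platter surface holds the same amount of data; the function name `rebuild_fraction` is hypothetical and does not come from the patent.

```python
from fractions import Fraction


def rebuild_fraction(total_heads: int, faulty_heads: int = 1) -> Fraction:
    """Share of a disk's data that must be rebuilt instead of copied.

    Assumption (not stated in the source): each platter surface stores
    the same amount of data, so the share equals faulty / total heads.
    """
    if total_heads <= 0 or not 0 <= faulty_heads <= total_heads:
        raise ValueError("need 0 <= faulty_heads <= total_heads and total_heads > 0")
    return Fraction(faulty_heads, total_heads)


if __name__ == "__main__":
    # A disk with 8 heads and one faulty head: 1/8 rebuilt, 7/8 copied.
    print(rebuild_fraction(8))
```

Under that assumption, a disk with eight heads and one faulty head needs only one eighth of its data rebuilt; the remaining seven eighths can be copied directly to the spare.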
18 Claims
1. A mass data storage system comprising:
a plurality of member disks and a spare disk which are organized into an array, each member disk and the spare disk having a plurality of heads and a plurality of platter surfaces respectively serviced by each head during read and write operations;
a disk controller of each disk which controls the heads and platter surfaces during the read and write operations and which recognizes errors arising from each head performing read and write operations on the platter surface serviced by each head, the disk controller designating any one head as faulty whenever read and write errors associated with that one head meet a predetermined threshold; and
an array controller which communicates with each disk controller and controls the operation of each disk in the array during mass data storage operations, the array controller controlling the disk controllers of the spare disk and the member disk having the designated faulty head to (a) read data from the surfaces of the member disk serviced by non-faulty heads and write that data onto surfaces of the spare disk, (b) rebuild data on the spare disk written on the surface of the member disk serviced by the faulty head, and (c) thereafter perform subsequent mass data storage read and write operations on the spare disk which would otherwise be addressed to the member disk having the faulty head.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
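The disk controller behaviour recited in claim 1 (attributing errors to a head and designating the head as faulty once a predetermined threshold is met) can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `HeadErrorTracker` is hypothetical, and treating the threshold as a simple cumulative error count is an assumption, since the claim does not say how the threshold is measured.

```python
from collections import Counter


class HeadErrorTracker:
    """Hypothetical per-disk tracker of read/write errors by head."""

    def __init__(self, num_heads: int, threshold: int) -> None:
        if num_heads <= 0 or threshold <= 0:
            raise ValueError("num_heads and threshold must be positive")
        self.num_heads = num_heads
        self.threshold = threshold  # assumed to be a cumulative error count
        self.errors = Counter()     # head index -> errors seen so far
        self.faulty = set()         # heads already designated faulty

    def record_error(self, head: int) -> bool:
        """Record one error for `head`; return True if the head is now faulty."""
        if not 0 <= head < self.num_heads:
            raise ValueError(f"no such head: {head}")
        self.errors[head] += 1
        if self.errors[head] >= self.threshold:
            self.faulty.add(head)
        return head in self.faulty

    def non_faulty_heads(self) -> list:
        """Heads whose surfaces can still be read and copied to the spare."""
        return [h for h in range(self.num_heads) if h not in self.faulty]


if __name__ == "__main__":
    tracker = HeadErrorTracker(num_heads=4, threshold=3)
    for _ in range(3):
        tracker.record_error(2)
    print(tracker.non_faulty_heads())
```

In this sketch the disk keeps serving the heads returned by `non_faulty_heads()`, which corresponds to the partially failed mode of operation described in the abstract.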
12. A method of reducing the risk of data becoming unrecoverable in a mass data storage system including a plurality of disks each having a plurality of heads and a plurality of platter surfaces respectively serviced by the heads during read and write operations, at least one of the plurality of disks storing data and at least one of the plurality of disks being a spare disk, the method further comprising:
writing data to platter surfaces with the heads in write operations;
reading data from platter surfaces with the heads in read operations;
detecting errors during read operations;
associating each error with one of the heads from which the error arose;
designating one of the heads as faulty whenever the errors associated with that head meet a predetermined threshold;
continuing to perform read and write operations with the non-faulty heads of the disks;
copying data from the platter surfaces serviced by the non-faulty heads of the disk having the faulty head to the spare disk; and
restoring the data that was on the platter surface serviced by the faulty head onto the spare disk without copying data from the platter surface serviced by the faulty head.
Dependent claims: 13, 14, 15
16. A method for copying data from a first hard disk having one of a plurality of heads designated as faulty to a second hard disk, each of the plurality of heads of the first hard disk associated with a different set of physical blocks which store data, the first hard disk associating each of a plurality of logical block addresses with a different physical block, the method using a host computer connected to the first hard disk and the second hard disk, the host computer issuing write and read commands pertaining to specific logical block addresses to the first hard disk which result in the first hard disk respectively writing data to and reading data from the physical blocks associated with the specific logical block addresses, and wherein the host computer:
copies data from the first hard disk to the second hard disk by issuing to the first hard disk read commands pertaining to logical block addresses associated with non-faulty heads of the first hard disk, the first hard disk supplying the data to the host computer in response to the read commands, the host computer issuing write commands to store the data read from the first hard disk on the second hard disk; and
restores data that was previously stored on the physical blocks serviced by the faulty head to the second hard disk without copying the data from the first hard disk.
Dependent claims: 17, 18
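The host-driven procedure of claim 16 can be sketched as a loop over logical block addresses: blocks served by non-faulty heads are read from the first disk and written to the second, and the remaining blocks are restored without touching the first disk. The sketch rests on several assumptions that are not in the claim: disks are modelled as dictionaries from LBA to bytes, the mapping `head_of_lba` is supplied by the caller, and restoration uses XOR over the same LBA on the other array members (single-parity striping, as suggested by the abstract). All names are hypothetical.

```python
from functools import reduce
from typing import Callable, Dict, List, Set


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def copy_and_restore(
    source: Dict[int, bytes],       # first disk: LBA -> data (readable LBAs only)
    peers: List[Dict[int, bytes]],  # other array members, parity disk included
    head_of_lba: Callable[[int], int],  # assumed LBA-to-head mapping
    faulty_heads: Set[int],
    num_lbas: int,
) -> Dict[int, bytes]:
    """Return the contents to write to the second (spare) disk."""
    target: Dict[int, bytes] = {}
    for lba in range(num_lbas):
        if head_of_lba(lba) in faulty_heads:
            # Restore from the peers; the first disk is never read here.
            target[lba] = xor_blocks([peer[lba] for peer in peers])
        else:
            # Plain copy: read from the first disk, write to the second.
            target[lba] = source[lba]
    return target


if __name__ == "__main__":
    d0 = {i: bytes([i, i + 1]) for i in range(4)}
    d1 = {i: bytes([10 + i, 20 + i]) for i in range(4)}
    parity = {i: xor_blocks([d0[i], d1[i]]) for i in range(4)}
    readable = {i: d1[i] for i in range(4) if i % 2 == 0}
    spare = copy_and_restore(readable, [d0, parity], lambda lba: lba % 2, {1}, 4)
    print(spare == d1)
```

Passing only the readable LBAs as `source` makes the intent explicit: any attempt to copy a block served by the faulty head would raise a `KeyError` rather than silently read suspect data.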
Specification