Managing defective media in a RAID system
Abstract
Means and a method are disclosed for managing data while a RAID system is recovering from a media error. When a media error occurs, the failing storage device is identified and the areas of failure are recorded in non-volatile storage. The data recovery process then continues so that a maximum amount of data can be recovered even though more than one error has occurred. Areas of failure are recorded both in non-volatile memory on the RAID adapter card and in reserved areas of the remaining storage devices. The storage areas detected to contain media errors are identified by stripe number and stripe unit number, down to the sector number level of granularity. When the user tries to access data, these records are checked. If there is an entry in the table for a stripe being accessed, the user receives an error message. Although the user may lose a small portion of the data, the user is presented with an error message instead of incorrect data. The table can also be checked on write operations. If an entire stripe of data is written successfully and that stripe is found in the table, the entry is removed. When a physical device is moved to another controller, the table is copied from the physical device being moved to the new controller's non-volatile memory so this information is not lost.
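The table behavior described in the abstract can be illustrated with a minimal sketch. It is not taken from the specification: the class name `DefectTable`, its method names, and the use of an in-memory set in place of non-volatile storage are all assumptions made for illustration.

```python
class DefectTable:
    """Hypothetical table of stripes known to contain media errors.

    A real adapter would keep this in non-volatile storage; a plain
    in-memory set stands in for it here.
    """

    def __init__(self):
        self._bad_stripes = set()

    def record(self, stripe):
        """Note that a media error was detected in this stripe."""
        self._bad_stripes.add(stripe)

    def is_bad(self, stripe):
        return stripe in self._bad_stripes

    def check_read(self, stripe):
        """Raise an error rather than let incorrect data be returned."""
        if stripe in self._bad_stripes:
            raise IOError(f"media error recorded for stripe {stripe}")

    def full_stripe_written(self, stripe):
        """A successful full-stripe write makes the stripe valid again,
        so any entry for it is removed."""
        self._bad_stripes.discard(stripe)
```

In this sketch a read of a recorded stripe fails with an error until the whole stripe has been rewritten successfully, after which reads proceed normally.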
14 Claims
-
1. A RAID system having redundancy for use with removable and identifiable storage devices that can be attached to said RAID system at a plurality of different attachment points, said RAID system comprising:
-
means for identifying a failed storage device and removing said failed storage device from said RAID system;
means for reconstructing data stored on said failed storage device, for recording on a replacement storage device, from data and redundant data stored on the remaining storage devices;
means for detecting storage areas of said remaining storage devices that contain media defects;
means for recording in non-volatile storage, identification of areas of said remaining storage devices that contain media defects;
means for continuing said reconstructing of data stored on said failed storage device, for recording on said replacement storage device, from data and redundant data stored on said remaining storage devices.
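As an illustration of the elements of claim 1, the following sketch rebuilds a failed device stripe by stripe, logs any media defect found on a remaining device, and carries on with the next stripe instead of aborting. It assumes XOR parity as the form of redundancy, which the claim does not specify; `MediaError`, `xor_blocks`, `rebuild`, the `read_unit` callback, and the list used in place of non-volatile storage are hypothetical names chosen for the example.

```python
from functools import reduce


class MediaError(Exception):
    """Raised by read_unit when a stripe unit cannot be read."""


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def rebuild(surviving, num_stripes, read_unit, defect_log):
    """Reconstruct the failed device's stripe units where possible.

    surviving   : identifiers of the remaining devices (data and parity)
    read_unit   : callable(device, stripe) -> bytes, may raise MediaError
    defect_log  : list standing in for non-volatile storage (assumption)
    Returns {stripe: reconstructed bytes} for every recoverable stripe.
    """
    rebuilt = {}
    for stripe in range(num_stripes):
        units = []
        defective = False
        for dev in surviving:
            try:
                units.append(read_unit(dev, stripe))
            except MediaError:
                # Record where the defect is, then keep going.
                defect_log.append((dev, stripe))
                defective = True
        if not defective:
            rebuilt[stripe] = xor_blocks(units)
    return rebuilt
```

Stripes that hit a defect are left out of the result and appear in `defect_log`, so a later access to them can be answered with an error rather than with wrong data.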
means for accessing data stored on said remaining storage devices including;
means for checking records containing identification of storage areas of said remaining storage devices that contain media defects;
means for returning an error message when said means for checking finds a record of a media defect for an area of storage of said remaining storage devices where data being accessed is stored.
-
-
8. The RAID system of claim 7 wherein said record identifies a stripe where said data being accessed is stored.
-
9. The RAID system of claim 7 wherein said record identifies a stripe unit of a stripe where said data being accessed is stored.
-
10. The RAID system of claim 7 wherein said record identifies a sector where said data being accessed is stored.
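Claims 8 to 10 name three levels of granularity for a defect record: stripe, stripe unit, and sector. A minimal sketch of how a logical sector number might be broken down into those three coordinates follows. The layout is an assumption (data units numbered consecutively within a stripe, parity placement ignored), and `DefectAddress`, `locate`, and both geometry parameters are hypothetical.

```python
from typing import NamedTuple


class DefectAddress(NamedTuple):
    stripe: int       # stripe number
    stripe_unit: int  # index of the stripe unit within the stripe
    sector: int       # sector offset within the stripe unit


def locate(logical_sector, sectors_per_unit, units_per_stripe):
    """Map a logical sector number to (stripe, stripe unit, sector)."""
    sectors_per_stripe = sectors_per_unit * units_per_stripe
    stripe, offset = divmod(logical_sector, sectors_per_stripe)
    stripe_unit, sector = divmod(offset, sectors_per_unit)
    return DefectAddress(stripe, stripe_unit, sector)
```

With 128 sectors per stripe unit and 4 data units per stripe, `locate(1000, 128, 4)` gives stripe 1, stripe unit 3, sector 104. A record kept at the stripe level would match on the first field only, while a sector-level record would match on all three.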
-
11. The RAID system of claim 7 further comprising:
-
means for writing data to a portion of said remaining storage devices including;
means for checking whether said data was successfully written to said portion of said remaining storage devices;
means for checking records containing identification of storage areas of said remaining storage devices that contain media defects;
means for removing a record of a media defect for an area of storage of said remaining storage devices that is included in said portion which was successfully written.
-
-
12. The RAID system of claim 1 wherein said means for recording records said identification of areas of said remaining storage devices that contain media defects in a plurality of tables, one table containing the records of media defects for one logical device.
-
13. The RAID system of claim 1 wherein said means for recording records said identification of areas of said remaining storage devices that contain media defects in a plurality of tables, one table containing the records of media defects for one physical storage device of said remaining storage devices.
-
14. The RAID system of claim 1 wherein said means for recording records said identification of areas of said remaining storage devices that contain media defects in a non-volatile random access memory on an adapter circuit card of said RAID system, and also records said identification of areas of said remaining storage devices that contain media defects in a reserved area of each of said remaining storage devices, said system further comprising:
means for copying said records of said identification of areas of said remaining storage devices that contain media defects from a reserved area of a remaining storage device that has been moved to another RAID system to a non-volatile random access memory on an adapter circuit card of said another RAID system.
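The copy step of claim 14 can be sketched as follows, assuming the reserved area of a device is readable as a sequence of defect records and the adapter's non-volatile memory is modeled as a dictionary keyed by device identifier. The function name `import_defect_records`, the record shape, and both data structures are illustrative assumptions, not the patent's implementation.

```python
def import_defect_records(device_reserved_area, adapter_nvram, device_id):
    """Copy defect records from a moved device into the new adapter.

    device_reserved_area : iterable of records read from the device's
                           reserved area (hypothetical representation)
    adapter_nvram        : dict standing in for the adapter's NVRAM
    device_id            : identifier of the moved device
    Returns the number of records now held for that device.
    """
    held = adapter_nvram.setdefault(device_id, [])
    for record in device_reserved_area:
        # Skip records already present so a repeated import is harmless.
        if record not in held:
            held.append(record)
    return len(held)
```

Because the records travel on the device itself, the new adapter can rebuild its table after the move without any contact with the original RAID system.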
Specification