Method and apparatus for efficient fault-tolerant disk drive replacement in RAID storage systems
Abstract
An apparatus and a method for improving the fault tolerance of storage systems by replacing disk drives that are about to fail are disclosed. The disk drives in a storage system are monitored to identify failing disk drives. A processing unit identifies a failing disk drive and selects a spare disk drive to replace it. The selected spare disk drive is powered on, and data from the failing disk drive is copied to it. A memory unit stores attributes and sensor data for the disk drives in the storage system, and the processing unit uses these to identify a failing disk drive. Attributes for disk drives are obtained by using SMART (Self-Monitoring, Analysis and Reporting Technology), and sensor data is obtained from environmental sensors such as temperature and vibration sensors.
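As an illustration of how stored attributes and sensor data might be used to flag a failing drive, the following is a minimal sketch. The abstract does not state which SMART attributes are used or what limits apply, so the `DriveHealth` record, the choice of reallocated sector count, and all three threshold values are assumptions made only for this example.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical limits chosen for illustration; the abstract gives no values.
MAX_REALLOCATED_SECTORS = 50
MAX_TEMPERATURE_C = 55.0
MAX_VIBRATION_G = 0.5


@dataclass
class DriveHealth:
    """One record of the kind a memory unit might hold per drive (assumed layout)."""
    drive_id: str
    reallocated_sectors: int  # a SMART attribute
    temperature_c: float      # reading from a temperature sensor
    vibration_g: float        # reading from a vibration sensor


def is_failing(record: DriveHealth) -> bool:
    """Flag a drive when any monitored value exceeds its assumed limit."""
    return (
        record.reallocated_sectors > MAX_REALLOCATED_SECTORS
        or record.temperature_c > MAX_TEMPERATURE_C
        or record.vibration_g > MAX_VIBRATION_G
    )


def find_failing(records: List[DriveHealth]) -> List[str]:
    """Return the ids of all drives flagged as failing, in input order."""
    return [r.drive_id for r in records if is_failing(r)]
```

A simple threshold test is used here only because it is easy to follow; the abstract does not say how the processing unit combines attributes and sensor data.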
30 Claims
1. An apparatus for improving fault tolerance of a storage system, the apparatus comprising:
a. a first set of disk drives;
b. a second set of disk drives, the second set of disk drives in power-off condition;
c. a processing unit, the processing unit comprising:
i. a drive replacement logic unit, the drive replacement logic unit identifying a failing disk drive from the first set of disk drives;
ii. a drive control unit, the drive control unit receiving an indication from the drive replacement logic unit to replace the failing disk drive with a spare disk drive from the second set of disk drives, the drive control unit powering-on the spare disk drive to replace the failing disk drive; and
d. a memory unit, the memory unit storing drive health status data and information for the first set of disk drives.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A processing unit for improving fault tolerance of a storage system, the storage system comprising a first set of disk drives storing data and a second set of disk drives, the processing unit comprising:
a. a drive replacement logic unit, the drive replacement logic unit identifying a failing disk drive from the first set of disk drives; and
b. a drive control unit, the drive control unit receiving an indication from the drive replacement logic unit to replace the failing disk drive with a spare disk drive from the second set of disk drives, the drive control unit powering-on the spare disk drive to replace the failing disk drive.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
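The division of labor between the two units of claim 12 can be sketched as two small classes that communicate through an indication object. This is only an illustrative model: the class names mirror the claim language, but `ReplaceIndication`, the `identify` and `handle` methods, the boolean health map, and the first-available spare selection are all assumptions, not details taken from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ReplaceIndication:
    """Hypothetical message passed from the logic unit to the control unit."""
    failing_id: str


class DriveReplacementLogicUnit:
    def __init__(self, failing_flags: Dict[str, bool]):
        # Maps drive id -> True when the drive is judged to be failing.
        # How that judgment is made is outside this sketch.
        self.failing_flags = failing_flags

    def identify(self) -> Optional[ReplaceIndication]:
        """Return an indication for the first failing drive, or None."""
        for drive_id, failing in self.failing_flags.items():
            if failing:
                return ReplaceIndication(drive_id)
        return None


class DriveControlUnit:
    def __init__(self, spare_ids: List[str]):
        self.powered_off: List[str] = list(spare_ids)  # second set, initially off
        self.powered_on: List[str] = []
        self.replacements: Dict[str, str] = {}         # failing id -> spare id

    def handle(self, indication: ReplaceIndication) -> str:
        """Power on the next available spare and record what it replaces."""
        if not self.powered_off:
            raise RuntimeError("no spare drive available")
        spare_id = self.powered_off.pop(0)
        self.powered_on.append(spare_id)
        self.replacements[indication.failing_id] = spare_id
        return spare_id
```

Keeping identification and power control in separate classes follows the structure of the claim, where the control unit acts only after it receives an indication from the logic unit.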
24. A method for improving fault tolerance of a storage system, the storage system comprising a first set of disk drives and a second set of disk drives in power-off condition, the method comprising the steps of:
a. monitoring the first set of disk drives to identify a failing disk drive from the first set of disk drives;
b. powering-on a spare disk drive from the second set of disk drives on receipt of signal to replace the failing disk drive from the first set of disk drives; and
c. copying data from the failing disk drive from the first set of disk drives to the spare disk drive from the second set of disk drives.
Dependent claims: 23, 25, 26, 27, 28, 29, 30
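The three steps of claim 24 (monitor, power on, copy) can be walked through with an in-memory model. This is a sketch under stated assumptions rather than the patented implementation: the `Drive` class, its `failing` flag (standing in for whatever monitoring produces), the block dictionary, and the choice of the first powered-off spare are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Drive:
    drive_id: str
    powered_on: bool
    failing: bool = False  # stand-in for the outcome of health monitoring
    blocks: Dict[int, bytes] = field(default_factory=dict)


def monitor(first_set: List[Drive]) -> Optional[Drive]:
    """Step a: return the first drive flagged as failing, or None."""
    return next((d for d in first_set if d.failing), None)


def power_on(spare: Drive) -> None:
    """Step b: bring a powered-off spare online."""
    spare.powered_on = True


def copy_data(source: Drive, target: Drive) -> None:
    """Step c: copy every block from the failing drive to the spare."""
    target.blocks = dict(source.blocks)


def replace_failing_drive(
    first_set: List[Drive], second_set: List[Drive]
) -> Optional[Drive]:
    """Run steps a to c once; return the spare that was used, if any."""
    failing = monitor(first_set)
    if failing is None:
        return None
    spare = next((d for d in second_set if not d.powered_on), None)
    if spare is None:
        raise RuntimeError("no powered-off spare drive available")
    power_on(spare)
    copy_data(failing, spare)
    return spare
```

For example, calling `replace_failing_drive` with one failing drive in the first set and one powered-off drive in the second set returns that spare, now powered on and holding a copy of the failing drive's blocks.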
Specification