Efficient system for predicting and processing storage subsystem failure
First Claim
1. A method for operating a storage system including a supervising processor coupled to a storage subsystem having multiple head disk assemblies ("HDA") each including an HDA controller and at least one storage medium, said method comprising the steps of:
the supervising processor receiving notice of predetermined types of data access errors occurring in the storage subsystem;
the supervising processor recording representations of the errors in an error log; and
for each HDA, the supervising processor performing steps comprising:
performing a first predictive failure analysis ("PFA") to determine whether errors associated with the HDA have a selected characteristic; and
if the errors associated with the HDA have the selected characteristic, directing the HDA to perform a second PFA to predict future failure of the at least one storage medium of the HDA.
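The two-stage flow recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation. The class names, the `run_pfa()` method, and the choice of "at least `threshold` errors within `window_s` seconds" as the selected characteristic are all assumptions made for illustration; the abstract mentions a predetermined frequency of occurrence only as one example of such a characteristic.

```python
from collections import defaultdict


class FakeHDA:
    """Hypothetical stand-in for an HDA controller (not from the patent)."""

    def __init__(self, predicts_failure):
        self.predicts_failure = predicts_failure
        self.pfa_runs = 0

    def run_pfa(self):
        # Second PFA: performed by the HDA itself, only when directed to.
        self.pfa_runs += 1
        return self.predicts_failure


class SupervisingProcessor:
    """Hypothetical supervisor that logs errors and triggers drive-level PFA."""

    def __init__(self, hdas, threshold=3, window_s=60.0):
        self.hdas = hdas              # mapping: hda_id -> object with run_pfa()
        self.threshold = threshold    # assumed: errors needed inside the window
        self.window_s = window_s      # assumed: window length in seconds
        self.error_log = defaultdict(list)  # hda_id -> error timestamps

    def record_error(self, hda_id, timestamp):
        # Record a representation of the reported error in the error log.
        self.error_log[hda_id].append(timestamp)

    def first_pfa(self, hda_id, now):
        # First PFA: do this HDA's logged errors have the selected
        # characteristic (here, an assumed error-frequency threshold)?
        recent = [t for t in self.error_log[hda_id] if now - t <= self.window_s]
        return len(recent) >= self.threshold

    def scan(self, now):
        # For each HDA, run the first PFA and, only if it trips,
        # direct the HDA to perform the second PFA.
        results = {}
        for hda_id, hda in self.hdas.items():
            if self.first_pfa(hda_id, now):
                results[hda_id] = hda.run_pfa()
        return results


hdas = {"hda0": FakeHDA(False), "hda1": FakeHDA(True)}
supervisor = SupervisingProcessor(hdas, threshold=3, window_s=60.0)
for t in (1.0, 2.0, 3.0):
    supervisor.record_error("hda1", t)
supervisor.record_error("hda0", 2.0)
predictions = supervisor.scan(now=10.0)
```

In this sketch only `hda1` crosses the assumed threshold, so only that drive spends time on its own PFA, which is the efficiency the two-stage arrangement is aiming for.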
Abstract
Predictive failure analysis of a storage subsystem is efficiently conducted and data quickly recovered from a failed Read operation. This may be implemented in a storage system including a host coupled to a supervising processor that couples to a parity-equipped RAID storage subsystem having multiple HDAs each including an HDA controller and at least one storage medium. In one embodiment, when an HDA experiences an error during a Read attempt, the HDA transmits a recovery alert signal to the supervising processor; then, the processor and HDA begin remote and local recovery processes in parallel. The first process to complete provides the data to the host, and the second process is aborted. In another embodiment, an HDA's PFA operations are restricted to idle times of the HDA. A different embodiment limits HDA performance of PFA to times when the processor is conducting data reconstruction. Another embodiment monitors HDA errors at the supervisory processor level, initiating an HDA's PFA operations when errors at that HDA have a certain characteristic, such as a predetermined frequency of occurrence.
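The parallel-recovery embodiment (local and remote recovery started together, first finisher supplies the data, the other is aborted) can be sketched as follows. This is an illustration under stated assumptions, not the patented mechanism: the function names, the use of threads, the `abort` event, and the two-data-block XOR stripe are all hypothetical choices. XOR parity is a common RAID scheme, but the text above says only that the subsystem is parity-equipped.

```python
import threading
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def xor_blocks(a, b):
    # Byte-wise XOR of two equal-length blocks.
    return bytes(x ^ y for x, y in zip(a, b))


def race_recovery(local_recover, remote_recover):
    """Run both recovery paths; return (winner_name, data) and abort the loser."""
    abort = threading.Event()
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {
            pool.submit(local_recover, abort): "local",
            pool.submit(remote_recover, abort): "remote",
        }
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        abort.set()  # tell the slower path to stop
        winner = next(iter(done))
        return futures[winner], winner.result()


# Assumed stripe: two data blocks plus their XOR parity.
d0 = b"\x01\x02"
d1 = b"\x10\x20"
parity = xor_blocks(d0, d1)


def slow_local(abort):
    # Simulated HDA-level retries; gives up as soon as abort is set.
    for _ in range(1000):
        if abort.wait(timeout=0.01):
            return None
    return d1


def fast_remote(abort):
    # Simulated supervisor-level rebuild of d1 from the surviving block and parity.
    return xor_blocks(d0, parity)


winner_name, data = race_recovery(slow_local, fast_remote)
```

The loser here checks a shared event between retries, which is one simple way to make an abort cooperative; a real controller would presumably use its own command interface for this.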
22 Claims
1. A method for operating a storage system including a supervising processor coupled to a storage subsystem having multiple head disk assemblies ("HDA") each including an HDA controller and at least one storage medium, said method comprising the steps of:
the supervising processor receiving notice of predetermined types of data access errors occurring in the storage subsystem;
the supervising processor recording representations of the errors in an error log; and
for each HDA, the supervising processor performing steps comprising:
performing a first predictive failure analysis ("PFA") to determine whether errors associated with the HDA have a selected characteristic; and
if the errors associated with the HDA have the selected characteristic, directing the HDA to perform a second PFA to predict future failure of the at least one storage medium of the HDA.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A data storage medium tangibly embodying a machine-readable program of instructions for operating a storage system, the storage system including a supervising processor coupled to a storage subsystem having multiple head disk assemblies ("HDA"), each HDA including an HDA controller and at least one storage medium, said program of instructions for causing the supervising processor to operate the storage system by performing a method comprising the steps of:
the supervising processor receiving notice of predetermined types of data access errors occurring in the storage subsystem;
the supervising processor recording representations of the errors in an error log; and
for each HDA, the supervising processor performing steps comprising:
performing a first predictive failure analysis ("PFA") to determine whether errors associated with the HDA have a selected characteristic; and
if the errors associated with the HDA have the selected characteristic, directing the HDA to perform a second PFA to predict future failure of the at least one storage medium of the HDA.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
Specification