Selective automated power cycling of faulty disk in intelligent disk array enclosure for error recovery
Abstract
A disk array storage system and error recovery method wherein recovery from disk errors is achieved using automated selective power cycling. Initially, identification is made of a faulty disk drive in the array that exhibits an error condition in which the drive fails to perform a requested operation. The faulty disk drive is selectively power cycled while power to other disk drives in the array is maintained. Following the power cycling sequence, the requested operation is retried.
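The recovery flow summarized in the abstract, together with the support check recited in claim 1, can be sketched roughly as follows. This is a minimal illustration only, not the patented implementation: the `Drive` and `Array` classes, the `hung` flag, and the returned status strings are all hypothetical, and the sketch assumes that a power cycle clears the fault.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Drive:
    """Hypothetical stand-in for one disk drive in the array."""
    name: str
    powered: bool = True
    hung: bool = False    # True while the drive fails requested operations
    online: bool = True

    def perform(self, operation: str) -> bool:
        """Return True if the requested operation succeeds."""
        return self.powered and self.online and not self.hung


@dataclass
class Array:
    """Hypothetical disk array with an optional per-drive power control."""
    drives: Dict[str, Drive]
    supports_power_cycling: bool = True

    def power_cycle(self, name: str) -> None:
        """Cycle power to one drive only; the other drives are not touched."""
        drive = self.drives[name]
        drive.powered = False
        drive.hung = False    # assumption: the power cycle clears the fault
        drive.powered = True


def recover(array: Array, name: str, operation: str) -> str:
    """Identify a failing drive, then mark it offline or power cycle and retry."""
    drive = array.drives[name]
    if drive.perform(operation):
        return "ok"                      # no error condition, nothing to do
    if not array.supports_power_cycling:
        drive.online = False             # negative result: mark drive offline
        return "offline"
    array.power_cycle(name)              # positive result: selective power cycle
    return "recovered" if drive.perform(operation) else "failed"
```

In this sketch the retry happens exactly once after the power cycle; how many retries a real system would allow is not stated in the abstract.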
43 Claims
1. An error recovery method for recovering from disk errors in a disk array storage system having an array of disk drives adapted to store data, the method comprising the steps of:
identifying a faulty disk drive in said array that exhibits an error condition in which said drive fails to perform a requested operation;
determining whether automated selective disk drive power cycling is supported;
upon said determining producing a negative result, marking said faulty disk drive offline;
upon said determining producing a positive result, selectively power cycling said disk drive while maintaining power to other disk drives in said array; and
retrying said requested operation following said power cycling sequence.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 38
21. A disk array storage system having an array of disk drives adapted to store data, said system comprising:
means for identifying a faulty disk drive in said array that exhibits an error condition in which said drive fails to perform a requested operation;
means for determining whether automated selective disk drive power cycling is supported;
means responsive to said determining producing a negative result for marking said faulty disk drive offline;
means responsive to said determining producing a positive result for selectively power cycling said disk drive while maintaining power to other disk drives in said array; and
means for retrying said requested operation following said power cycling sequence.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 39, 40
41. A computer program product for performing error recovery in a disk array storage system having an array of disk drives adapted to store data, comprising:
one or more data storage media for holding program instructions;
program instruction means stored on said media for identifying a faulty disk drive in said array that exhibits an error condition in which said drive fails to perform a requested operation;
program instruction means recorded on said media for determining whether automated selective disk drive power cycling is supported;
program instruction means recorded on said media responsive to said determining producing a negative result for marking said faulty disk drive offline;
program instruction means stored on said media responsive to said determining producing a positive result for selectively power cycling said disk drive while maintaining power to other disk drives in said array; and
program instruction means stored on said media for retrying said requested operation following said power cycling sequence.
42. A host adapter adapted for communication with a host computer and managing at least one intelligent enclosure containing an enclosure processor and an array of disk drives, the processor being configured to independently control power supplied to said disk drives, said host adapter being configured to perform an error recovery procedure comprising the steps of:
attempting one or more times without success to have a disk drive in the intelligent enclosure perform a requested operation;
determining whether said disk drive is a candidate for automated selective disk drive power cycling;
querying said enclosure processor to determine whether said enclosure processor supports automated selective disk drive power cycling;
upon said determining producing a negative result, marking said faulty disk drive offline;
upon said determining producing a positive result, determining whether a cumulative power cycling threshold for said disk drive has been reached;
determining whether a power cycling threshold for said disk drive has been reached;
issuing a first read command to said enclosure processor to verify that said disk drive is in a power on state;
issuing a first write command to said enclosure processor to cycle said disk drive to a power off state while maintaining other disk drives in said enclosure in a power on state;
issuing a second read command to said enclosure processor to verify that said disk drive is in a power off state;
issuing a second write command to said enclosure processor to cycle said disk drive to a power on state;
issuing a third read command to said enclosure processor to verify that said disk drive is in a power on state; and
re-attempting to have said disk drive perform said requested operation.
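The read/write command sequence recited in claim 42 can be illustrated with a short sketch of the host adapter side. Everything named here is hypothetical: `EnclosureProcessor`, `read_power`, `write_power`, `CycleCounts`, and the threshold values are invented for illustration and do not correspond to any real enclosure services API. The claim does not say what happens when a threshold is reached; the sketch assumes the drive is then not cycled and the caller marks it offline.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EnclosureProcessor:
    """Hypothetical model of the enclosure processor's power interface."""
    power: Dict[int, bool]               # slot number -> powered on?
    supports_cycling: bool = True
    log: List[str] = field(default_factory=list)

    def read_power(self, slot: int) -> bool:
        self.log.append(f"read {slot}")
        return self.power[slot]

    def write_power(self, slot: int, on: bool) -> None:
        self.log.append(f"write {slot} {'on' if on else 'off'}")
        self.power[slot] = on            # only the addressed slot changes


@dataclass
class CycleCounts:
    recent: int = 0        # assumed meaning of the plain "power cycling threshold"
    cumulative: int = 0    # assumed meaning of the "cumulative" threshold


def power_cycle_slot(enc: EnclosureProcessor, slot: int, counts: CycleCounts,
                     max_recent: int = 3, max_cumulative: int = 10) -> bool:
    """Return True if the drive was cycled and verified on, else False."""
    if not enc.supports_cycling:
        return False                     # caller marks the drive offline
    if counts.cumulative >= max_cumulative or counts.recent >= max_recent:
        return False                     # assumption: threshold reached, give up
    if not enc.read_power(slot):         # first read: expect power on
        return False
    enc.write_power(slot, False)         # first write: power off this drive only
    if enc.read_power(slot):             # second read: expect power off
        return False
    enc.write_power(slot, True)          # second write: power back on
    if not enc.read_power(slot):         # third read: expect power on
        return False
    counts.recent += 1
    counts.cumulative += 1
    return True
```

When `power_cycle_slot` returns True, the host adapter would re-attempt the requested operation; when it returns False, this sketch leaves it to the caller to mark the drive offline.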
43. An intelligent enclosure adapted for communication with a host adapter and containing an enclosure processor controlling power to an array of disk drives, said intelligent enclosure being configured to perform an error recovery procedure comprising the steps of:
attempting one or more times without success to have a disk drive in the intelligent enclosure perform a requested operation;
responding to a query from said host adapter seeking to determine whether said enclosure processor supports automated selective disk drive power cycling;
upon said response being negative, advising said host adapter of said status so that said disk drive can be marked offline by said host adapter;
upon said response being positive, responding to a first read command from said host adapter seeking to verify that said disk drive is in a power on state;
responding to a first write command from said host adapter seeking to cycle said disk drive to a power off state while maintaining other disk drives in said enclosure in a power on state;
responding to a second read command from said host adapter seeking to verify that said disk drive is in a power off state;
responding to a second write command from said adapter seeking to cycle said disk drive to a power on state; and
responding to a third read command from said adapter seeking to verify that said disk drive is in a power on state.
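Seen from the enclosure side, claim 43 amounts to answering a capability query and then servicing read and write commands for one drive's power state. The following is a minimal sketch under stated assumptions: the request tuples (`"query_support"`, `"read_power"`, `"write_power"`) and the `handle` function are hypothetical names, not a real enclosure protocol.

```python
from typing import Dict, Tuple


def handle(power: Dict[int, bool], supports_cycling: bool, request: Tuple) -> bool:
    """Answer one host adapter request against the per-slot power table."""
    kind = request[0]
    if kind == "query_support":
        # A False reply lets the host adapter mark the drive offline.
        return supports_cycling
    if kind == "read_power":
        return power[request[1]]         # report the slot's current power state
    if kind == "write_power":
        _, slot, on = request
        power[slot] = on                 # other slots keep their power state
        return True
    raise ValueError(f"unknown request: {kind!r}")


# Usage: the command order follows the claim (read, write off, read, write on, read).
power = {0: True, 1: True}
replies = [handle(power, True, req) for req in (
    ("query_support",),
    ("read_power", 0),
    ("write_power", 0, False),
    ("read_power", 0),
    ("write_power", 0, True),
    ("read_power", 0),
)]
```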
Specification