STORAGE SYSTEM AND MEMORY DEVICE FAULT RECOVERY METHOD
Abstract
The present invention aims to provide a storage system capable of shortening the recovery time from a failure while ensuring the reliability of data when a failure occurs in a storage device. When a failure occurs in a storage device, recovery processing corresponding to the content of the failure is executed for the blocked storage device. The storage device recovered through the recovery processing is then subjected to a check corresponding to the operation status of the storage system or the failure history of the storage device.
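The flow described in the abstract (block the device, run a recovery matched to the failure, then check the device before returning it to service) can be sketched roughly as follows. This is only an illustrative reading: the function name `recover_blocked_device`, the failure labels, and the callables standing in for recovery and check processing are hypothetical and do not come from the patent.

```python
def recover_blocked_device(failure_type, recovery_actions, check):
    """Return True if the device may leave the blocked state.

    failure_type:     label for the content of the failure (hypothetical).
    recovery_actions: dict mapping a failure label to a callable returning bool.
    check:            callable returning bool, run only after recovery succeeds.
    """
    action = recovery_actions.get(failure_type)
    if action is None:
        return False      # no recovery known for this failure: stay blocked
    if not action():
        return False      # the recovery attempt itself failed
    return bool(check())  # unblock only if the follow-up check passes


# Hypothetical failure labels and stand-in recovery actions.
actions = {
    "timeout": lambda: True,       # e.g. a reset that succeeds
    "media_error": lambda: False,  # e.g. a re-initialization that fails
}
print(recover_blocked_device("timeout", actions, check=lambda: True))
```

In this sketch the check runs only when the recovery action reports success, so a device never rejoins service on the strength of a recovery attempt alone.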
14 Claims
1. A storage system coupled to a host computer, the storage system comprising:
a controller;
a memory;
a plurality of data storage devices for storing data sent from the host computer; and
one or more spare storage devices to be used for replacing the data storage devices;
wherein two or more of said data storage devices constitute a RAID group; and
when it is determined that the data storage device is to be blocked due to failure, the controller
records instruction data indicating a region of data stored in the spare storage device until the blocked data storage device is recovered; and
executes a failure recovery processing corresponding to a content of failure and a predetermined check processing to the data storage device, by writing the data stored in the region of the spare storage device indicated by the instruction data back to the blocked data storage device, at a time point when the blocked data storage device has recovered.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A failure recovery method of a storage device, comprising:
storing data from a host computer to a data storage device, and constituting a RAID group by two or more said data storage devices;
wherein when it is determined that the data storage device is to be blocked due to failure, the method further comprises
recording instruction data indicating a region of data stored in the spare storage device until the blocked data storage device is recovered; and
executing a failure recovery processing corresponding to a content of failure and a predetermined check processing to the data storage device, by writing the data stored in the region of the spare storage device indicated by the instruction data back to the blocked data storage device, at a time point when the blocked data storage device has recovered.
Dependent claims: 13
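The instruction data in claim 12 can be pictured as a record of which regions were written to the spare device while the original device was blocked, so that only those regions need to be written back after recovery. The sketch below is one illustrative reading, not the patented implementation: the class name `CopyBackTracker`, the fixed region size, and the Python lists standing in for devices are all assumptions.

```python
class CopyBackTracker:
    """Remembers which fixed-size regions were written to the spare while
    the original device was blocked (a stand-in for 'instruction data')."""

    def __init__(self, region_size=4):
        self.region_size = region_size  # blocks per region (assumed value)
        self.dirty_regions = set()      # indices of regions held on the spare

    def record_write(self, block, count=1):
        """Mark every region touched by a write of `count` blocks at `block`."""
        if count <= 0:
            return
        first = block // self.region_size
        last = (block + count - 1) // self.region_size
        self.dirty_regions.update(range(first, last + 1))

    def copy_back(self, spare, recovered):
        """Copy only the recorded regions from spare to the recovered device.

        Returns the number of blocks copied and clears the record."""
        copied = 0
        for region in sorted(self.dirty_regions):
            start = region * self.region_size
            end = min(start + self.region_size, len(spare))
            recovered[start:end] = spare[start:end]
            copied += end - start
        self.dirty_regions.clear()
        return copied


# Usage: 16 blocks, two writes land on the spare during the blockage.
spare = list(range(100, 116))
recovered = [0] * 16
tracker = CopyBackTracker(region_size=4)
tracker.record_write(5, 2)   # blocks 5-6   -> region 1
tracker.record_write(11, 2)  # blocks 11-12 -> regions 2 and 3
copied = tracker.copy_back(spare, recovered)
```

Here three of the four regions are copied back and region 0 is left untouched. Copying less than the whole device is the kind of saving the abstract's "shortening the recovery time" points at, although the granularity chosen here is arbitrary.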
14. A storage system coupled to a host computer and a maintenance terminal, the storage system comprising:
a controller;
a memory;
a plurality of data storage devices for storing data sent from the host computer; and
one or more spare storage devices to be used for replacing the data storage devices;
wherein two or more of said data storage devices constitute a RAID group; and
when it is determined that a data storage device is to be blocked due to failure, the controller
records instruction data indicating a region of data stored in the spare storage device until the blocked data storage device is recovered; and
executes a failure recovery processing corresponding to a content of failure and a predetermined check processing to the data storage device, by writing the data stored in the region of the spare storage device indicated by the instruction data back to the blocked data storage device, at a time point when the blocked data storage device has recovered;
wherein the failure recovery processing is one or more of the following operations executed by the controller;
(a1) power OFF/ON;
(a2) hardware reset;
(a3) motor stop and restart;
(a4) initialization of storage area;
(a5) move of the storage area to read section; and
(a6) reading/writing of the storage area;
wherein the check processing is one of the following processes;
(b1) reading of data of the whole storage area;
(b2) writing of data of the whole storage area;
(b3) reading of data and writing of data of the whole storage area;
(b4) reading of data of a predetermined time to the storage area;
(b5) writing of data of a predetermined time to the storage area;
(b6) writing of data and reading of data of a predetermined time to the storage area;
(b7) writing of data and reading of data of the whole storage area, and comparing of write data and read data; or
(b8) writing of data and reading of data of a predetermined time to the storage area, and comparing of the write data and the read data;
wherein the controller
stores a number of times of execution of recovery and check in which the failure recovery processing and the check processing have been executed for each data storage device in the memory;
determines a threshold value based on the presence or absence of redundancy at the time failure occurs, and based on a storage time of all stored data of the data storage device where failure has occurred to the spare storage device;
does not execute the failure recovery processing and check processing if the number of times of execution of recovery and check exceeds the threshold value;
determines a type of the check processing based on a combination of two or more of the following; presence or absence of redundancy, storage time, or I/O access type;
determines a permitted number of times of failure for each failure type by the check processing according to the number of times of execution of recovery and check processing;
if the number of times of occurrence of failure occurred by the check processing is smaller than the permitted number of times of failure, cancels the blockage of the data storage device in the blocked state; and
when the data storage device where failure has occurred has been recovered by the failure recovery processing and check processing, the controller switches a storage destination of the regenerated data from the spare storage device to the recovered data storage device.
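The decision rules at the end of claim 14 (a retry threshold, a check type chosen from (b1) to (b8), and a permitted failure count per failure type) can be sketched as below. The claim names the inputs to each decision but gives no concrete values, so every number, the 8-hour cutoff, the failure labels, the mapping to check types, and all function names are placeholders chosen purely for illustration.

```python
LONG_COPY_HOURS = 8  # assumed cutoff for a "long" copy to the spare

# Assumed base allowance of check-time failures per (hypothetical) failure type.
BASE_PERMITTED = {"media_error": 2, "timeout": 1}


def retry_threshold(has_redundancy, copy_time_hours):
    """Threshold from redundancy and storage time (placeholder policy)."""
    base = 3 if has_redundancy else 1
    bonus = 1 if copy_time_hours >= LONG_COPY_HOURS else 0
    return base + bonus


def should_attempt_recovery(attempts, has_redundancy, copy_time_hours):
    """Skip recovery and check once the attempt count exceeds the threshold."""
    return attempts <= retry_threshold(has_redundancy, copy_time_hours)


def choose_check(has_redundancy, copy_time_hours, io_type):
    """Pick a check label (b1..b8) from the inputs (placeholder mapping)."""
    long_copy = copy_time_hours >= LONG_COPY_HOURS
    if not has_redundancy:
        return "b4"                        # timed read only
    if io_type == "write":
        return "b8" if long_copy else "b7"  # write, read, and compare
    return "b4" if long_copy else "b1"      # read-oriented checks


def permitted_failures(failure_type, attempts):
    """Allowance shrinks as recovery and check have been executed more often."""
    return max(0, BASE_PERMITTED.get(failure_type, 0) - attempts)


def may_unblock(observed, attempts):
    """Cancel the blockage only if every failure type stays under its allowance."""
    return all(
        observed.get(ftype, 0) < permitted_failures(ftype, attempts)
        for ftype in BASE_PERMITTED
    )
```

The strict `<` in `may_unblock` follows the claim's wording "smaller than the permitted number of times of failure", and the `<=` in `should_attempt_recovery` follows "exceeds the threshold value". Everything else is a design choice of this sketch.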
Specification