Disk array device having spare disk drive and data sparing method
Abstract
In a RAID disk array, the likelihood of a failure is predicted from the number of errors that have occurred on each of disk drives #0 to #3 in a given parity group. Two half-size storage areas, #0_UH and #2_LH, which belong to different data stripes, are selected from the two disk drives #0 and #2 that have a relatively high likelihood of failure, and their data is copied to a spare disk #A (divided data copy). When the likelihood of failure of one disk drive, #0, increases further, the remaining half storage area #0_LH of disk drive #0 is copied to the spare disk #A (dynamic sparing). The divided data copy processing reduces both the dynamic sparing time and the likelihood of data loss due to a multiple disk failure.
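The selection described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `DataDrive`, `plan_divided_copy`, and `remaining_sets`, the error counts, and the rule of ranking drives purely by error count are all assumptions made for illustration. The sketch picks the two drives with the most recorded errors, assigns each a different half (`UH` or `LH`) for the divided data copy, and reports which half is still left to copy if one of those drives later needs dynamic sparing.

```python
from dataclasses import dataclass

STRIPE_SETS = ("UH", "LH")  # upper-half and lower-half sets of data stripes


@dataclass
class DataDrive:
    name: str         # e.g. "#0"
    error_count: int  # errors recorded for this drive (hypothetical field)


def plan_divided_copy(drives, stripe_sets=STRIPE_SETS):
    """Choose one drive per stripe set, preferring drives with more errors."""
    ranked = sorted(drives, key=lambda d: d.error_count, reverse=True)
    # Order the chosen drives by name so the assignment is deterministic.
    chosen = sorted(ranked[:len(stripe_sets)], key=lambda d: d.name)
    # Each chosen drive contributes a different stripe set to the spare.
    return [(d.name, s) for d, s in zip(chosen, stripe_sets)]


def remaining_sets(plan, drive_name, stripe_sets=STRIPE_SETS):
    """Stripe sets of a drive not yet copied, i.e. the dynamic sparing work left."""
    copied = {s for name, s in plan if name == drive_name}
    return [s for s in stripe_sets if s not in copied]


# Made-up error counts for the four drives of one parity group.
parity_group = [
    DataDrive("#0", 9),
    DataDrive("#1", 1),
    DataDrive("#2", 7),
    DataDrive("#3", 0),
]
plan = plan_divided_copy(parity_group)
print(plan)                        # [('#0', 'UH'), ('#2', 'LH')]
print(remaining_sets(plan, "#0"))  # ['LH']
```

With these sample counts the plan matches the abstract's example: areas #0_UH and #2_LH go to the spare first, and only #0_LH remains to be copied if drive #0 degrades further.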
24 Claims
1. A disk array device for storing data in response to communications from a host device, the disk array device comprising:
a disk array control unit which performs control of the entire disk array device;
a host side data transfer control unit which controls data transfer to and from the host device;
a disk array including at least a plurality of data disk drives which constitute one parity group and one or more spare disk drives, wherein the one parity group has a large number of data stripes which are formed over storage areas of the plural data disk drives and the large number of data stripes can be partitioned into two or more sets of the data stripes;
a cache memory which is used for temporary storage of data to be transferred between the host device and the disk array; and
a subordinate side transfer control unit which controls data transfer to and from the disk array,
wherein the disk array control unit comprises:
a prediction section which predicts the likelihood of occurrence of a failure for each data disk drive,
a disk drive resource information table which includes for each data disk drive information indicating a status of said data disk drive and information of a rate of occurrence of errors in said data disk drive,
a spare disk drive resource information table which includes for each spare disk drive information indicating a status of said spare disk drive and information regarding storage areas of said spare disk drive used for recovery with respect to a corresponding data disk drive,
wherein said prediction section predicts the likelihood of occurrence of a failure based on information contained in said disk drive resource information table, and
a divided data copy section which, in response to a prediction that occurrence of a failure is likely to occur with respect to a data disk drive, and based on information contained in said spare disk drive resource information table, selects two or more data disk drives out of the plural data disk drives as objects of divided data copy, selects two or more divided storage areas by selecting one divided storage area from each of the selected two or more data disk drives, the selected two or more divided storage areas belonging to different sets of the data stripes in the parity group, and controls the subordinate side transfer control unit and the cache memory so as to copy data in the selected two or more divided storage areas to the one or more spare disk drives.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10

11. A method for use in a disk array device, which can be connected to a host device so that they are capable of communicating with each other, comprising:
a disk array control unit which performs control of the entire disk array device;
a host side data transfer control unit which controls data transfer to and from the host device;
a disk array including at least plural data disk drives which constitute one parity group and one or more spare disk drives, wherein the one parity group has a large number of data stripes which are formed over storage areas of the plural data disk drives and the large number of data stripes can be partitioned into two or more sets of the data stripes;
a cache memory which is used for temporary storage of data to be transferred between the host device and the disk array; and
a subordinate side transfer control unit which controls data transfer to and from the disk array,
wherein said disk array control unit includes a disk drive resource information table which includes for each data disk drive information indicating a status of said data disk drive and information of a rate of occurrence of errors in said data disk drive and a spare disk drive resource information table which includes for each spare disk drive information indicating a status of said spare disk drive and information regarding storage areas of said spare disk drive used for recovery with respect to a corresponding data disk drive,
wherein the disk array control unit operating to spare data in the data disk drive using the spare disk drive, the method comprising:
a step of predicting the likelihood of occurrence of a failure for each of the disk drives based on information contained in said disk drive resource information table,
a step of, in response to a prediction that occurrence of a failure is likely to occur with respect to a data disk drive and based on information contained in said spare disk drive resource information table, selecting two or more data disk drives as objects of divided data copy out of the plural data disk drives;
a step of selecting two or more divided storage areas by selecting one divided storage area from each of the selected two or more data disk drives, wherein the selected two or more divided storage areas belong to different sets of data stripes in the parity group; and
a step of performing the divided data copy by controlling the subordinate side transfer control unit and the cache memory so as to copy data of the selected two or more divided storage areas to the one or more spare disk drives.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19

20. A data sparing control device for use in a disk array device, which can be connected to a host device so that they are capable of communicating with each other, comprising:
a disk array control unit which performs control of the entire disk array device;
a host side data transfer control unit which controls data transfer to and from the host device;
a disk array including at least plural data disk drives which constitute one parity group and one or more spare disk drives, wherein the one parity group has a large number of data stripes which are formed over storage areas of the plural data disk drives and the large number of data stripes can be partitioned into two or more sets of the data stripes;
a cache memory which is used for temporary storage of data to be transferred between the host device and the disk array; and
a subordinate side transfer control unit which controls data transfer to and from the disk array,
wherein for controlling an operation for sparing data in the data disk drives using the spare disk drives, the data sparing control device comprises:
a prediction unit which predicts the likelihood of occurrence of a failure for each of the data disk drives, wherein said prediction section includes a disk drive resource information table which includes for each data disk drive information indicating a status of said data disk drive and information of a rate of occurrence of errors in said data disk drive and a spare disk drive resource information table which includes for each spare disk drive information indicating a status of said spare disk drive and information regarding storage areas of said spare disk drive used for recovery with respect to a corresponding data disk drive, wherein said prediction section predicts the likelihood of occurrence of a failure based on information contained in said disk drive resource information table,
a divided area selection unit which, in response to a prediction that occurrence of a failure is likely to occur with respect to a data disk drive and based on information contained in said spare disk drive resource information table, selects two or more data disk drives as objects of divided data copy out of the plural data disk drives and selects two or more divided storage areas by selecting one divided storage area from each of the selected two or more data disk drives, wherein the selected two or more divided storage areas belong to different sets of data stripes in the parity group; and
a divided data copy unit which controls the subordinate side transfer control unit and the cache memory so as to copy data of the selected two or more divided storage areas to the spare disk drives.

21. A disk array device for storing data in response to communications from a host device, the disk array device comprising:
a disk array control unit which performs control of the entire disk array device;
a host side data transfer control unit which controls data transfer to and from the host device;
a disk array including at least a plurality of data disk drives which constitute one parity group and one or more spare disk drives, wherein the one parity group has a large number of data stripes which are formed over storage areas of the plural data disk drives and the large number of data stripes can be partitioned into two or more sets of the data stripes;
a cache memory which is used for temporary storage of data to be transferred between the host device and the disk array; and
a subordinate side transfer control unit which controls data transfer to and from the disk array,
wherein the disk array control unit comprises:
a prediction section which predicts the likelihood of occurrence of a failure for each data disk drive,
a disk drive resource information table which includes for each data disk drive information indicating a status of said data disk drive and information of a rate of occurrence of errors in said data disk drive,
a spare disk drive resource information table which includes for each spare disk drive information indicating a status of said spare disk drive and information regarding storage areas of said spare disk drive used for recovery with respect to a corresponding data disk drive,
wherein said prediction section predicts the likelihood of occurrence of a failure based on information contained in said disk drive resource information table, and
a divided data copy section which, in response to a prediction that occurrence of a failure is likely to occur with respect to a data disk drive, and based on information contained in said spare disk drive resource information table, selects two or more data disk drives out of the plural data disk drives as objects of divided data copy, selects two or more divided storage areas by selecting one divided storage area from each of the selected two or more data disk drives, the selected two or more divided storage areas belonging to different sets of the data stripes in the parity group, and controls the subordinate side transfer control unit and the cache memory so as to copy data in the selected two or more divided storage areas to the one or more spare disk drives,
wherein when the probability of failure occurring in a first data disk drive of the plurality of data disk drives reaches a first level, the divided data copy section selects two or more data disk drives including the first data disk drive out of the plurality of data disk drives as objects of divided data copy, selects two or more divided storage areas by selecting one divided storage area from each of the selected two or more data storage drives, and then performs a divided copying process,
wherein in the divided copying process, data stored in the divided storage areas of the selected two or more data disk drives are copied to one or more spare disks, but data stored in other storage areas of the selected two or more data disk drives are not copied to the spare disks,
wherein data in each of the other storage areas of the selected two or more data disk drives include data that are not parity data, and
wherein after performing the divided copying process, when the probability of the failure occurring in either one of the selected two or more data disk drives reaches a second level, which is higher than the first level, the divided data copy section copies the data stored in the other storage areas of the selected two or more data disk drives to the one or more spare disks.
Dependent claims: 22, 23, 24
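
The two-level escalation recited in claim 21 can be pictured as a small decision function. The following is a sketch under stated assumptions, not the claimed apparatus: `DriveEntry` stands in for one row of a disk drive resource information table, and the `Action` names, the `next_action` function, and the numeric thresholds are hypothetical. The claims do not give concrete values for the first and second levels.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    DIVIDED_COPY = "divided_copy"      # copy one divided area per selected drive
    COPY_REMAINDER = "copy_remainder"  # copy the other storage areas as well


@dataclass
class DriveEntry:
    """Hypothetical row of a disk drive resource information table."""
    error_rate: float                # rate of occurrence of errors for the drive
    divided_copy_done: bool = False  # True once its divided area is on a spare


def next_action(entry, first_level, second_level):
    """Decide the sparing step for one drive from its error rate."""
    if second_level <= first_level:
        raise ValueError("second level must be higher than the first level")
    # The remainder is copied only after the divided copy has been performed.
    if entry.divided_copy_done and entry.error_rate >= second_level:
        return Action.COPY_REMAINDER
    if not entry.divided_copy_done and entry.error_rate >= first_level:
        return Action.DIVIDED_COPY
    return Action.NONE


# Illustrative thresholds only.
FIRST_LEVEL, SECOND_LEVEL = 0.3, 0.6
drive = DriveEntry(error_rate=0.4)
print(next_action(drive, FIRST_LEVEL, SECOND_LEVEL))  # Action.DIVIDED_COPY
drive.divided_copy_done = True
drive.error_rate = 0.7
print(next_action(drive, FIRST_LEVEL, SECOND_LEVEL))  # Action.COPY_REMAINDER
```

In this sketch, a drive that jumps straight past the second level without a prior divided copy still gets the divided copy first. That ordering is an assumption drawn from the "after performing the divided copying process" wording and is not something the claim spells out for that case.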
Specification