Drive extent based end of life detection and proactive copying in a mapped RAID (redundant array of independent disks) data storage system
Abstract
Mapped RAID (Redundant Array of Independent Disks) technology divides individual drives into multiple drive extents, allocates the drive extents to RAID extent entries in a RAID mapping table, and performs “end of life” detection and proactive copying of data between data storage drives on a per drive extent basis. A given drive extent is determined to be “end of life” when the ratio of soft media errors to total I/O operations for the drive extent exceeds a threshold error ratio. Data stored on the drive extent is then proactively copied to a newly allocated drive extent, the RAID mapping table is modified so that the data is subsequently accessed from the newly allocated drive extent, and the drive extent is excluded from being used again to store host data. As a result, the rate at which the drives experience soft media errors is slowed, lengthening their effective life.
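The per drive extent end-of-life test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `DriveExtentStats`, the method names, and the value of `THRESHOLD_ERROR_RATIO` are hypothetical and do not come from the patent, which gives no concrete threshold here.

```python
from dataclasses import dataclass

# Hypothetical threshold chosen for illustration; the source states no value.
THRESHOLD_ERROR_RATIO = 0.01


@dataclass
class DriveExtentStats:
    """Counters kept for a single drive extent."""
    total_ios: int = 0          # I/O operations directed to this extent
    soft_media_errors: int = 0  # I/Os that completed with a soft media error

    def error_ratio(self) -> float:
        """Ratio of soft media errors to total I/O operations."""
        if self.total_ios == 0:
            return 0.0
        return self.soft_media_errors / self.total_ios

    def record_io(self, soft_media_error: bool) -> bool:
        """Record one completed I/O.

        Returns True when the extent should be treated as "end of life",
        i.e. the error ratio now exceeds the threshold. The ratio is only
        evaluated when the completion status reports a soft media error.
        """
        self.total_ios += 1
        if not soft_media_error:
            return False
        self.soft_media_errors += 1
        return self.error_ratio() > THRESHOLD_ERROR_RATIO


if __name__ == "__main__":
    stats = DriveExtentStats()
    for _ in range(199):
        stats.record_io(soft_media_error=False)
    for _ in range(3):
        end_of_life = stats.record_io(soft_media_error=True)
        print(stats.soft_media_errors, stats.total_ios, end_of_life)
```

Because the counters are kept per drive extent rather than per drive, a localized patch of failing media retires only the affected extent in this sketch, and the rest of the drive stays in service.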
19 Claims
1. A method of providing RAID (Redundant Array of Independent Disks) data protection for at least one storage object in a data storage system, wherein the data storage system includes at least one storage processor and an array of data storage drives communicably coupled to the storage processor, the method comprising:
generating a RAID mapping table, wherein the RAID mapping table contains a plurality of RAID extent entries, wherein each RAID extent entry contained in the RAID mapping table indicates a predetermined total number of drive extents that each persistently store host data written to a corresponding one of a plurality of RAID extents within a logical address space that is mapped to the at least one storage object, wherein each drive extent comprises a unique contiguous region of non-volatile data storage located on one of the data storage drives, and wherein each one of the data storage drives has multiple drive extents located thereon;
for each I/O operation directed to the storage object, performing a monitoring operation by the storage processor, wherein the monitoring operation includes:
i) incrementing a total I/O operations counter corresponding to a target drive extent to which that I/O operation is directed, wherein the total I/O operations counter corresponding to the target drive extent stores a total number of I/O operations that have been directed to the target drive extent,
ii) receiving, from a data storage drive within which the target drive extent is located, a completion status for that I/O operation, and
iii) in response to detecting that the received completion status for that I/O operation indicates that a soft media error occurred within the data storage drive while performing that I/O operation on the target drive extent:
a) incrementing a soft media error counter corresponding to the target drive extent, wherein the soft media error counter corresponding to the target drive extent stores a total number of soft media errors that have occurred while performing I/O operations on the target drive extent,
b) calculating an error ratio for the target drive extent, wherein the error ratio for the target drive extent comprises a ratio of a current value of the soft media error counter corresponding to the target drive extent to a current value of the total I/O operations counter corresponding to the target drive extent, and
c) in response to detecting that the error ratio for the target drive extent exceeds a threshold error ratio, performing a proactive copy operation on the target drive extent that copies all host data stored on the target drive extent to a newly allocated drive extent, wherein performing the proactive copy operation on the target drive extent also modifies a RAID extent entry in the RAID mapping table that stored an indication of the target drive extent to store an indication of the newly allocated drive extent, whereby the host data copied from the target drive extent to the newly allocated drive extent is accessed by subsequently received I/O operations on the newly allocated drive extent.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
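The proactive copy step of claim 1 (copy the host data, then repoint the RAID extent entry at the new drive extent) can be illustrated with a small in-memory model. This is a hedged sketch, not the patented implementation: `MappedRaid`, its fields, and the simple "take the first free extent" allocation policy are all assumptions made for the example, and the `retired` set reflects the abstract's statement that the old extent is excluded from reuse.

```python
from dataclasses import dataclass, field

# A drive extent is identified here as (drive index, extent index on that drive).
DriveExtent = tuple[int, int]


@dataclass
class MappedRaid:
    # Each RAID extent entry lists the drive extents backing one RAID extent.
    mapping_table: list[list[DriveExtent]]
    # Unallocated drive extents (hypothetical free pool).
    free_extents: list[DriveExtent]
    # Toy stand-in for the host data stored on each drive extent.
    data: dict[DriveExtent, bytes] = field(default_factory=dict)
    # Extents that must never be allocated again.
    retired: set[DriveExtent] = field(default_factory=set)

    def proactive_copy(self, target: DriveExtent) -> DriveExtent:
        """Move host data off `target` and update the mapping table."""
        entry = next((e for e in self.mapping_table if target in e), None)
        if entry is None:
            raise ValueError(f"{target} is not referenced by any RAID extent entry")
        if not self.free_extents:
            raise RuntimeError("no free drive extent available")
        new_extent = self.free_extents.pop(0)
        # Copy all host data from the failing extent to the new one.
        self.data[new_extent] = self.data.pop(target, b"")
        # Replace the indication of the old extent with the new extent.
        entry[entry.index(target)] = new_extent
        # Exclude the old extent from future allocation.
        self.retired.add(target)
        return new_extent


if __name__ == "__main__":
    raid = MappedRaid(
        mapping_table=[[(0, 0), (1, 0), (2, 0)]],
        free_extents=[(3, 0)],
        data={(1, 0): b"host data"},
    )
    print(raid.proactive_copy((1, 0)))
    print(raid.mapping_table)
```

After the call, later I/O that resolves through the mapping table reaches the new extent, which is the effect the final "whereby" clause of the claim describes. A real allocator would also have to choose which drive the new extent comes from; that policy is outside this sketch.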
10. A data storage system that provides RAID (Redundant Array of Independent Disks) data protection for a storage object, comprising:
at least one storage processor including processing circuitry and a memory;
an array of data storage drives communicably coupled to the at least one storage processor; and
wherein the memory has program code stored thereon, wherein the program code, when executed by the processing circuitry, causes the processing circuitry to:
generate a RAID mapping table, wherein the RAID mapping table contains a plurality of RAID extent entries, wherein each RAID extent entry contained in the RAID mapping table indicates a predetermined total number of drive extents that each persistently store host data written to a corresponding one of a plurality of RAID extents within a logical address space that is mapped to the storage object, wherein each drive extent comprises a unique contiguous region of non-volatile data storage located on one of the data storage drives, and wherein each one of the data storage drives has multiple drive extents located thereon;
for each I/O operation directed to the storage object, perform a monitoring operation by the storage processor, at least in part by causing the processing circuitry to:
i) increment a total I/O operations counter corresponding to a target drive extent to which that I/O operation is directed, wherein the total I/O operations counter corresponding to the target drive extent stores a total number of I/O operations that have been directed to the target drive extent,
ii) receive, from a data storage drive within which the target drive extent is located, a completion status for that I/O operation, and
iii) in response to detecting that the received completion status for that I/O operation indicates that a soft media error occurred within the data storage drive while performing that I/O operation on the target drive extent:
a) increment a soft media error counter corresponding to the target drive extent, wherein the soft media error counter corresponding to the target drive extent stores a total number of soft media errors that have occurred while performing I/O operations on the target drive extent,
b) calculate an error ratio for the target drive extent, wherein the error ratio for the target drive extent comprises a ratio of a current value of the soft media error counter corresponding to the target drive extent to a current value of the total I/O operations counter corresponding to the target drive extent, and
c) in response to detecting that the error ratio for the target drive extent exceeds a threshold error ratio, perform a proactive copy operation on the target drive extent that copies all host data stored on the target drive extent to a newly allocated drive extent, wherein performing the proactive copy operation on the target drive extent also modifies a RAID extent entry in the RAID mapping table that stored an indication of the target drive extent to store an indication of the newly allocated drive extent, whereby the host data copied from the target drive extent to the newly allocated drive extent is accessed by subsequently received I/O operations on the newly allocated drive extent.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A non-transitory computer readable medium for providing RAID (Redundant Array of Independent Disks) data protection for a storage object in a data storage system, wherein the data storage system includes a storage processor and an array of data storage drives communicably coupled to the storage processor, the non-transitory computer readable medium comprising instructions stored thereon that when executed on processing circuitry in the storage processor perform the steps of:
generating a RAID mapping table, wherein the RAID mapping table contains a plurality of RAID extent entries, wherein each RAID extent entry contained in the RAID mapping table indicates a predetermined total number of drive extents that each persistently store host data written to a corresponding one of a plurality of RAID extents within a logical address space that is mapped to the storage object, wherein each drive extent comprises a unique contiguous region of non-volatile data storage located on one of the data storage drives, and wherein each one of the data storage drives has multiple drive extents located thereon;
for each I/O operation directed to the storage object, performing a monitoring operation by the storage processor, wherein the monitoring operation includes:
i) incrementing a total I/O operations counter corresponding to a target drive extent to which that I/O operation is directed, wherein the total I/O operations counter corresponding to the target drive extent stores a total number of I/O operations that have been directed to the target drive extent,
ii) receiving, from a data storage drive within which the target drive extent is located, a completion status for that I/O operation, and
iii) in response to detecting that the received completion status for that I/O operation indicates that a soft media error occurred within the data storage drive while performing that I/O operation on the target drive extent:
a) incrementing a soft media error counter corresponding to the target drive extent, wherein the soft media error counter corresponding to the target drive extent stores a total number of soft media errors that have occurred while performing I/O operations on the target drive extent,
b) calculating an error ratio for the target drive extent, wherein the error ratio for the target drive extent comprises a ratio of a current value of the soft media error counter corresponding to the target drive extent to a current value of the total I/O operations counter corresponding to the target drive extent, and
c) in response to detecting that the error ratio for the target drive extent exceeds a threshold error ratio, performing a proactive copy operation on the target drive extent that copies all host data stored on the target drive extent to a newly allocated drive extent, wherein performing the proactive copy operation on the target drive extent also modifies a RAID extent entry in the RAID mapping table that stored an indication of the target drive extent to store an indication of the newly allocated drive extent, whereby the host data copied from the target drive extent to the newly allocated drive extent is accessed by subsequently received I/O operations on the newly allocated drive extent.
Specification