Reducing concurrency bottlenecks while rebuilding a failed drive in a data storage system

  • US 10,210,045 B1
  • Filed: 04/27/2017
  • Issued: 02/19/2019
  • Est. Priority Date: 04/27/2017
  • Status: Active Grant
First Claim

1. A method of providing RAID (Redundant Array of Independent Disks) data protection for a storage object in a data storage system, wherein the data storage system includes a storage processor and a set of physical drives communicably coupled to the storage processor, the method comprising:

  • generating a RAID mapping table, wherein the RAID mapping table contains a plurality of RAID extents, wherein each RAID extent contained in the RAID mapping table indicates a plurality of drive extents for storing host data written to the storage object and related parity information, and wherein each drive extent comprises a contiguous region of non-volatile data storage in one of the physical drives;

    in response to detecting that one of the physical drives has failed, concurrently rebuilding RAID extents in a concurrent rebuild list, wherein each RAID extent in the concurrent rebuild list indicates a drive extent of the failed one of the physical drives, and wherein for each one of the RAID extents in the concurrent rebuild list rebuilding includes i) recovering host data previously stored in the drive extent of the failed one of the physical drives indicated by the RAID extent, and ii) writing the recovered host data to a spare drive extent allocated to the RAID extent;

    in response to detecting that rebuilding of one of the RAID extents in the concurrent rebuild list has completed, removing that one of the RAID extents from the concurrent rebuild list, and selecting a next RAID extent to replace the removed RAID extent in the concurrent rebuild list by i) forming a candidate set of RAID extents, wherein each RAID extent in the candidate set indicates a drive extent of the failed physical drive and has not been rebuilt, ii) calculating a relatedness score for each RAID extent in the candidate set with respect to the RAID extents remaining in the concurrent rebuild list, wherein the relatedness score indicates an amount of limitation with regard to concurrently rebuilding the RAID extent in combination with the RAID extents remaining in the concurrent rebuild list, and iii) selecting as the new RAID extent to replace the removed RAID extent in the concurrent rebuild list a RAID extent in the candidate set having a lowest relatedness score of the RAID extents in the candidate set.
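
The selection step recited above (candidate set, relatedness score, lowest score wins) can be illustrated with a minimal Python sketch. The claim does not state how the relatedness score is computed, so the sketch assumes one plausible measure: the number of surviving physical drives that a candidate RAID extent shares with the RAID extents still in the concurrent rebuild list, since shared drives would have to serve several rebuilds at once. The names `relatedness`, `select_next`, and the representation of the mapping table as a dictionary from RAID extent id to a set of drive indices are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Optional, Set


def relatedness(candidate: Set[int], remaining: Iterable[Set[int]], failed: int) -> int:
    """Assumed score: count of surviving drives shared with each remaining RAID extent.

    A higher score means more rebuild I/O would contend for the same drives.
    This formula is an illustrative assumption, not taken from the claim.
    """
    surviving = candidate - {failed}
    return sum(len(surviving & (drives - {failed})) for drives in remaining)


def select_next(
    mapping: Dict[int, Set[int]],   # hypothetical: RAID extent id -> drive indices it uses
    rebuild_list: List[int],        # RAID extents still being rebuilt
    rebuilt: Set[int],              # RAID extents whose rebuild has completed
    failed: int,                    # index of the failed physical drive
) -> Optional[int]:
    # i) candidate set: extents that use the failed drive and have not been rebuilt
    #    (extents already in the rebuild list are skipped as well)
    candidates = [
        rid
        for rid, drives in mapping.items()
        if failed in drives and rid not in rebuilt and rid not in rebuild_list
    ]
    if not candidates:
        return None
    # ii) score every candidate against the extents remaining in the rebuild list
    remaining = [mapping[rid] for rid in rebuild_list]
    # iii) pick the candidate with the lowest score (first one wins on ties)
    return min(candidates, key=lambda rid: relatedness(mapping[rid], remaining, failed))


# Usage: drive 0 has failed, RAID extent 0 is still being rebuilt.
mapping = {0: {0, 1, 2}, 1: {0, 1, 3}, 2: {0, 4, 5}, 3: {0, 2, 3}, 4: {6, 7, 8}}
chosen = select_next(mapping, rebuild_list=[0], rebuilt=set(), failed=0)
print(chosen)  # RAID extent 2 shares no surviving drive with extent 0
```

In this example RAID extents 1 and 3 each share one surviving drive with the extent still being rebuilt, while RAID extent 2 shares none, so it is selected. A fuller model might also account for the drive holding each spare drive extent; that is left out here for brevity.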
