System and method for reducing unrecoverable media errors in a disk subsystem
Abstract
A system and method are provided for reducing unrecoverable errors in a disk drive by aggressively reassigning slow-reading or currently recoverable-but-erroneous sectors to the spares pool. The storage operating system treats a recovered error as a fatal error, so the sectors involved are reassigned to the spares pool immediately. Reassignment is recommended by a reassignment utility at the disk interface level, which passes a status up to the RAID subsystem, and the RAID subsystem performs the reassignment. To prevent a double-disk panic, the RAID subsystem is instructed to ignore reassignment recommendations of this type (i.e., reassignment of sectors with recoverable errors) if the RAID group is currently operating in a degraded state. If the RAID group is not degraded, the sectors encountering the recoverable error are reassigned immediately.
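The decision flow summarized in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the names `ErrorKind`, `Action`, `recommend_reassignment`, `raid_action`, and `handle_io` are hypothetical, and the model assumes the disk interface level reports just two outcomes (no error, or a recovered error).

```python
from enum import Enum, auto


class ErrorKind(Enum):
    """Hypothetical outcome of a disk I/O as seen at the disk interface level."""
    NONE = auto()
    RECOVERED = auto()  # data was returned, but slowly or only after recovery


class Action(Enum):
    """Hypothetical actions the RAID subsystem can take."""
    NO_ACTION = auto()
    REASSIGN_NOW = auto()
    IGNORE = auto()


def recommend_reassignment(kind: ErrorKind) -> bool:
    """Reassignment utility: recommend reassignment for any recovered error."""
    return kind is ErrorKind.RECOVERED


def raid_action(recommended: bool, group_degraded: bool) -> Action:
    """RAID subsystem: act on the recommendation only if the group is healthy."""
    if not recommended:
        return Action.NO_ACTION
    if group_degraded:
        # Ignoring the recommendation avoids extra risk while redundancy is reduced.
        return Action.IGNORE
    return Action.REASSIGN_NOW


def handle_io(kind: ErrorKind, group_degraded: bool) -> Action:
    """Pass the status from the disk interface level up to the RAID subsystem."""
    return raid_action(recommend_reassignment(kind), group_degraded)
```

Keeping the recommendation and the decision in separate functions mirrors the split the abstract describes between the disk interface level and the RAID subsystem.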
26 Claims
1. A system for reassigning a sector encountering a recoverable error comprising:
- a reassignment utility for recognizing a recoverable error on a storage device in response to an I/O operation and providing a reassignment status indicator to a storage layer included in a storage operating system; and
- in the storage layer, a mechanism that determines whether the storage device is in a degraded state and adapted to reassign a sector encountering the recoverable error to a spare sector if the storage device is not in a degraded state and to store the I/O operation in a log to reassign to the spare sector at a future time if the disk is in a degraded state.
Dependent claims: 2, 3, 4
5. A method for reassigning a sector encountering a recoverable error on a disk, comprising:
- recognizing, with a reassignment utility, a recoverable error on a disk in response to a disk I/O operation and providing a reassignment status indicator to a disk storage layer included in a storage operating system; and
- in the disk storage layer, determining whether the disk is in a degraded state and reassigning the sector encountering the recoverable error to a spare disk sector if the disk is not in a degraded state and storing the I/O operation in a log to reassign to the spare sector at a future time if the disk is in a degraded state.
Dependent claims: 6, 7, 8
9. A computer-readable medium for reassigning a sector encountering a recoverable error on a disk including program instructions for performing the steps of:
- recognizing, with a reassignment utility, a sector encountering a recoverable error on a disk in response to a disk I/O operation and providing a reassignment status indicator to a disk storage layer included in a storage operating system; and
- in the disk storage layer, determining whether the disk is in a degraded state and reassigning the sector encountering the recoverable error to a spare disk sector if the disk is not in a degraded state and storing the I/O operation in a log to reassign to the spare sector at a future time if the disk is in a degraded state.
Dependent claims: 10, 11, 12
13. A method for reducing unrecoverable errors in a disk of a RAID group, comprising:
- detecting a recoverable media error in an I/O operation;
- in response to the detecting, providing a recommendation to the RAID layer to reassign a sector of the disk encountering the recoverable media error to a spare sector in a pool of spare sectors on the disk of the RAID group;
- reassigning, by the RAID group, the sector encountering the recoverable media error if the RAID group is in a non-degraded state; and
- storing, by the RAID group, the recommendation in a log and not reassigning the recoverable media error if the RAID group is in a degraded state.
Dependent claims: 14, 15, 16
17. A storage system, including a computer system having a storage operating system and a plurality of storage devices managed by the storage operating system, the storage system comprising:
- a storage adapter for receiving a first error signal from one or more of the storage devices indicative of a recoverable error in the one or more sectors of the storage device in response to an I/O operation, and a second error signal from one or more of the storage devices indicative of a non-recoverable error in one or more sectors of the storage device in response to an I/O operation;
- a reassignment utility for providing a reassignment status indicator to a storage layer included in a storage operating system in response to the first error signal and the second error signal; and
- a mechanism in the storage layer for selectively reassigning a sector encountering an error, including the non-recoverable error and the recoverable error, to a spare sector, reassigning the sector encountering the recoverable error to the spare sector if the storage device is not in a degraded state, and not reassigning the sector if the storage device encountering the recoverable error is in the degraded state.
Dependent claims: 18, 19, 20, 21
22. A method for reassigning a sector encountering an error, comprising:
- receiving an error signal from one or more sectors of a storage device;
- determining the error received in the error signal is recoverable;
- recommending, in response to determining the error is recoverable, reassignment to a disk storage layer;
- determining if a RAID subsystem is currently in a degraded state;
- reassigning, if the RAID system is not in the degraded state, the recovered error to one or more spare sectors; and
- storing, if the RAID system is in the degraded state, the recovered error in a log to reassign to the one or more spare sectors at a future time.
Dependent claims: 23, 24, 25, 26
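The "store in a log to reassign at a future time" branch of claim 22 might be modeled along these lines. This is a minimal sketch under stated assumptions: the `RaidGroup` class, its fields, and the choice to replay the log when the group leaves the degraded state are all hypothetical, since the claim does not say what triggers the deferred reassignment.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RaidGroup:
    """Hypothetical model of a RAID group with a pool of spare sectors."""
    degraded: bool = False
    spare_pool: List[int] = field(default_factory=list)
    remap: Dict[int, int] = field(default_factory=dict)   # bad sector -> spare
    pending_log: List[int] = field(default_factory=list)  # deferred sectors

    def _reassign(self, sector: int) -> None:
        """Map the sector to the next free spare, unless it is already mapped."""
        if sector in self.remap:
            return
        if not self.spare_pool:
            raise RuntimeError("no spare sectors left")
        self.remap[sector] = self.spare_pool.pop(0)

    def handle_recoverable_error(self, sector: int) -> str:
        """Reassign immediately if healthy; otherwise log for later."""
        if self.degraded:
            self.pending_log.append(sector)
            return "logged"
        self._reassign(sector)
        return "reassigned"

    def leave_degraded_state(self) -> None:
        """Assumed replay trigger: drain the log once the group is healthy."""
        self.degraded = False
        while self.pending_log:
            self._reassign(self.pending_log.pop(0))
```

For example, a group created with `RaidGroup(degraded=True, spare_pool=[900, 901])` would log an error on sector 42 and leave `remap` empty until `leave_degraded_state()` is called.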
Specification