System and method for using failure casting to manage failures in computer systems

  • US 8,359,495 B2
  • Filed: 03/27/2007
  • Issued: 01/22/2013
  • Est. Priority Date: 03/27/2007
  • Status: Active Grant
First Claim

1. A system for managing failures in a computer system using failure casting, the computer system including an array of disks, comprising:

  • one or more processors operable to provide a system manager that performs actions on the computer system to address failures that occur within the computer system;

    a failure casting logic that detects failures as they occur in the computer system;

    a failure casting hierarchy that defines a plurality of failures that can occur within the computer system, and which is used by the failure casting logic upon detecting the occurrence of a failure to cast the failure from a first failure type to a second failure type, wherein the second failure type is then communicated to the system manager to allow the system manager to treat the failure as if it were the second failure type;

    wherein the failure casting hierarchy defines at least two sets of failures, including a set of reboot-curable failures and a set of non-reboot-curable failures, wherein the reboot-curable failures are addressed by the system manager by rebooting the computer system or component thereof that includes the failure;

    wherein the failure casting logic and the failure casting hierarchy are part of a script that detects the occurrence of failures in the computer system and then casts the failure into one of either a reboot-curable failure or non-reboot-curable failure, wherein the script is executed by the computer system when powered on;

    wherein the script is used to address failures within the array of disks at boot time by verifying the health of each disk prior to adding a disk to the array; and

    wherein the failure casting hierarchy in the script includes the set of non-reboot-curable failures that are checked at boot time, and if a disk, when added to the array, exhibits a failure upon bootup within the set of non-reboot-curable failures, then the disk is not added to the array.
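
The claim's elements can be illustrated with a minimal sketch. It is not the patented implementation: the failure names, the `Disk` type, the `cast_failure` and `build_array` helpers, the conservative default for unknown failures, and the choice to hold back reboot-curable disks until after a reboot are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class FailureClass(Enum):
    REBOOT_CURABLE = "reboot-curable"
    NON_REBOOT_CURABLE = "non-reboot-curable"


# Hypothetical failure casting hierarchy: each specific (first) failure type
# maps to the broader (second) failure type reported to the system manager.
FAILURE_HIERARCHY = {
    "disk_timeout": FailureClass.REBOOT_CURABLE,
    "controller_hang": FailureClass.REBOOT_CURABLE,
    "bad_sectors": FailureClass.NON_REBOOT_CURABLE,
    "smart_failure": FailureClass.NON_REBOOT_CURABLE,
}


def cast_failure(failure_type: str) -> FailureClass:
    """Cast a detected failure to its second failure type.

    Assumption: failures missing from the hierarchy are treated
    conservatively as non-reboot-curable.
    """
    return FAILURE_HIERARCHY.get(failure_type, FailureClass.NON_REBOOT_CURABLE)


@dataclass
class Disk:
    name: str
    failure: Optional[str] = None  # failure detected at boot, if any


def build_array(disks: List[Disk]) -> Tuple[List[str], List[str], List[str]]:
    """Verify each disk's health at boot before adding it to the array.

    Returns (array, rejected, needs_reboot) as lists of disk names.
    """
    array: List[str] = []
    rejected: List[str] = []
    needs_reboot: List[str] = []
    for disk in disks:
        if disk.failure is None:
            array.append(disk.name)  # healthy disk joins the array
        elif cast_failure(disk.failure) is FailureClass.NON_REBOOT_CURABLE:
            rejected.append(disk.name)  # never added to the array
        else:
            # Assumption: reboot-curable disks are handed to the system
            # manager for a reboot instead of being added immediately.
            needs_reboot.append(disk.name)
    return array, rejected, needs_reboot
```

For example, calling `build_array` on one healthy disk, one disk reporting `"bad_sectors"`, and one reporting `"disk_timeout"` puts the first in the array, rejects the second, and flags the third for a reboot. The system manager only sees the two broad classes, which is the point of casting: it needs one recovery action per class, not one per specific failure.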
