Error processing across multiple initiator network
Abstract
An architecture for error log processing is provided. Each error log is given a defined priority and mapped to an error recovery procedure (ERP) to be run if the log is seen. The system has a plurality of software layers to process the errors. Each software layer processes the error independently. Errors are reported to a higher software stack when error recovery fails from the lower stack ERPs and recovery is non-transparent. If the system host identified for error processing fails, the control of the ERP is transferred during the failover process. Non-obvious failed component isolating ERPs are grouped to be run together to assist in isolating the failed component. Prioritization of the error systems may be based on a plurality of criteria. ERPs are assigned to run within a particular software stack.
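The escalation behaviour described in the abstract can be illustrated with a minimal sketch. It assumes that each software stack keeps its own table mapping an error log identifier to a priority and an ERP, and that a failed ERP simply hands the error to the next higher stack. The stack names are borrowed from the storage area network recited in claim 8; the `recover` function, the table layout, and the boolean return value of an ERP are hypothetical and not taken from the patent. The sketch does not model whether recovery is transparent.

```python
from typing import Callable, Dict, Optional, Tuple

# Stack names ordered from lowest to highest layer (names borrowed from claim 8).
STACK_ORDER = ["drive", "switch", "controller", "initiator", "system"]

# Hypothetical table layout: log id -> (priority, ERP returning True on success).
ErpTable = Dict[str, Tuple[int, Callable[[], bool]]]


def recover(log_id: str, tables: Dict[str, ErpTable], first_stack: str) -> Optional[str]:
    """Try the ERP mapped to log_id in each stack, starting at first_stack.

    Returns the name of the stack whose ERP succeeded, or None if every
    ERP failed or no stack had an ERP for this log.
    """
    start = STACK_ORDER.index(first_stack)
    for stack in STACK_ORDER[start:]:
        entry = tables.get(stack, {}).get(log_id)
        if entry is None:
            continue  # this layer has no ERP mapped to the log
        _priority, erp = entry
        if erp():
            return stack  # recovery succeeded in this layer
        # recovery failed: fall through and report to the next higher stack
    return None


tables = {
    "drive": {"E1": (5, lambda: False)},      # lower-stack ERP fails
    "controller": {"E1": (5, lambda: True)},  # higher-stack ERP succeeds
}
result = recover("E1", tables, "drive")
```

Here `result` is `"controller"`, because the drive-level ERP fails and the switch stack has no entry for `E1`.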
14 Claims
1. A computer program product comprising a computer recordable storage medium having a computer readable program recorded thereon, wherein the computer readable program, when executed on a computing device, causes the computing device to:
receive an error event message indicating an error event in a first software stack within a plurality of software stacks in a network;
determine a priority of the error event;
assign an error recovery procedure for the error event to a second software stack within the plurality of software stacks in the network based on the priority of the error event; and
run the error recovery procedure in the second software stack,
wherein assigning the error recovery procedure comprises identifying a host to be in control of error processing for the received error event message; and
wherein identifying the host to be in control of error processing for the received error event message comprises:
determining in a local host whether the local host already has a lock for a current error event;
if the local host does not already have the lock for the current error event, obtaining the lock for a new error event and running the error recovery procedure for the new error event in the local host;
if the local host already has the lock for the current error event, determining whether the received error event has a higher priority than the current error event; and
if the received error event does not have a higher priority than the current error event, continuing with the error recovery procedure for the current error event.
Dependent claims: 2, 3, 4, 5, 6, 7
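The host-selection steps of claim 1 can be sketched as a small state machine on the local host. This is an illustration under stated assumptions, not the patented implementation: the `ErrorEvent` and `LocalHost` names are hypothetical, a larger number is assumed to mean a higher priority, and holding the lock is modelled as simply remembering the current event. Claim 1 does not recite what happens when the received event has a higher priority, so the sketch only reports that case to the caller.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ErrorEvent:
    name: str
    priority: int  # assumption: larger value = higher priority


class LocalHost:
    """Hypothetical local host that arbitrates error events as in claim 1."""

    def __init__(self) -> None:
        self.current: Optional[ErrorEvent] = None  # event whose lock is held
        self.log: List[str] = []

    def handle(self, received: ErrorEvent) -> str:
        if self.current is None:
            # No lock held: obtain it for the new event and run its ERP here.
            self.current = received
            self.log.append(f"run ERP for {received.name}")
            return "started"
        if received.priority <= self.current.priority:
            # Not higher priority: continue with the ERP already running.
            self.log.append(f"continue ERP for {self.current.name}")
            return "continued"
        # Higher priority: not covered by claim 1, left to the caller.
        return "higher-priority"


host = LocalHost()
first = host.handle(ErrorEvent("link-down", 3))
second = host.handle(ErrorEvent("retry-count", 2))
```

After these two calls `first` is `"started"`, `second` is `"continued"`, and the host still holds the lock for `link-down`.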
8. A data processing system in a storage area network, comprising:
a processor; and
a memory, wherein the memory contains instructions which, when executed by the processor, cause the processor to:
receive an error event message indicating an error event in a first software stack within a plurality of software stacks in the network;
determine a priority of the error event;
assign an error recovery procedure for the error event to a second software stack within the plurality of software stacks in the network based on the priority of the error event; and
run the error recovery procedure in the second software stack,
wherein the storage area network comprises:
a plurality of drives running a drive software stack;
one or more switches connected to the plurality of drives, wherein the one or more switches run a switch software stack;
one or more controllers connected to the one or more switches, wherein the one or more controllers run a controller software stack;
one or more initiators connected to the one or more controllers, wherein the one or more initiators run an initiator software stack; and
one or more hosts connected to the one or more initiators, wherein the one or more hosts run a system software stack; and
wherein the data processing system is a local host within the one or more hosts and wherein the memory contains instructions which, when executed by the processor, cause the processor to:
determine whether the local host already has a lock for a current error event;
if the local host does not already have a lock for a current error event, obtain a lock for the new error event and running the error recovery procedure for the new error event in the local host;
if the local host already has a lock for a current error event, determine whether the received error event has a higher priority than the current error event; and
if the received error has a higher priority than the current error event, stop the error recovery procedure for the current error event and running an error recovery procedure for the received error recovery procedure.
Dependent claims: 9
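The storage area network recited in claim 8 is a chain of component types, each running its own software stack. A rough way to picture it is an ordered list, as in the sketch below. The `SAN_LAYERS` list and the helper functions `stack_for` and `next_toward_host` are hypothetical names introduced only for illustration; the claim itself specifies only which components are connected and which stack each runs.

```python
from typing import List, Optional, Tuple

# (component type, software stack it runs), ordered from the drives up to
# the hosts, following the connections recited in claim 8.
SAN_LAYERS: List[Tuple[str, str]] = [
    ("drives", "drive software stack"),
    ("switches", "switch software stack"),
    ("controllers", "controller software stack"),
    ("initiators", "initiator software stack"),
    ("hosts", "system software stack"),
]


def stack_for(component: str) -> str:
    """Return the software stack run by the given component type."""
    return dict(SAN_LAYERS)[component]


def next_toward_host(component: str) -> Optional[str]:
    """Return the component type connected on the host side, if any."""
    names = [name for name, _stack in SAN_LAYERS]
    index = names.index(component)
    return names[index + 1] if index + 1 < len(names) else None


neighbor = next_toward_host("drives")
```

With this layout `neighbor` is `"switches"`, and `next_toward_host("hosts")` returns `None` because the hosts sit at the top of the chain.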
10. A method for error processing across a multiple initiator network, the method comprising:
receiving an error event message indicating an error event in a first software stack within a plurality of software stacks in the network;
determining a priority of the error event;
assigning an error recovery procedure for the error event to a second software stack within the plurality of software stacks in the network based on the priority of the error event; and
running the error recovery procedure in the second software stack,
wherein assigning the error recovery procedure comprises identifying a host to be in control of error processing for the received error event message; and
wherein identifying a host to be in control of error processing for the received error event comprises:
determining in a local host whether the local host already has a lock for a current error event;
if the local host does not already have a lock for a current error event, obtaining a lock for the new error event and running the error recovery procedure for the new error event in the local host;
if the local host already has a lock for a current error event, determining whether the received error event has a higher priority than the current error event; and
if the received error event has a higher priority than the current error event, stopping the error recovery procedure for the current error event and running an error recovery procedure for the received error recovery procedure.
Dependent claims: 11, 12, 13, 14
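Claims 8 and 10 recite a preemption branch: a received event with a higher priority stops the ERP for the current event. The sketch below shows one way that decision could look as a pure function. The `arbitrate` name, the `(name, priority)` tuple, the action strings, and the convention that a larger number means a higher priority are all assumptions made for illustration. The claims do not say what happens when the received event is not of higher priority, so that branch returns no actions.

```python
from typing import List, Optional, Tuple

Event = Tuple[str, int]  # (name, priority); assumption: larger = higher priority


def arbitrate(current: Optional[Event], received: Event) -> Tuple[Event, List[str]]:
    """Return the event to recover next and the actions taken to get there."""
    if current is None:
        # No lock held: obtain one for the new event and run its ERP locally.
        return received, [f"lock {received[0]}", f"run ERP {received[0]}"]
    if received[1] > current[1]:
        # Higher priority: stop the current ERP and run one for the new event.
        return received, [f"stop ERP {current[0]}", f"run ERP {received[0]}"]
    # Not higher priority: this branch is not recited in claims 8 and 10.
    return current, []


active, actions = arbitrate(("retry-count", 2), ("power-loss", 9))
```

Here `active` becomes `("power-loss", 9)` and `actions` is `["stop ERP retry-count", "run ERP power-loss"]`.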
Specification