Error identification and handling in storage area networks
First Claim
1. A computer implemented method for handling error events relating to a storage area network, the method comprising:
- receiving an error event at a first hardware component of the storage area network;
- using, in response to the error event, a first event handling module operating on the first hardware component to:
  - access a database containing associations between error event data and potential sources of errors;
  - identify a plurality of hardware components based on similarities between the associations in the database and error event data contained in the error event;
  - generate a ranking for the plurality of hardware components;
  - select, based on the ranking, a particular hardware component from the plurality of hardware components;
  - transmit an error notification to a second event handling module of the particular hardware component of the plurality of hardware components;
  - monitor a response of the second event handling module to the error notification;
  - carry out an error handling procedure based on the response;
  - perform a self test of the particular hardware component using the second event handling module;
- comparing a first version identifier of the database to a second version identifier in order to identify an update to the database; and
- applying, in response to a mismatch between the first and second version identifiers, the update to the database.
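The identify, rank, and select steps of the claim can be illustrated with a minimal sketch. The claim does not say how similarity is measured or how the ranking is computed; Jaccard similarity over attribute sets is an assumption made here, and the names `Association`, `rank_candidates`, `select_component`, and all component and attribute strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """Hypothetical database record linking error event data to a potential source."""
    component: str         # potential source of the error
    attributes: frozenset  # error event data associated with that component


def rank_candidates(event_attrs, associations):
    """Return (component, score) pairs, best match first.

    Score is the Jaccard similarity between the event's attributes and the
    stored attributes. The metric is an assumption, not taken from the claim.
    """
    event_attrs = frozenset(event_attrs)
    best = {}
    for assoc in associations:
        union = event_attrs | assoc.attributes
        score = len(event_attrs & assoc.attributes) / len(union) if union else 0.0
        # Keep only components with some overlap, at their best score.
        if score > 0 and score > best.get(assoc.component, 0.0):
            best[assoc.component] = score
    # Descending score, then name, so ties are resolved deterministically.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


def select_component(ranking):
    """Pick the top-ranked component, or None if nothing matched."""
    return ranking[0][0] if ranking else None


# Hypothetical database contents and error event.
database = [
    Association("fabric_switch_2", frozenset({"link_down", "port_7", "crc_error"})),
    Association("storage_controller_1", frozenset({"io_timeout", "crc_error"})),
    Association("storage_device_9", frozenset({"media_error"})),
]
event = {"crc_error", "io_timeout"}

ranking = rank_candidates(event, database)
selected = select_component(ranking)
print(ranking, selected)
```

In this sketch the selected component is the one that would receive the error notification; components with no overlapping attributes are left out of the plurality entirely.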
Abstract
Storage area network (SAN) components contain a processor configured to provide a first event handling module that can receive an error event at a first hardware component of the storage area network. A database is accessed that contains associations between error event data and potential sources of errors. A plurality of hardware components are identified using the database and error event data. The hardware components are ranked and one is selected based on the ranking. An error notification is sent to a second event handling module of the hardware component. Based upon the response of the second event handling module, an error handling procedure is carried out.
19 Claims
1. A computer implemented method for handling error events relating to a storage area network, the method comprising:
- receiving an error event at a first hardware component of the storage area network;
- using, in response to the error event, a first event handling module operating on the first hardware component to:
  - access a database containing associations between error event data and potential sources of errors;
  - identify a plurality of hardware components based on similarities between the associations in the database and error event data contained in the error event;
  - generate a ranking for the plurality of hardware components;
  - select, based on the ranking, a particular hardware component from the plurality of hardware components;
  - transmit an error notification to a second event handling module of the particular hardware component of the plurality of hardware components;
  - monitor a response of the second event handling module to the error notification;
  - carry out an error handling procedure based on the response;
  - perform a self test of the particular hardware component using the second event handling module;
- comparing a first version identifier of the database to a second version identifier in order to identify an update to the database; and
- applying, in response to a mismatch between the first and second version identifiers, the update to the database.
Dependent claims: 2, 3, 4, 5, 6
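The final two steps of claim 1 (comparing version identifiers, then applying an update on a mismatch) can be sketched as follows. The claim does not state where the second version identifier comes from or what form an update takes; here the second identifier is assumed to be supplied by some peer or repository, and `AssociationDatabase`, `sync_database`, and `fetch_update` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class AssociationDatabase:
    """Hypothetical local database: a version identifier plus its associations."""
    version: str
    associations: dict = field(default_factory=dict)


def sync_database(local, second_version, fetch_update):
    """Apply an update when the two version identifiers differ.

    Returns True if an update was applied. Any mismatch triggers the update,
    mirroring the claim wording; no ordering of versions is assumed.
    """
    if local.version == second_version:
        return False
    update = fetch_update(second_version)  # assumed to return a dict of associations
    local.associations.update(update)
    local.version = second_version
    return True


# Hypothetical data for illustration.
local_db = AssociationDatabase("v1", {"crc_error": ["fabric_switch_2"]})


def fetch_update(version):
    return {"io_timeout": ["storage_controller_1"]}


first_sync = sync_database(local_db, "v2", fetch_update)
second_sync = sync_database(local_db, "v2", fetch_update)
print(first_sync, second_sync, local_db.version)
```

Treating any mismatch as a reason to update keeps the sketch faithful to the claim language; a real system might additionally check that the second version is newer, which the claim does not require.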
7. A system comprising:
- a first storage area network (SAN) hardware component, the first SAN hardware component being of a type selected from the group consisting of a SAN controller, a storage controller, a storage device, and a network fabric switch, the first SAN hardware component having a processor configured to provide a first event handling module that is configured to:
  - receive an error event related to an error at one or more hardware components of the SAN;
  - access a database containing associations between error event data and potential sources of errors;
  - identify a plurality of additional SAN hardware components of the SAN based on similarities between the associations in the database and error event data contained in the error event, each additional SAN hardware component being of a type selected from the group consisting of a SAN controller, a storage controller, a storage device, and a network fabric switch;
  - generate a ranking for the plurality of additional SAN hardware components;
  - select, based on the ranking, a particular additional SAN hardware component from the plurality of additional SAN hardware components;
  - transmit an error notification to a second event handling module of the particular additional SAN hardware component of the plurality of additional SAN hardware components, the error notification instructing the second event handling module to access a second database containing associations between error event data and potential sources of errors, the error notification further instructing the second event handling module to perform testing and analysis of the SAN based on the second database, and the error notification further instructing the second event handling module to respond with a response that contains the result of the performed testing and analysis;
  - receive the response of the second event handling module of the particular additional SAN hardware component to the error notification; and
  - carry out an error handling procedure based on the response.
Dependent claims: 8, 9, 10, 11, 12
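The notification and response exchange in claim 7 can be sketched as a pair of small message types and a handler. The claim does not define a message format, the nature of the testing, or the available error handling procedures; everything named below (`ErrorNotification`, `Response`, `SecondEventHandler`, `run_tests`, and the `isolate` / `try_next_candidate` outcomes) is a hypothetical illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorNotification:
    """Hypothetical message sent by the first event handling module."""
    event_attrs: frozenset


@dataclass(frozen=True)
class Response:
    """Hypothetical reply carrying the result of testing and analysis."""
    component: str
    fault_confirmed: bool
    details: str


class SecondEventHandler:
    """Stand-in for the second event handling module on the selected component."""

    def __init__(self, component, second_database, run_tests):
        self.component = component
        self.second_database = second_database  # component -> known error attributes
        self.run_tests = run_tests              # callable returning True if tests pass

    def handle(self, notification):
        # Analysis: compare the event data with the second database.
        known = self.second_database.get(self.component, frozenset())
        matched = notification.event_attrs & known
        # Testing: run whatever local tests the component offers.
        tests_passed = self.run_tests()
        fault = bool(matched) and not tests_passed
        details = f"matched={sorted(matched)}, tests_passed={tests_passed}"
        return Response(self.component, fault, details)


def carry_out_error_handling(response):
    """Choose a procedure from the response; both outcomes are assumptions."""
    if response.fault_confirmed:
        return "isolate:" + response.component
    return "try_next_candidate"


handler = SecondEventHandler(
    "storage_controller_1",
    {"storage_controller_1": frozenset({"io_timeout", "crc_error"})},
    run_tests=lambda: False,  # simulate a failing test
)
response = handler.handle(ErrorNotification(frozenset({"crc_error"})))
action = carry_out_error_handling(response)
print(response.details, action)
```

Falling back to the next candidate when the fault is not confirmed is one plausible use of the ranking; the claim itself only requires that the procedure be based on the response.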
13. A computer program product for handling events relating to a storage area network (SAN), the computer program product comprising a non-transitory computer readable storage medium having program code embodied therewith, the program code readable/executable by a computer to perform a method comprising:
- receiving, by a first event handling module of a first SAN hardware component of the SAN, a request to investigate an event, the event related to a potential problem with the SAN;
- analyzing, by the first event handling module and based on the received request, whether the first event handling module can directly correct the potential problem;
- determining that the first event handling module is unable to directly correct the potential problem;
- accessing, by the first event handling module and in response to the determination, a first database containing associations between event data and potential sources of errors;
- selecting, by the first event handling module and based on the first database and based on the received request, a second event handling module of a second SAN hardware component of the SAN potentially capable of correcting the potential problem;
- transmitting, by the first event handling module and to the second event handling module, a message that instructs the second event handling module to attempt to correct the potential problem;
- receiving, by the first event handling module, the results of the attempt of the second event handling module; and
- carrying out, by the first event handling module, an error handling procedure based on the results.
Dependent claims: 14, 15, 16, 17, 18, 19
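The delegation flow of claim 13 can be sketched as a single function. The claim covers the path in which the first module finds it cannot correct the problem itself; the local-fix branch is included only to show the analysis step. The function `investigate`, the dictionary-based database, and the returned procedure names are all hypothetical.

```python
def investigate(event_type, local_fixes, first_database, peers):
    """Return the name of the error handling procedure chosen.

    local_fixes:    event type -> callable that corrects the problem locally
    first_database: event type -> name of a peer potentially able to correct it
    peers:          peer name -> callable(event_type) returning True on success
    """
    # Analyze whether this module can directly correct the problem.
    fix = local_fixes.get(event_type)
    if fix is not None:
        fix()
        return "corrected_locally"

    # Unable to correct directly: consult the first database for a peer.
    peer_name = first_database.get(event_type)
    if peer_name is None or peer_name not in peers:
        return "escalate_to_operator"

    # Instruct the peer to attempt the correction and receive its result.
    succeeded = peers[peer_name](event_type)

    # Carry out an error handling procedure based on the result.
    return "log_and_close" if succeeded else "escalate_to_operator"


# Hypothetical peers and database contents.
peers = {
    "storage_controller_1": lambda event_type: True,
    "fabric_switch_2": lambda event_type: False,
}
first_database = {"io_timeout": "storage_controller_1", "link_down": "fabric_switch_2"}

result_ok = investigate("io_timeout", {}, first_database, peers)
result_failed = investigate("link_down", {}, first_database, peers)
print(result_ok, result_failed)
```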
Specification