System, method and program to identify failed components in storage area network
First Claim
1. Apparatus for identifying a failed component in a system comprising application servers, storage servers and a switch fabric, said switch fabric having first ports coupled to said application servers to receive requests to access storage managed by said storage servers, second ports coupled to third ports of said storage servers to forward the access requests to said storage servers and receive responses from said storage servers, and internal switches to interconnect said first ports to second ports to pass said requests and responses through said switch fabric, said apparatus comprising:
means for receiving from each of said application servers records of its own attempts to communicate with said storage servers via said switch fabric, each of said records indicating one of said third ports and one of said storage servers for each of said communication attempts; and
means for determining from said records if any of said communications was successful to one of said storage servers, and if not, determining that said one storage server may have failed, and if so, determining that said one storage server is active and determining from said records if any of said communications was successful to each of said third ports of said one storage server, and if not, determining that said each third port of said one storage server may have failed, and if so, determining that said each third port of said one storage server is active;
and wherein each of said records also indicates one of said first ports and one of said switches for said each communication attempt, and further comprising;
means for determining from said records if any of said communications was successful to each of said switches leading to said each third port of said one storage server, and if not, determining that said each switch may have failed, and if so, determining that said each switch is active and determining from said records if any of said communications was successful to each of said first ports connected to said each switch leading to said one storage server, and if not, determining that said each first port connected to said each switch leading to said one storage server or a connection between one of said application servers and said each first port connected to said each switch leading to said one storage server may have failed, and if so, determining that said each first port connected to said each switch leading to said one storage server is active.
Abstract
System, method and computer program product for identifying a failed component in a system comprising application servers, storage servers and a switch fabric. The switch fabric has first ports coupled to the application servers, second ports coupled to third ports of the storage servers and internal switches to interconnect the first ports to second ports. Each of the application servers compiles records of its own attempts to communicate with the storage servers via the switch fabric. Each of the records indicates one of the third ports and one of the storage servers for each of the communication attempts. From the records a determination is made if any of the communications was successful to one of the storage servers. If not, a determination is made that the one storage server may have failed. If any of the communications was successful to the one storage server, a determination is made that the one storage server is active and a determination is made from the records if any of the communications was successful to each of the third ports of the one storage server. If not, a determination is made that each of the third ports of the one storage server may have failed. If any of the communications was successful to each of the third ports, a determination is made that each of the third ports of the one storage server is active. According to a feature of the present invention, a determination is also made based on the records if a switch or first port has failed.
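The top-down determination described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `Attempt` record layout, the `diagnose` function, the tuple keys and the status strings are all hypothetical names chosen for the example. It also assumes one particular reading of the text, namely that a switch or first port is judged only from attempts on the path to the third port currently being examined.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

ACTIVE = "active"
SUSPECT = "may have failed"


@dataclass(frozen=True)
class Attempt:
    """One communication attempt logged by an application server (hypothetical layout)."""
    app_server: str
    first_port: str      # switch-fabric port facing the application server
    switch: str          # internal switch the attempt traversed
    third_port: str      # port on the storage server
    storage_server: str
    success: bool


def _any_ok(rows: List[Attempt]) -> bool:
    return any(r.success for r in rows)


def diagnose(records: Iterable[Attempt]) -> Dict[Tuple[str, ...], str]:
    """Walk storage server -> third port -> switch -> first port.

    At each level a component is marked ACTIVE if at least one attempt
    through it succeeded; otherwise it is marked SUSPECT and the levels
    beneath it are not examined.
    """
    rows = list(records)
    status: Dict[Tuple[str, ...], str] = {}

    for server in sorted({r.storage_server for r in rows}):
        to_server = [r for r in rows if r.storage_server == server]
        if not _any_ok(to_server):
            status[("storage_server", server)] = SUSPECT
            continue
        status[("storage_server", server)] = ACTIVE

        for port in sorted({r.third_port for r in to_server}):
            to_port = [r for r in to_server if r.third_port == port]
            if not _any_ok(to_port):
                status[("third_port", server, port)] = SUSPECT
                continue
            status[("third_port", server, port)] = ACTIVE

            for switch in sorted({r.switch for r in to_port}):
                via_switch = [r for r in to_port if r.switch == switch]
                if not _any_ok(via_switch):
                    status[("switch", server, port, switch)] = SUSPECT
                    continue
                status[("switch", server, port, switch)] = ACTIVE

                for fport in sorted({r.first_port for r in via_switch}):
                    via_fport = [r for r in via_switch if r.first_port == fport]
                    # SUSPECT here covers the first port itself or the link
                    # between the application server and that first port.
                    key = ("first_port", server, port, switch, fport)
                    status[key] = ACTIVE if _any_ok(via_fport) else SUSPECT

    return status
```

Keying each finding by the path on which it was observed keeps the sketch simple; a real tool would probably merge findings per physical component and treat a component as active if any path through it succeeded.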
6 Claims
1. Apparatus for identifying a failed component in a system comprising application servers, storage servers and a switch fabric, said switch fabric having first ports coupled to said application servers to receive requests to access storage managed by said storage servers, second ports coupled to third ports of said storage servers to forward the access requests to said storage servers and receive responses from said storage servers, and internal switches to interconnect said first ports to second ports to pass said requests and responses through said switch fabric, said apparatus comprising:
means for receiving from each of said application servers records of its own attempts to communicate with said storage servers via said switch fabric, each of said records indicating one of said third ports and one of said storage servers for each of said communication attempts; and
means for determining from said records if any of said communications was successful to one of said storage servers, and if not, determining that said one storage server may have failed, and if so, determining that said one storage server is active and determining from said records if any of said communications was successful to each of said third ports of said one storage server, and if not, determining that said each third port of said one storage server may have failed, and if so, determining that said each third port of said one storage server is active;
and wherein each of said records also indicates one of said first ports and one of said switches for said each communication attempt, and further comprising;
means for determining from said records if any of said communications was successful to each of said switches leading to said each third port of said one storage server, and if not, determining that said each switch may have failed, and if so, determining that said each switch is active and determining from said records if any of said communications was successful to each of said first ports connected to said each switch leading to said one storage server, and if not, determining that said each first port connected to said each switch leading to said one storage server or a connection between one of said application servers and said each first port connected to said each switch leading to said one storage server may have failed, and if so, determining that said each first port connected to said each switch leading to said one storage server is active.
4. A computer program product for identifying a failed component in a system comprising application servers, storage servers and a switch fabric, said switch fabric having first ports coupled to said application servers to receive requests to access storage managed by said storage servers, second ports coupled to third ports of said storage servers to forward the access requests to said storage servers and receive responses from said storage servers, and internal switches to interconnect said first ports to second ports to pass said requests and responses through said switch fabric, said computer program product comprising:
a computer readable medium;
first program instructions to receive from each of said application servers records of its own attempts to communicate with said storage servers via said switch fabric, each of said records indicating one of said third ports and one of said storage servers for each of said communication attempts; and
second program instructions to determine from said records if any of said communications was successful to one of said storage servers, and if not, determine that said one storage server may have failed, and if so, determine that said one storage server is active and determine from said records if any of said communications was successful to each of said third ports of said one storage server, and if not, determine that said each third port of said one storage server may have failed, and if so, determine that said each third port of said one storage server is active; and
wherein each of said records also indicates one of said first ports and one of said switches for said each communication attempt, and further comprising;
third program instructions to determine from said records if any of said communications was successful to each of said switches leading to said each third port of said one storage server, and if not, determine that said each switch may have failed, and if so, determine that said each switch is active and determine from said records if any of said communications was successful to each of said first ports connected to said each switch leading to said one storage server, and if not, determine that said each first port connected to said each switch leading to said one storage server or a connection between one of said application servers and said each first port connected to said each switch leading to said one storage server may have failed, and if so, determine that said each first port connected to said each switch leading to said one storage server is active; and
wherein said first, second and third program instructions are recorded on said medium.
Specification