System, method and program to identify failed components in storage area network
Abstract
System, method and computer program product for identifying a failed component in a system comprising application servers, storage servers and a switch fabric. The switch fabric has first ports coupled to the application servers, second ports coupled to third ports of the storage servers and internal switches to interconnect the first ports to second ports. Each of the application servers compiles records of its own attempts to communicate with the storage servers via the switch fabric. Each of the records indicates one of the third ports and one of the storage servers for each of the communication attempts. From the records a determination is made if any of the communications was successful to one of the storage servers. If not, a determination is made that the one storage server may have failed. If any of the communications was successful to the one storage server, a determination is made that the one storage server is active and a determination is made from the records if any of the communications was successful to each of the third ports of the one storage server. If not, a determination is made that each of the third ports of the one storage server may have failed. If any of the communications was successful to each of the third ports, a determination is made that each of the third ports of the one storage server is active. According to a feature of the present invention, a determination is also made based on the records if a switch or first port has failed.
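The two-level determination described in the abstract can be illustrated with a minimal sketch. The record layout (`AttemptRecord`), the function name `diagnose`, and the status strings are assumptions made for illustration; the source does not specify a data format or an implementation. The sketch covers only the storage server and third-port determinations, not the switch or first-port determination mentioned in the last sentence.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AttemptRecord:
    """One communication attempt logged by an application server (hypothetical layout)."""
    app_server: str      # application server that made the attempt
    storage_server: str  # storage server that was addressed
    storage_port: str    # "third port" of that storage server
    success: bool        # whether the communication succeeded


def diagnose(records):
    """Derive storage server and port status from the compiled attempt records."""
    by_server = defaultdict(list)
    for rec in records:
        by_server[rec.storage_server].append(rec)

    server_status = {}
    port_status = {}
    for server, recs in by_server.items():
        # No successful communication at all: the storage server may have failed.
        if not any(r.success for r in recs):
            server_status[server] = "may have failed"
            continue
        # At least one success: the server is active, so examine each of its ports.
        server_status[server] = "active"
        for port in sorted({r.storage_port for r in recs}):
            ok = any(r.success for r in recs if r.storage_port == port)
            port_status[(server, port)] = "active" if ok else "may have failed"
    return server_status, port_status
```

In this sketch the ports of a storage server are evaluated only after the server itself has been found active, which mirrors the order of the determinations in the abstract. Records from several application servers can be concatenated into one list before calling `diagnose`.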
18 Claims
1. A method for identifying a failed component in a system comprising application servers, storage servers and a switch fabric, said switch fabric having first ports coupled to said application servers to receive requests to access storage managed by said storage servers, second ports coupled to third ports of said storage servers to forward the access requests to said storage servers and receive responses from said storage servers, and internal switches to interconnect said first ports to second ports to pass said requests and responses through said switch fabric, said method comprising:
each of said application servers compiling records of its own attempts to communicate with said storage servers via said switch fabric, each of said records indicating one of said third ports and one of said storage servers for each of said communication attempts; and
determining from said records if any of said communications was successful to one of said storage servers, and if not, determining that said one storage server may have failed, or if so, determining that said one storage server is active and determining from said records if any of said communications was successful to each of said third ports of said one storage server, and if not, determining that said each third port of said one storage server may have failed, or if so, determining that said each third port of said one storage server is active. - View Dependent Claims (2, 3, 4, 5, 6)
7. Apparatus for identifying a failed component in a system comprising application servers, storage servers and a switch fabric, said switch fabric having first ports coupled to said application servers to receive requests to access storage managed by said storage servers, second ports coupled to third ports of said storage servers to forward the access requests to said storage servers and receive responses from said storage servers, and internal switches to interconnect said first ports to second ports to pass said requests and responses through said switch fabric, said apparatus comprising:
means for receiving from each of said application servers records of its own attempts to communicate with said storage servers via said switch fabric, each of said records indicating one of said third ports and one of said storage servers for each of said communication attempts; and
means for determining from said records if any of said communications was successful to one of said storage servers, and if not, determining that said one storage server may have failed, or if so, determining that said one storage server is active and determining from said records if any of said communications was successful to each of said third ports of said one storage server, and if not, determining that said each third port of said one storage server may have failed, or if so, determining that said each third port of said one storage server is active. - View Dependent Claims (8, 9, 10, 11, 12)
13. A computer program product for identifying a failed component in a system comprising application servers, storage servers and a switch fabric, said switch fabric having first ports coupled to said application servers to receive requests to access storage managed by said storage servers, second ports coupled to third ports of said storage servers to forward the access requests to said storage servers and receive responses from said storage servers, and internal switches to interconnect said first ports to second ports to pass said requests and responses through said switch fabric, said computer program product comprising:
a computer readable medium;
first program instructions to receive from each of said application servers records of its own attempts to communicate with said storage servers via said switch fabric, each of said records indicating one of said third ports and one of said storage servers for each of said communication attempts; and
second program instructions to determine from said records if any of said communications was successful to one of said storage servers, and if not, determine that said one storage server may have failed, or if so, determine that said one storage server is active and determine from said records if any of said communications was successful to each of said third ports of said one storage server, and if not, determine that said each third port of said one storage server may have failed, or if so, determine that said each third port of said one storage server is active; and
wherein said first and second program instructions are recorded on said medium. - View Dependent Claims (14, 15, 16, 17, 18)
Specification