Method for providing a fault tolerant network using distributed server processes to remap clustered network resources to other servers during server failure
Abstract
The method of the current invention provides fault tolerant access to a network resource. A replicated network directory database operates in conjunction with server resident processes to remap a network resource in the event of a server failure. The records/objects in the replicated database contain, for each network resource, a primary and a secondary server affiliation. Initially, all users access a network resource through the server identified in the replicated database as being the primary server for the network resource. When server resident processes detect a failure of the primary server, the replicated database is updated to reflect the failure of the primary server, and to change the affiliation of the network resource from its primary to its backup server. This remapping occurs transparently to whichever user/client is accessing the network resource. As a result of the remapping, all users access the network resource through the server identified in the replicated database as the backup server for the resource. When the server resident processes detect a return to service of the primary server, the replicated database is again updated to reflect the resumed operation of the primary server. This remapping of network resource affiliations also occurs transparently to whichever user/client is accessing the network resource, and returns the resource to its original fault tolerant state.
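The remapping described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the record layout, the `host` field, and the function names below are assumptions chosen for illustration, and replication of the database across servers is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ResourceRecord:
    """One record of a hypothetical network resource database."""
    name: str
    primary: str    # server that normally provides access to the resource
    backup: str     # server that takes over when the primary fails
    host: str = ""  # current affiliation; clients are routed through it

    def __post_init__(self) -> None:
        # Initially every resource is reached through its primary server.
        if not self.host:
            self.host = self.primary


def on_server_failure(db: Dict[str, ResourceRecord], failed: str) -> List[str]:
    """Remap every resource hosted by the failed primary to its backup."""
    remapped = []
    for rec in db.values():
        if rec.primary == failed and rec.host == failed:
            rec.host = rec.backup
            remapped.append(rec.name)
    return remapped


def on_server_recovery(db: Dict[str, ResourceRecord], recovered: str) -> List[str]:
    """Return every resource whose primary has recovered to that primary."""
    remapped = []
    for rec in db.values():
        if rec.primary == recovered and rec.host != recovered:
            rec.host = rec.primary
            remapped.append(rec.name)
    return remapped


def route(db: Dict[str, ResourceRecord], resource: str) -> str:
    """Server through which a client currently reaches the resource."""
    return db[resource].host
```

In this sketch a client only ever calls `route`, so a change of affiliation is invisible to it, which loosely mirrors the transparency the abstract describes.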
28 Claims
1. A method for fault tolerant access to a network resource, on a network with a client workstation and a first and a second server, said method for fault tolerant access comprising the acts of:
selecting a first server to provide communications between a client workstation and a network resource;
detecting a failure of the first server, comprising the acts of;
monitoring across a common bus, at a second server, communications between the first server and the network resource across the common bus by noting a continual change in state of the network resource, and observing a termination in the communications between the first server and the network resource across the common bus by noting a stop in the continual change in state of the network resource; and
routing communications between the client workstation and the network resource via the second server.
2. The method for fault tolerant access to a network resource of claim 1, further comprising the acts of:
identifying in a first record, the primary server for the network resource as the first server;
discovering a recovery of the first server; and
re-routing communications between the client workstation and the network resource via the first server.
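The monitoring act of claim 1, in which the second server notes a continual change in state of the network resource and treats a stop in that change as a failure of the first server, can be sketched as follows. This is an illustration under stated assumptions: the sampled `state` value, the `timeout` threshold, and the class name are hypothetical, and the claim does not specify how the state is read from the common bus.

```python
class StateChangeMonitor:
    """Flags a presumed failure when the observed state stops changing."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout      # assumed quiet period that counts as a stop
        self._last_state = None
        self._last_change = None    # time of the most recent observed change

    def observe(self, state, now: float) -> bool:
        """Record one sample; return True if the first server is presumed failed."""
        if self._last_change is None or state != self._last_state:
            # First sample, or the state changed: communications are ongoing.
            self._last_state = state
            self._last_change = now
            return False
        # State unchanged: failed only once the quiet period reaches the timeout.
        return (now - self._last_change) >= self.timeout
```

Passing the timestamp in as `now` rather than reading a clock inside the class keeps the sketch deterministic and easy to exercise.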
3. The method for fault tolerant access to a network resource of claim 2, further comprising:
providing a network resource database; and
replicating the network resource database on the first and the second servers.
4. The method for fault tolerant access to a network resource of claim 2, wherein said act of detecting a recovery of the first server, further includes the acts of:
sending packets intermittently from the second server to the first server; and
re-acquiring acknowledgments from the first server at the second server, the acknowledgments responsive to said sending act and to the recovery of said first server.
5. The method for fault tolerant access to a network resource of claim 1, further comprising:
choosing the first server as the primary server and the second server as the backup server, for the network resource; and
storing in a first field of the first record the primary server for the network resource and storing in a second field of the first record the backup server for the network resource.
6. The method for fault tolerant access to a network resource of claim 5, wherein said choosing act, includes the act of:
allowing a network administrator to select the primary and the backup server.
7. The method for fault tolerant access to a network resource of claim 5, wherein said act of detecting a failure of the first server, further includes the acts of:
reading the second field in the first record of the network resource database;
determining on the basis of said reading act that the second field identifies the backup server for the network resource as the second server;
activating the monitoring by the second server of the first server, in response to said determining act; and
ascertaining at the second server a failure of the first server.
8. The method for fault tolerant access to a network resource of claim 6, wherein said act of ascertaining at the second server a failure of the first server, further includes the acts of:
sending packets intermittently from the second server to the first server;
receiving acknowledgments from the first server at the second server, the acknowledgments responsive to said sending act; and
noticing a termination in the receipt of acknowledgments from the first server.
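The three acts of claim 8 (sending packets intermittently, receiving acknowledgments, and noticing a termination in their receipt) resemble a conventional heartbeat check. The sketch below is one possible reading, not the patented implementation: `send_probe` stands in for whatever transport carries the packets, and the `max_missed` threshold is an assumption, since the claim does not say how many missed acknowledgments constitute a termination.

```python
from typing import Callable


class HeartbeatWatcher:
    """Runs on the second server and probes the first server."""

    def __init__(self, send_probe: Callable[[], bool], max_missed: int = 3) -> None:
        # send_probe() sends one packet and returns True if it was acknowledged.
        self.send_probe = send_probe
        self.max_missed = max_missed  # assumed threshold, not from the claims
        self.missed = 0               # consecutive probes without an acknowledgment

    def tick(self) -> bool:
        """Send one probe; return True once acknowledgments have terminated."""
        if self.send_probe():
            self.missed = 0
        else:
            self.missed += 1
        return self.missed >= self.max_missed
```

Calling `tick` on a timer gives the intermittent sending; a later acknowledged probe resets the counter, so the same loop can also notice when acknowledgments resume.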
9. The method for fault tolerant access to a network resource of claim 5, wherein said act of recognizing the backup server for the network resource, further includes the acts of:
reading the second field in the first record of the network resource database; and
determining that the second field identifies the backup server for the network resource as the second server.
10. A program storage device encoding instructions for:
causing a computer to provide a network resource database, the database including individual records corresponding to network resources, and the network resource database including a first record corresponding to the network resource and the first record identifying a primary server for the network resource as a first server;
causing a computer to select, on the basis of the first record, the first server to provide communications between a client workstation and the network resource;
causing a computer to recognize the backup server for the network resource as the second server;
causing a computer to detect a failure of the first server, including;
causing a computer to monitor across a common bus, at the second server, communications between the first server and the network resource across the common bus by noting a continual change in state of the network resource, and causing a computer to observe a termination in the communications between the first server and the network resource across the common bus by noting a stop in the continual change in state of the network resource; and
causing a computer to route communications between the client workstation and the network resource via the second server, responsive to said recognizing and detecting acts.
11. The program storage device of claim 10, further including instructions for:
causing a computer to identify in the first record, the primary server for the network resource as the first server;
causing a computer to discover a recovery of the first server; and
causing a computer to re-route communications between the client workstation and the network resource via the first server, responsive to said identifying and discovering acts.
12. The program storage device of claim 11, further including instructions for:
causing a computer to replicate a network resource database on the first and the second server.
13. The program storage device of claim 11, wherein said instructions for causing a computer to detect a recovery of the first server, further includes instructions for:
causing a computer to send packets intermittently from the second server to the first server; and
causing a computer to acquire acknowledgments from the first server at the second server, the acknowledgments responsive to said sending act and to the recovery of the first server.
14. The program storage device of claim 13, further including instructions for:
causing a computer to choose the first server as the primary server and the second server as the backup server, for the network resource; and
causing a computer to store in a first field of the first record the primary server for the network resource and to store in a second field of the first record the backup server for the network resource.
15. The program storage device of claim 14, wherein said instructions for causing a computer to choose, further include:
causing a computer to allow a network administrator to select the primary and the backup server.
16. The program storage device of claim 14, wherein said instructions for causing a computer to detect a failure of the first server further include instructions for:
causing a computer to read the second field in the first record of the network resource database;
causing a computer to determine on the basis of said reading act that the second field identifies the backup server for the network resource as the second server;
causing a computer to activate the monitoring by the second server of the first server, in response to said determining act; and
causing a computer to ascertain at the second server a failure of the first server.
17. The program storage device of claim 16, wherein said instructions for causing a computer to ascertain at the second server a failure of the first server, further includes instructions for:
causing a computer to send packets intermittently from the second server to the first server;
causing a computer to receive acknowledgments from the first server at the second server, the acknowledgments responsive to said sending act; and
causing a computer to notice a termination in the receipt of acknowledgments from the first server.
18. The program storage device of claim 14, wherein said instructions for causing a computer to recognize the backup server for the network resource, further include instructions for:
causing a computer to read the second field in the first record of the network resource database; and
causing a computer to determine on the basis of said reading act that the second field identifies the backup server for the network resource as the second server.
19. A method for providing fault tolerant access to a network resource, on a network with a client workstation and a first and a second server and a network resource database, wherein the network resource database includes a first record corresponding to a network resource and the first record includes a first field containing the name of the network resource and a second field containing the host server affiliation of the network resource;
said method for fault tolerant access comprising the acts of;
expanding the network resource database to include a third field for naming the primary server affiliation for the network resource and a fourth field for naming the backup server affiliation for the network resource;
naming the first server in the third field;
selecting, on the basis of the first record, the first server to provide communications between the client workstation and the network resource;
naming the second server in the fourth field;
recognizing, on the basis of the fourth field of the first record, the backup server for the network resource as the second server;
detecting a failure of the first server, including the acts of;
monitoring across a common bus, at the second server, communications between the first server and the network resource across the common bus by noting a continual change in state of the network resource, and observing a termination in the communications between the first server and the network resource across the common bus by noting a stop in the continual change in state of the network resource; and
routing communications between the client workstation and the network resource via the second server, responsive to said recognizing and detecting acts.
20. The method for fault tolerant access to a network resource of claim 19, further comprising the acts of:
monitoring the server named in the third field;
discovering a recovery of the server named in the third field; and
re-routing communications between the client workstation and the network resource via the first server, responsive to said monitoring and discovering acts.
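The database expansion recited in claim 19, in which a record holding a resource name and a host server affiliation gains a third field for the primary server and a fourth field for the backup server, can be sketched with a plain dictionary. The key names `primary_server` and `backup_server` and both helper functions are hypothetical; the claims do not prescribe a storage format.

```python
from typing import Dict


def expand_record(record: Dict[str, str], primary: str, backup: str) -> Dict[str, str]:
    """Return a copy of a two-field record with primary and backup fields added."""
    expanded = dict(record)                # leave the original record untouched
    expanded["primary_server"] = primary   # plays the role of the third field
    expanded["backup_server"] = backup     # plays the role of the fourth field
    return expanded


def is_backup_for(record: Dict[str, str], server: str) -> bool:
    """True if the fourth field names this server as the resource's backup."""
    return record.get("backup_server") == server
```

A server could call `is_backup_for` on each record it reads to decide which primaries it needs to monitor, which is one way to read the recognizing act of claim 19.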
21. The method for fault tolerant access to a network resource of claim 20, wherein said act of discovering a recovery of the first server, further includes the acts of:
sending packets intermittently from the second server to the first server; and
re-acquiring acknowledgments from the first server at the second server, the acknowledgments responsive to said sending act and to the recovery of said first server.
22. The method for fault tolerant access to a network resource of claim 19, wherein said naming acts, include the acts of:
allowing a network administrator to name the primary server affiliation and the backup server affiliation in the third and fourth fields of the first record of the network resource database.
23. The method for fault tolerant access to a network resource of claim 19, wherein said act of detecting a failure of the first server, further includes the acts of:
sending packets intermittently from the second server to the first server;
receiving acknowledgments from the first server at the second server, the acknowledgments responsive to said sending act; and
noticing a termination in the receipt of acknowledgments from the first server.
24. A computer usable medium having computer readable program code means embodied therein for causing fault tolerant access to a network resource on a network with a client workstation and a first and second server, and a network resource database, wherein the network resource database includes a first record corresponding to a network resource and the first record includes a first field containing the name of the network resource and a second field containing the host server affiliation of the network resource;
the computer readable program code means in said article of manufacture comprising;
computer readable program code means for causing a computer to expand the network resource database to include a third field for naming the primary server affiliation for the network resource and a fourth field for naming the backup server affiliation for the network resource;
computer readable program code means for causing a computer to name the first server in the third field;
computer readable program code means for causing a computer to select, on the basis of the first record, the first server to provide communications between the client workstation and the network resource;
computer readable program code means for causing a computer to name the second server in the fourth field;
computer readable program code means for causing a computer to recognize, on the basis of the fourth field of the first record, the backup server for the network resource as the second server;
computer readable program code means for monitoring across a common bus, at a second server, communications between the first server and the network resource across the common bus by noting a continual change in state of the network resource, and observing a termination in the communications between the first server and the network resource across the common bus by noting a stop in the continual change in state of the network resource; and
computer readable program code means for causing a computer to route communications between the client workstation and the network resource via the second server, responsive to said recognizing and detecting acts.
25. The computer readable program code means in said article of manufacture of claim 24, further comprising:
computer readable program code means for causing a computer to monitor the server named in the third field;
computer readable program code means for causing a computer to discover a recovery of the server named in the third field; and
computer readable program code means for causing a computer to re-route communications between the client workstation and the network resource via the first server, responsive to said monitoring and discovering acts.
26. The computer readable program code means in said article of manufacture of claim 25, wherein said computer readable program code means for causing a computer to discover a recovery, further includes:
computer readable program code means for causing a computer to send packets intermittently from the second server to the first server; and
computer readable program code means for causing a computer to re-acquire acknowledgments from the first server at the second server, the acknowledgments responsive to said sending act and to the recovery of said first server.
27. The computer readable program code means in said article of manufacture of claim 24, wherein said computer readable program code means for causing a computer to name, further includes:
computer readable program code means for causing a computer to allow a network administrator to name the primary server affiliation and the backup server affiliation in the third and fourth fields of the first record of the network resource database.
28. The computer readable program code means in said article of manufacture of claim 24, wherein said computer readable program code means for causing a computer to detect a failure, further includes:
computer readable program code means for causing a computer to send packets intermittently from the second server to the first server;
computer readable program code means for causing a computer to receive acknowledgments from the first server at the second server, the acknowledgments responsive to said sending act; and
computer readable program code means for causing a computer to notice a termination in the receipt of acknowledgments from the first server.
Specification