System for clustering software applications
Abstract
A system for fault tolerant execution of an application program in a server network, which includes: a first server for executing the application program; a cluster network database, coupled to the first server; an object, stored in the cluster network database, which represents the program and contains information pertaining to the program; a failure detection module which detects a failure of the first server; a second server, coupled to the cluster network database; and a failover module which loads the application program in the second server upon detection of the failure of the first server. The information contained within the object includes: a host server attribute which identifies which server is currently executing the program; a primary server attribute which identifies which server is primarily responsible for executing the program; and a backup server attribute which identifies which server is a backup server for executing the program if the primary server experiences a failure.
387 Citations
110 Claims
-
1. A system for fault tolerant execution of an application program in a server network, comprising:
-
a first server for executing the application program;
a cluster network database, coupled to the first server;
an object, stored in the cluster network database, which represents the program and contains information pertaining to the program;
a failure detection module which detects a failure of the first server;
a second server, coupled to the cluster network database; and
a failover module which loads the application program in the second server upon detection of the failure of the first server, in accordance with the information contained in the object.
-
-
2. The system of claim 1 wherein the information comprises:
-
a host server attribute which identifies which server is currently executing the program;
a primary server attribute which identifies which server is primarily responsible for executing the program; and
a backup server attribute which identifies which server is a backup server for executing the program if the primary server experiences a failure.
-
-
3. The system of claim 2 wherein the information further comprises:
-
an identification field which identifies the program;
a program type field which indicates whether the program is cluster capable or cluster aware; and
a command field which controls a protocol for loading the program and subsequently executing the program.
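For illustration only, the object information recited in claims 2 and 3 could be modelled as a simple record. The following is a minimal sketch in Python; the class name `ClusterObject`, the field names, the enum values, and the sample data are assumptions introduced here and do not appear in the claims.

```python
from dataclasses import dataclass
from enum import Enum


class ProgramType(Enum):
    # The two program types named in claim 3.
    CLUSTER_CAPABLE = "cluster capable"
    CLUSTER_AWARE = "cluster aware"


@dataclass
class ClusterObject:
    """Hypothetical record for the object stored in the cluster network database."""
    program_id: str            # identification field
    program_type: ProgramType  # program type field
    load_command: str          # command field controlling loading and execution
    host_server: str           # server currently executing the program
    primary_server: str        # server primarily responsible for the program
    backup_server: str         # server that takes over if the primary fails


# Example: a program running on its primary server, with a named backup.
obj = ClusterObject(
    program_id="mail-service",
    program_type=ProgramType.CLUSTER_CAPABLE,
    load_command="load mail-service",  # placeholder command string
    host_server="server-1",
    primary_server="server-1",
    backup_server="server-2",
)
```

In this sketch the host and primary attributes hold the same name while the program runs normally and differ only while a backup is hosting it.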
-
-
4. The system of claim 2 wherein the failover module comprises:
-
a backup status module which reads the backup server attribute in the object with the second server and determines whether the backup server attribute names the second server as the backup server; and
a backup loading module which loads the program in the second server, if the backup server attribute names the second server as the backup server.
-
-
5. The system of claim 4 further comprising a host status module which changes the host server attribute to name the second server as the host server of the program.
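The backup status, backup loading, and host status steps of claims 4 and 5 can be sketched as a single function. This is an illustrative assumption, not the claimed implementation: the object is represented as a plain dictionary, and `fail_over` and `load_program` are hypothetical names.

```python
from typing import Callable


def fail_over(obj: dict, this_server: str,
              load_program: Callable[[str], None]) -> bool:
    """Load the program on this_server only if the object names it as backup."""
    # Backup status step: read the backup server attribute from the object.
    if obj["backup_server"] != this_server:
        return False
    # Backup loading step: load the program on the backup server.
    load_program(obj["program_id"])
    # Host status step: record the backup server as the current host.
    obj["host_server"] = this_server
    return True


loaded = []
obj = {"program_id": "mail-service", "host_server": "server-1",
       "primary_server": "server-1", "backup_server": "server-2"}
took_over = fail_over(obj, "server-2", loaded.append)
# A server that is not named as the backup leaves the program alone.
ignored = fail_over(dict(obj), "server-3", loaded.append)
```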
-
6. The system of claim 5 further comprising:
-
a phoenix module which detects when the first server is once again operational; and
a failback module which resumes execution of the program in the first server upon detecting that the first server is once again operational.
-
-
7. The system of claim 6 wherein the phoenix module comprises:
-
a monitoring module which transmits packets at periodic intervals from the second server to the first server; and
an acknowledgment module which waits for an acknowledgement signal in response to each packet for a specified period of time, wherein if the acknowledgement signal is received within the specified period of time, the first server is determined to be operational.
-
-
8. The system of claim 7 wherein the host status module changes the host server attribute to name the first server as the host server of the program after it is determined that the first server is once again operational.
-
9. The system of claim 8 wherein the failback module comprises:
-
a verification module which verifies that the program has been unloaded from the second server; and
a primary loading module which loads the program in a random access memory in the first server after the program has been unloaded from the second server.
-
-
10. The system of claim 9 wherein the verification module comprises a reading module which reads the host server attribute and determines that the host server attribute indicates the first server as the host server of the program.
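One possible reading of the failback sequence in claims 6 through 10 is sketched below. It assumes that the second server rewrites the host server attribute only after it has unloaded the program, so that reading the attribute on the first server serves as the verification; the claims do not fix that ordering. The function names and the dictionary layout are hypothetical.

```python
from typing import Callable


def release_on_backup(obj: dict, unload_program: Callable[[str], None]) -> None:
    """Second server: unload the program, then name the primary as host again."""
    unload_program(obj["program_id"])
    obj["host_server"] = obj["primary_server"]


def resume_on_primary(obj: dict, this_server: str,
                      load_program: Callable[[str], None]) -> bool:
    """First server: load the program only once the host attribute names it."""
    # Verification step: read the host server attribute.
    if obj["host_server"] != this_server:
        return False  # the backup has not released the program yet
    # Primary loading step.
    load_program(obj["program_id"])
    return True


events = []
obj = {"program_id": "mail-service", "host_server": "server-2",
       "primary_server": "server-1", "backup_server": "server-2"}
too_early = resume_on_primary(obj, "server-1", lambda p: events.append(("load", p)))
release_on_backup(obj, lambda p: events.append(("unload", p)))
resumed = resume_on_primary(obj, "server-1", lambda p: events.append(("load", p)))
```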
-
11. The system of claim 1 wherein the failure detection module comprises:
-
a monitoring module which transmits packets at periodic intervals from the second server to the first server; and
an acknowledgment module which waits for an acknowledgement packet in response to each packet for a specified period of time, wherein if the acknowledgement packet is not received within the specified period of time, the failure of the first server is detected.
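The packet and acknowledgement scheme of claim 11 can be sketched as a probe loop. The transport is deliberately abstracted: `send_probe` is a hypothetical hook that sends one packet to the first server and reports whether an acknowledgement arrived within the timeout. The function name, parameters, and the simulated replies are assumptions made for this example.

```python
import time
from typing import Callable


def detect_failure(send_probe: Callable[[float], bool], timeout_s: float,
                   interval_s: float, max_probes: int) -> bool:
    """Return True as soon as one probe is not acknowledged within timeout_s."""
    for _ in range(max_probes):
        if not send_probe(timeout_s):
            return True          # no acknowledgement in time: failure detected
        time.sleep(interval_s)   # wait for the next periodic interval
    return False                 # every probe was acknowledged


# Simulated first server that answers twice and then goes silent.
replies = iter([True, True, False])
failed = detect_failure(lambda timeout_s: next(replies),
                        timeout_s=0.5, interval_s=0.0, max_probes=3)
```

The same loop with the test inverted (an acknowledgement received means operational) would correspond to the detection of a recovered first server recited in claim 7.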
-
-
12. The system of claim 1 wherein the failure detection module comprises:
-
a monitoring module which monitors communications between the first server and a network resource; and
a termination module which detects a termination in the communication between the first server and the network resource.
-
-
13. The system of claim 1 wherein the failure detection module comprises:
-
a command module which successively transmits first and second command signals from the first server to a device coupled to the first server, wherein the first command signal places the device in a first status condition and the second command signal places the device in a second status condition; and
a monitoring module which monitors a status condition of the device with the second server, coupled to the device, wherein a change in the status condition of the device indicates that the first server is operational and a constant status condition indicates the failure of the first server.
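The device-based detection of claim 13 reduces to checking whether the observed status condition ever changes. The sketch below assumes the second server has collected a short series of status samples, taken often enough to observe the alternation between the two status conditions; the function name and the integer encoding of the status conditions are hypothetical.

```python
from typing import Sequence


def first_server_operational(samples: Sequence[int]) -> bool:
    """A change between consecutive samples means the first server is still
    issuing its alternating command signals; a constant series means failure."""
    return any(a != b for a, b in zip(samples, samples[1:]))


alive = first_server_operational([0, 1, 0, 1])        # status keeps changing
failed = not first_server_operational([1, 1, 1, 1])   # status is constant
```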
-
-
14. The system of claim 1 further comprising:
-
a phoenix module which detects when the first server is once again operational; and
a failback module which resumes execution of the program in the first server upon detecting that the first server is once again operational.
-
-
15. The system of claim 14 wherein the phoenix module comprises:
-
a monitoring module which transmits packets at periodic intervals from the second server to the first server; and
an acknowledgment module which waits for an acknowledgement signal in response to each packet for a specified period of time, wherein if the acknowledgement signal is received within the specified period of time, the first server is determined to be operational.
-
-
16. The system of claim 14 wherein the failback module comprises:
-
a verification module which verifies that the program has been unloaded from the second server; and
a primary loading module which loads the program in a random access memory in the first server after the program has been unloaded from the second server.
-
-
17. The system of claim 1 further comprising a registration module which automatically stores the object in the cluster network database, wherein the registration module is located within the program.
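As a rough illustration of the registration module of claim 17, a program could store its own object when it starts. The sketch assumes the cluster network database behaves like a key-value mapping and that an existing object is left untouched; `register`, `INFO`, and the sample values are hypothetical.

```python
def register(database: dict, program_id: str, info: dict) -> dict:
    """Store the program's object in the database unless one already exists."""
    return database.setdefault(program_id, dict(info, program_id=program_id))


# Information that, in this sketch, ships inside the program itself.
INFO = {"host_server": "server-1", "primary_server": "server-1",
        "backup_server": "server-2"}

database = {}
first = register(database, "mail-service", INFO)
# Registering again returns the stored object and does not overwrite it.
again = register(database, "mail-service", {"host_server": "server-9"})
```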
-
18. The system of claim 17 wherein the information comprises:
-
a host server attribute which identifies which server is currently executing the program;
a primary server attribute which identifies which server is primarily responsible for executing the program; and
a backup server attribute which identifies which server is a backup server for executing the program if the primary server experiences a failure.
-
-
19. The system of claim 18 wherein the information further comprises:
-
an identification field which identifies the program;
a program type field which indicates whether the program is cluster capable or cluster aware; and
a command field which controls a protocol for loading the program and subsequently executing the program.
-
-
20. The system of claim 17 wherein the failure detection module comprises:
-
a monitoring module which transmits packets at periodic intervals from the second server to the first server; and
an acknowledgment module which waits for an acknowledgement packet in response to each packet for a specified period of time, wherein if the acknowledgement packet is not received within the specified period of time, the failure of the first server is detected.
-
-
21. The system of claim 17 wherein the failure detection module comprises:
-
a monitoring module which monitors communications between the first server and a network resource; and
a termination module which detects a termination in the communication between the first server and the network resource.
-
-
22. The system of claim 17 wherein the failure detection module comprises:
-
a command module which successively transmits first and second command signals from the first server to a device coupled to the first server, wherein the first command signal places the device in a first status condition and the second command signal places the device in a second status condition; and
a monitoring module which monitors a status condition of the device with the second server, coupled to the device, wherein a change in the status condition of the device indicates that the first server is operational and a constant status condition indicates the failure of the first server.
-
-
23. The system of claim 17 wherein the failover module comprises:
-
a backup status module which reads the backup server attribute in the object with the second server and determines whether the backup server attribute names the second server as the backup server; and
a backup loading module which loads the program in the second server, if the backup server attribute names the second server as the backup server.
-
-
24. The system of claim 23 further comprising a host status module which changes the host server attribute to name the second server as the host server of the program.
-
25. The system of claim 23 further comprising:
-
a phoenix module which detects when the first server is once again operational; and
a failback module which resumes execution of the program in the first server upon detecting that the first server is once again operational.
-
-
26. The system of claim 25 wherein the phoenix module comprises:
-
a monitoring module which transmits packets at periodic intervals from the second server to the first server; and
an acknowledgment module which waits for an acknowledgement signal in response to each packet for a specified period of time, wherein if the acknowledgement signal is received within the specified period of time, the first server is determined to be operational.
-
-
27. The system of claim 26 wherein the host status module changes the host server attribute to name the first server as the host server of the program after it is determined that the first server is once again operational.
-
28. The system of claim 27 wherein the failback module comprises:
-
a primary loading module which loads the program in a random access memory in the first server;
a pause module which pauses execution of the program in the first server until it is verified that the program has been unloaded from the second server; and
a verification module which verifies that the program has been unloaded from the second server.
-
-
29. The system of claim 28 wherein the verification module comprises a reading module which reads the host server attribute and determines that the host server attribute indicates the first server as the host server of the program.
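Claims 28 and 29 order the failback differently from claim 9: the program is loaded on the first server first and then held until the unload is verified. A minimal polling sketch of that ordering follows; `load_then_wait`, its callbacks, and the polling limits are assumptions made for this example.

```python
import time
from typing import Callable


def load_then_wait(read_host: Callable[[], str], this_server: str,
                   load: Callable[[], None], start: Callable[[], None],
                   poll_s: float = 0.0, max_polls: int = 10) -> bool:
    """Load the program, then pause until the host attribute names this server."""
    load()                                  # primary loading step
    for _ in range(max_polls):
        if read_host() == this_server:      # verification via the host attribute
            start()                         # end of pause: begin execution
            return True
        time.sleep(poll_s)                  # pause step
    return False                            # unload never verified


events = []
hosts = iter(["server-2", "server-2", "server-1"])
resumed = load_then_wait(lambda: next(hosts), "server-1",
                         lambda: events.append("load"),
                         lambda: events.append("start"))
```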
-
30. The system of claim 17 further comprising:
-
a phoenix module which detects when the first server is once again operational; and
a failback module which resumes execution of the program in the first server upon detecting that the first server is once again operational.
-
-
31. The system of claim 30 wherein the phoenix module comprises:
-
a monitoring module which transmits packets at periodic intervals from the second server to the first server; and
an acknowledgment module which waits for an acknowledgement signal in response to each packet for a specified period of time, wherein if the acknowledgement signal is received within the specified period of time, the first server is determined to be operational.
-
-
32. The system of claim 30 wherein the failback module comprises:
-
a primary loading module which loads the program in a random access memory in the first server;
a pause module which pauses execution of the program in the first server until it is verified that the program has been unloaded from the second server; and
a verification module which verifies that the program has been unloaded from the second server.
-
-
33. The system of claim 32 wherein the primary loading module, the pause module and the verification module are contained within the program.
-
34. The system of claim 32 wherein the verification module comprises a reading module which reads a host server attribute within the object and determines whether the host server attribute indicates the first server is the host server of the program.
-
35. The system of claim 30 further comprising:
-
a first marker-set module which sends a first marker to an application specific file in the database, wherein the first marker identifies a first location within the program where execution of the program by the first server ceased;
a first marker-read module which reads the first marker from the application specific file and directs the second server to commence execution of the program at the first location;
a second marker-set module which sends a second marker to the application specific file in the database, wherein the second marker identifies a second location within the program where execution of the program by the second server ceased; and
a second marker-read module which reads the second marker from the application specific file and directs the first server to commence execution of the program at the second location.
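The marker-set and marker-read modules of claim 35 can be illustrated with a program modelled as a list of steps and an application specific file modelled as a dictionary. Both models, the function name, and the `stop_after` parameter (used only to simulate a server ceasing execution) are assumptions made for this sketch.

```python
from typing import List, Optional


def run_from_marker(steps: List[str], marker_file: dict, key: str,
                    stop_after: Optional[int] = None) -> List[str]:
    """Execute steps starting at the stored marker and record progress."""
    position = marker_file.get(key, 0)        # marker-read: where to commence
    done = []
    while position < len(steps):
        if stop_after is not None and len(done) == stop_after:
            break                             # simulated stop of this server
        done.append(steps[position])          # "execute" one step
        position += 1
        marker_file[key] = position           # marker-set: pointer follows execution
    return done


steps = ["a", "b", "c", "d", "e"]
marker_file = {}
on_first = run_from_marker(steps, marker_file, "job", stop_after=2)   # first server
on_second = run_from_marker(steps, marker_file, "job", stop_after=2)  # after failover
back_on_first = run_from_marker(steps, marker_file, "job")            # after failback
```

Each run picks up at the location where the previous run ceased, so no step is executed twice in this sketch.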
-
-
36. The system of claim 35 wherein:
-
the first marker-set module comprises;
a first pointer module which updates a pointer within the program as it is executed by the first server; and
the second marker-set module comprises;
a second pointer module which updates a pointer within the program as it is executed by the second server.
-
-
37. The system of claim 17 further comprising a resource module which determines if the second server has access to specified resources necessary to execute the program.
-
38. The system of claim 37 wherein the specified resources are identified in a list of resources which is part of the information contained within the object.
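The resource check of claims 37 and 38 amounts to comparing the list of resources held in the object with what the second server can reach. In the sketch below the resource names, the `resources` key, and the function name are hypothetical.

```python
from typing import Iterable, List


def missing_resources(required: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return the required resources the candidate server cannot access."""
    have = set(available)
    return [name for name in required if name not in have]


obj = {"program_id": "mail-service", "resources": ["volume:MAIL", "nic:eth0"]}
second_server_resources = ["nic:eth0", "volume:SYS"]
missing = missing_resources(obj["resources"], second_server_resources)
can_host = not missing   # the second server qualifies only if nothing is missing
```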
-
39. The system of claim 17 further comprising:
-
a first marker-set module which sends a first marker to an application specific file in the database, wherein the first marker identifies a first location within the program where execution of the program by the first server ceased; and
a first marker-read module which reads the first marker from the application specific file and directs the second server to commence execution of the program at the first location.
-
-
40. The system of claim 39 wherein the first marker-set module comprises a first pointer module which updates a pointer within the program as it is executed by the first server.
-
41. A system for fault tolerant execution of an application program in a server network, comprising:
-
a first server for executing the application program;
a cluster network database for storing objects therein;
a cluster interface for prompting a system operator for information to be stored in the objects, wherein the information comprises;
a host server attribute which identifies which server is currently executing the program;
a primary server attribute which identifies which server is primarily responsible for executing the program; and
a backup server attribute which identifies which server is a backup server for executing the program if the primary server experiences a failure;
a second server, coupled to the database, for executing the program if the first server fails;
a failure module which detects if the first server has failed;
a failover module which executes the program in the second server if it is determined that the first server has failed, the failover module comprising;
a backup status module which reads the backup server attribute in the object and determines whether the backup server attribute names the second server as the backup server;
a backup loading module which loads the program in the second server if the backup server attribute names the second server as the backup server;
a phoenix module which determines if the first server is once again operational; and
a failback module which resumes execution of the program in the first server if it is determined that the first server is once again operational, the failback module comprising;
a backup unload module which unloads the program from a random access memory in the second server;
a verification module which verifies that the program has been unloaded from the second server; and
a primary load module which loads the program in a random access memory in the first server after the program has been unloaded from the second server.
-
-
42. A system for fault tolerant execution of an application program in a server network, comprising:
-
a first server for executing the application program;
a cluster network database for storing an object representing the program;
a registration module which automatically stores the object in the database, wherein the object contains information comprising;
a host server attribute which identifies which server is currently executing the program;
a primary server attribute which identifies which server is primarily responsible for executing the program; and
a backup server attribute which identifies which server is a backup server for executing the program if the primary server experiences a failure;
a second server for executing the program if the first server fails;
a failure detection module which determines if the first server has failed;
a failover module which loads the program in the second server if it is determined that the first server has failed, the failover module comprising;
a reading module which reads the backup server attribute in the object with the second server and determines whether the backup server attribute names the second server as the backup server;
a backup load module which loads the program in the second server if the backup server attribute names the second server as the backup server;
a phoenix module which determines if the first server is once again operational; and
a failback module which loads the program in the first server if it is determined that the first server is once again operational, the failback module comprising;
a backup unload module which unloads the program from a random access memory in the second server;
a primary load module which loads the program in a random access memory in the first server;
a pause module which pauses execution of the program in the first server until it is verified that the program has been unloaded from the second server; and
a verification module which verifies that the program has been unloaded from the second server.
-
-
43. The system of claim 42 further comprising:
-
a first marker-set module which sends a first marker to an application specific file in the database, wherein the first marker identifies a first location within the program where execution of the program by the first server ceased;
a first marker-read module which reads the first marker from the application specific file and directs the second server to commence execution of the program at the first location;
a second marker-set module which sends a second marker to the application specific file in the database, wherein the second marker identifies a second location within the program where execution of the program by the second server ceased; and
a second marker-read module which reads the second marker from the application specific file and directs the first server to commence execution of the program at the second location.
-
-
44. The system of claim 43 wherein:
-
the first marker-set module comprises;
a first pointer module which updates a pointer within the program as it is executed by the first server; and
the second marker-set module comprises;
a second pointer module which updates a pointer within the program as it is executed by the second server.
-
-
45. The system of claim 42 further comprising a resource module which determines if the second server has access to specified resources necessary to execute the program.
-
46. The system of claim 45 wherein the specified resources are identified in a list of resources which is part of the information contained within the object.
-
47. A system for fault tolerant execution of an application program in a server network, comprising:
-
a first server for executing the application program;
a cluster network database, coupled to the first server;
an object, stored in the cluster network database, which represents the program and contains information pertaining to the program;
a failure detection module which detects a failure of the first server;
a second server, coupled to the cluster network database;
a reading module which reads the information from the object; and
a failover module which loads the application program in the second server upon detection of the failure of the first server, in accordance with the information contained in the object.
-
-
48. The system of claim 47 wherein the information comprises:
-
a host server attribute which identifies which server is currently executing the program;
a primary server attribute which identifies which server is primarily responsible for executing the program; and
a backup server attribute which identifies which server is a backup server for executing the program if the primary server experiences a failure.
-
-
49. A system for fault tolerant execution of an application program in a server network having a first and second server, comprising:
-
means for executing the application program in the first server;
means for storing an object which represents the program in a cluster network database, wherein the object contains information pertaining to the program;
means for detecting a failure of the first server; and
means for executing the application program in the second server upon detection of the failure of the first server, in accordance with the information in the object.
-
-
50. The system of claim 49 further comprising:
-
means for prompting a system operator for the information, wherein the information comprises;
a host server attribute which identifies which server is currently executing the program;
a primary server attribute which identifies which server is primarily responsible for executing the program; and
a backup server attribute which identifies which server is a backup server for executing the program if the primary server experiences a failure.
-
-
51. The system of claim 50 wherein the information further comprises:
-
an identification field which identifies the program;
a program type field which indicates whether the program is cluster capable or cluster aware; and
a command field which controls a protocol for loading the program and subsequently executing the program.
-
-
52. The system of claim 50 wherein the means for executing the program in the second server comprises:
-
means for reading the backup server attribute in the object with the second server;
means for determining whether the backup server attribute names the second server as the backup server;
means for loading the program in the second server, if the backup server status names the second server as the backup server.
-
-
53. The system of claim 52 further comprising means for changing the host server attribute to name the second server as the host server of the program.
-
54. The system of claim 53 further comprising:
-
means for detecting when the first server is once again operational; and
means for resuming execution of the program in the first server upon detecting that the first server is once again operational.
-
-
55. The system of claim 54 wherein the means for detecting when the first server is once again operational, comprises:
-
means for transmitting packets at periodic intervals from the second server to the first server; and
means for waiting for an acknowledgement signal in response to each packet for a specified period of time, wherein if the acknowledgement signal is received within the specified period of time, the first server is determined to be operational.
-
-
56. The system of claim 55 further comprising means for changing the host server attribute to name the first server as the host server of the program.
-
57. The system of claim 56 wherein the means for resuming execution of the program in the first server comprises:
-
means for unloading the program from a random access memory in the second server;
means for verifying that the program has been unloaded from the second server; and
means for loading the program in a random access memory in the first server after the program has been unloaded from the second server.
-
-
58. The system of claim 57 wherein the means for verifying that the program has been unloaded from the second server comprises means for reading the host server attribute and means for determining that the host server status indicates the first server as the host server of the program.
-
59. The system of claim 49 wherein the means for detecting a failure of the first server comprises:
-
means for transmitting packets at periodic intervals from the second server to the first server; and
means for waiting for an acknowledgement packet in response to each packet for a specified period of time, wherein if the acknowledgement packet is not received within the specified period of time, the failure of the first server is detected.
-
-
60. The system of claim 49 wherein the means for detecting a failure of the first server comprises:
-
means for monitoring communications between the first server and a network resource; and
means for detecting a termination in the communication between the first server and the network resource.
-
-
61. The system of claim 49 wherein the means for detecting a failure of the first server comprises:
-
means for successively transmitting first and second command signals from the first server to a device coupled to the first server, wherein the first command signal places the device in a first status condition and the second command signal places the device in a second status condition; and
means for monitoring a status condition of the device with the second server, coupled to the device, wherein a change in the status condition of the device indicates that the first server is operational and a constant status condition indicates the failure of the first server.
-
-
62. The system of claim 49 further comprising:
-
means for detecting when the first server is once again operational; and
means for resuming execution of the program in the first server upon detecting that the first server is once again operational.
-
-
63. The system of claim 62 wherein the means for detecting when the first server is once again operational, comprises:
-
means for transmitting packets at periodic intervals from the second server to the first server; and
means for waiting for an acknowledgement signal in response to each packet for a specified period of time, wherein if the acknowledgement signal is received within the specified period of time, the first server is determined to be operational.
-
-
64. The system of claim 62 wherein the means for resuming execution of the program in the first server comprises:
-
means for unloading the program from a random access memory in the second server;
means for verifying that the program has been unloaded from the second server; and
means for loading the program in a random access memory in the first server after the program has been unloaded from the second server.
-
-
65. The system of claim 49 wherein the means for storing an object which represents the program in a cluster network database is contained within the program and further comprises means for automatically writing the information to the object, wherein the information is also contained within the program.
-
66. The system of claim 65 wherein the information comprises:
-
a host server attribute which identifies which server is currently executing the program;
a primary server attribute which identifies which server is primarily responsible for executing the program; and
a backup server attribute which identifies which server is a backup server for executing the program if the primary server experiences a failure.
-
-
67. The system of claim 66 wherein the information further comprises:
-
an identification field which identifies the program;
a program type field which indicates whether the program is cluster capable or cluster aware; and
a command field which controls a protocol for loading the program and subsequently executing the program.
-
-
68. The system of claim 66 wherein the means for executing the program in the second server comprises:
-
means for reading the backup server attribute in the object with the second server;
means for determining whether the backup server attribute names the second server as the backup server;
means for loading the program in the second server, if the backup server status names the second server as the backup server.
-
-
69. The system of claim 68 further comprising means for changing the host server attribute to name the second server as the host server of the program.
-
70. The system of claim 69 further comprising:
-
means for detecting when the first server is once again operational; and
means for resuming execution of the program in the first server upon detecting that the first server is once again operational.
-
-
71. The system of claim 70 further comprising:
-
means for determining a first location within the program where execution of the program by the first server ceased;
means for commencing execution of the program by the second server at the first location;
means for determining a second location within the program where execution of the program by the second server ceased; and
means for commencing execution of the program by the first server at the second location.
-
-
72. The system of claim 71 wherein:
-
the means for determining the first position comprises;
means for updating a pointer within the program as it is executed by the first server; and
means for determining the location of the pointer prior to execution of the program by the second server; and
the means for determining the second position comprises;
means for updating the pointer within the program as it is executed by the second server; and
means for determining the location of the pointer prior to resuming execution of the program by the first server.
-
-
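The pointer-based hand-off of claims 71 and 72 might look like the sketch below. It assumes, purely for illustration, that the program is a list of steps and that the pointer lives in a shared `state` dictionary visible to both servers; `run_from` and `stop_after` are hypothetical names, and `stop_after` merely simulates a server ceasing execution.

```python
def run_from(steps, state, stop_after=None):
    """Execute steps starting at state['pointer'], advancing the pointer after each.

    stop_after simulates the executing server ceasing after that many steps.
    Returns the pointer value at which execution stopped.
    """
    done = 0
    while state["pointer"] < len(steps):
        if stop_after is not None and done >= stop_after:
            break
        state["log"].append(steps[state["pointer"]])  # "execute" the step
        state["pointer"] += 1                          # update the shared pointer
        done += 1
    return state["pointer"]
```

Calling `run_from` once for the first server, once for the second server, and once more for the first server resumes each time at the location where the previous server stopped, so no step is repeated or skipped.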
73. The system of claim 72 further comprising:
-
means for determining if the second server has access to specified resources necessary to execute the program; and
means for sending an error message to a system operator, if it is determined that the second server does not have access to the specified resources.
-
-
74. The system of claim 73 wherein the specified resources are identified in a list of resources which is part of the information contained within the object.
-
75. The system of claim 74 wherein the means for determining if the second server has access to specified resources necessary to execute the program, comprises means for comparing the list of resources to resources identified and initialized by a BIOS program stored within the second server.
-
76. The system of claim 74 wherein the means for determining if the second server has access to specified resources necessary to execute the program, comprises means for comparing the list of resources to a configuration file stored within the second server.
-
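The resource check of claims 73 through 76 amounts to comparing the list of resources in the object against what the second server can actually reach. A minimal sketch, assuming resources are plain strings and that `notify_operator` is a hypothetical callback that delivers the error message:

```python
def missing_resources(required, available):
    """Return the required resources that the server does not have, sorted."""
    return sorted(set(required) - set(available))


def check_resources(required, available, notify_operator):
    """Return True if every required resource is available; otherwise report and return False."""
    missing = missing_resources(required, available)
    if missing:
        notify_operator("missing resources: " + ", ".join(missing))
        return False
    return True
```

Here `available` could be populated from a configuration file or from whatever the BIOS reports. How that list is obtained is outside this sketch.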
77. The system of claim 69 wherein the means for detecting when the first server is once again operational, comprises:
-
means for transmitting packets at periodic intervals from the second server to the first server; and
means for waiting for an acknowledgement signal in response to each packet for a specified period of time, wherein if the acknowledgement signal is received within the specified period of time, the first server is determined to be operational.
-
-
78. The system of claim 77 further comprising means for changing the host server attribute to name the first server as the host server of the program.
-
79. The system of claim 78 wherein the means for resuming execution of the program in the first server comprises:
-
means for unloading the program from a random access memory in the second server;
means for loading the program in a random access memory in the first server;
means for pausing execution of the program in the first server until it is verified that the program has been unloaded from the second server; and
means for verifying that the program has been unloaded from the second server.
-
-
80. The system of claim 79 wherein the means for pausing and verifying are contained within the program and automatically executed as a part of an execution of the program.
-
81. The system of claim 80 wherein the means for verifying that the program has been unloaded from the second server comprises means for reading the host server attribute and means for determining that the host server status indicates the first server as the host server of the program.
-
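The pause-and-verify step of claims 79 through 81 can be illustrated as a polling loop on the host server attribute. This is a sketch under assumptions: `read_host_attribute` is a hypothetical callable that reads the attribute from the cluster network database, and the polling interval and retry limit are arbitrary choices not found in the claims.

```python
import time


def wait_until_unloaded(read_host_attribute, first_server,
                        poll_s=0.5, max_polls=10, sleep=time.sleep):
    """Pause until the host server attribute names first_server.

    Returns True once the attribute names first_server, which is taken to mean
    the second server has unloaded the program; returns False after max_polls.
    """
    for _ in range(max_polls):
        if read_host_attribute() == first_server:
            return True
        sleep(poll_s)
    return False
```

The first server would call this after loading the program and before starting execution, so that the program never runs on both servers at once.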
82. The system of claim 65 wherein the means for detecting a failure of the first server comprises:
-
means for transmitting packets at periodic intervals from the second server to the first server; and
means for waiting for an acknowledgement signal in response to each packet for a specified period of time, wherein if the acknowledgement signal is not received within the specified period of time, the failure of the first server is detected.
-
-
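The packet-and-acknowledgement detection of claim 82 can be sketched as a simple probe loop. The names are assumptions: `send_packet` is a hypothetical callable that transmits one packet and returns True only if an acknowledgement arrives within `timeout_s`. No real network API is implied.

```python
import time


def monitor(send_packet, interval_s, timeout_s, max_probes, sleep=time.sleep):
    """Probe the first server at periodic intervals.

    Returns the index of the first probe whose acknowledgement was not received
    in time (failure detected), or None if every probe was acknowledged.
    """
    for i in range(max_probes):
        if not send_packet(timeout_s):
            return i          # no acknowledgement within the specified period
        sleep(interval_s)     # wait for the next periodic interval
    return None
```

The same loop with the condition inverted would serve for detecting that a failed server is operational again, since that test also relies on an acknowledgement arriving within the specified period.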
83. The system of claim 65 wherein the means for detecting a failure of the first server comprises:
-
means for monitoring communications between the first server and a network resource; and
means for detecting a termination in the communication between the first server and the network resource.
-
-
84. The system of claim 65 wherein the means for detecting a failure of the first server comprises:
-
successively transmitting first and second command signals from the first server to a device coupled to the first server, wherein the first command signal places the device in a first status condition and the second command signal places the device in a second status condition; and
means for monitoring a status condition of the device with the second server, coupled to the device, wherein a change in the status condition of the device indicates that the first server is operational and a constant status condition indicates the failure of the first server.
-
-
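The device-status scheme of claim 84 can be illustrated from the second server's side: it samples the device's status condition, and an unchanging reading indicates failure. The sketch assumes the samples are taken often enough, and at times that do not line up with the toggling, so that a healthy first server always produces at least one visible change. `first_server_failed` is a hypothetical name.

```python
def first_server_failed(samples):
    """Return True if successive device status readings never change.

    samples is a sequence of status conditions observed by the second server.
    Fewer than two samples are treated as inconclusive (no failure reported).
    """
    return len(samples) >= 2 and len(set(samples)) == 1
```

With alternating readings such as `[0, 1, 0]` the first server is considered operational, while a constant run such as `[1, 1, 1]` is treated as a failure.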
85. The system of claim 65 further comprising:
-
means for determining a first location within the program where execution of the program by the first server ceased; and
means for commencing execution of the program by the second server at the first location.
-
-
86. The system of claim 85 wherein the means for determining the first position comprises:
-
means for updating a pointer within the program as it is executed by the first server; and
means for determining the location of the pointer prior to execution of the program by the second server.
-
-
87. The system of claim 65 further comprising:
-
means for determining if the second server has access to specified resources necessary to execute the program; and
means for sending an error message to a system operator, if it is determined that the second server does not have access to the specified resources.
-
-
88. The system of claim 87 wherein the specified resources are identified in a list of resources which is part of the information contained within the object.
-
89. The system of claim 88 wherein the means for determining if the second server has access to specified resources necessary to execute the program, comprises means for comparing the list of resources to a list of resources initialized by a BIOS program stored within the second server.
-
90. The system of claim 88 wherein the means for determining if the second server has access to specified resources necessary to execute the program, comprises means for comparing the list of resources to a configuration file stored within the second server.
-
91. The system of claim 65 further comprising:
-
means for detecting when the first server is once again operational; and
means for resuming execution of the program in the first server upon detecting that the first server is once again operational.
-
-
92. The system of claim 91 wherein the means for detecting when the first server is once again operational, comprises:
-
means for transmitting packets at periodic intervals from the second server to the first server; and
means for waiting for an acknowledgement signal in response to each packet for a specified period of time, wherein if the acknowledgement signal is received within the specified period of time, the first server is determined to be operational.
-
-
93. The system of claim 91 wherein the means for resuming execution of the program in the first server comprises:
-
means for unloading the program from a random access memory in the second server;
means for loading the program in a random access memory in the first server;
means for pausing execution of the program in the first server until it is verified that the program has been unloaded from the second server; and
means for verifying that the program has been unloaded from the second server.
-
-
94. A system for fault tolerant execution of an application program in a server network having a first and second server, comprising:
-
means for executing the application program in the first server;
means for storing an object which represents the program in a cluster network database, wherein the object contains information pertaining to the program;
means for detecting a failure of the first server;
means for reading the information contained in the object; and
means for executing the application program in the second server upon detection of the failure of the first server, in accordance with the information in the object.
-
-
95.
-
a host server attribute which identifies which server is currently executing the program;
a primary server attribute which identifies which server is primarily responsible for executing the program; and
a backup server attribute which identifies which server is a backup server for executing the program if the primary server experiences a failure.
-
-
96. A system for providing fault tolerant execution of an application program in a server network having a first and second server, comprising:
-
means for executing said application program in said first server;
means for detecting a fault in the execution of said application program in said first server; and
means for automatically, without operator intervention, executing said application program in said second server in response to said detecting step.
-
-
97.
-
means for sensing correction of said fault in the execution of said application program in said first server; and
means for automatically, without operator intervention, executing said application program in said first server in response to said sensing step.
-
-
98. The system of claim 97 wherein said sensing is provided by said second server.
-
99. The method of claim 96 wherein said detecting is provided by said second server.
-
100. A system for providing fault tolerant execution of an application program in a server network having a first and second server, comprising:
-
means for executing said application program in said first server;
means for detecting a fault in the first server; and
means for automatically, without operator intervention, executing said application program in said second server in response to said detecting step.
-
-
101.
-
means for sensing correction of said fault in said first server; and
means for automatically, without operator intervention, executing said application program in said first server in response to said sensing step.
-
-
102. The system of claim 101 wherein said sensing is provided by said second server.
-
103. The system of claim 100 wherein said detecting is provided by said second server.
-
104. A system for providing fault tolerant execution of an application program in a server network having a first and second server, comprising:
-
means for executing said application program in said first server;
means for detecting a failure of said first server to properly run said application; and
means for automatically, without operator intervention, executing said application program in said second server in response to said detecting step.
-
-
105.
-
means for sensing correction of said failure of said first server; and
automatically, without operator intervention, executing said application program in said first server in response to said sensing step.
-
-
106. The system of claim 105 wherein said sensing is provided by said second server.
-
107. The method of claim 105 wherein said detecting is provided by said second server.
-
108. A network server system, comprising:
-
a first server and a second server, each configured to execute a first application program;
a first control module for causing said first server to execute said first application program when said first server is capable of executing said first application program; and
a second control module for causing said second server to execute said first application program when said first server is incapable of executing said first application program.
-
Specification