Mechanism for fail-over notification
First Claim
1. A method for providing failure notification in a computer network having a plurality of nodes wherein a predetermined number of the nodes are each processing one part of an application program, which method comprises the steps of:
for each part of the application program, making a first request for an exclusive access privilege to a preselected resource name;
for each part of the application program, making a second request for a preselected access privilege to a corresponding resource name having a name based upon the respective part;
granting the first request for an exclusive access privilege to one and only one part of the application program;
granting the second request to each part of the application program;
making a third request on behalf of the one and only one part of the application program for each resource name having a name based upon a respective part of the application program other than the one and only one part;
said third request being for an access privilege that is incompatible with each second request;
operating the computer network to:
(i) store information on each third request for a resource having a name based upon a respective part of the application program other than the one and only one part, and
(ii) upon a failure of one node of the predetermined number of nodes processing one part of the application program, automatically invalidating the second request of a respective part being processed on the failed node and automatically granting the third request made on behalf of the one and only one part of the application program for the resource name having a name based upon the respective part being processed on the failed node; and
utilizing the grant of the third request for the resource name having a name based upon the respective part being processed on the failed node to cause a message to be generated to identify the respective part being processed on the failed node.
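The steps of claim 1 can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: `ToyLockManager`, the mode names `EX` and `SH`, the resource names `SPECIAL` and `PART_<name>`, and the `node_failed` call are all hypothetical stand-ins chosen for illustration, and a real lock manager would be distributed across the nodes.

```python
from collections import defaultdict

EXCLUSIVE, SHARED = "EX", "SH"  # hypothetical mode names


def compatible(a, b):
    # In this toy model only two shared locks can coexist.
    return a == SHARED and b == SHARED


class ToyLockManager:
    """In-memory stand-in for a lock manager (an assumption, not the patent's)."""

    def __init__(self):
        self.granted = defaultdict(list)  # resource -> [(owner, mode, callback)]
        self.waiting = defaultdict(list)  # resource -> [(owner, mode, callback)]

    def request(self, resource, owner, mode, on_grant=None):
        if all(compatible(mode, m) for _, m, _ in self.granted[resource]):
            self.granted[resource].append((owner, mode, on_grant))
            if on_grant:
                on_grant(resource)
            return True
        # Incompatible with a granted lock: store the request for later.
        self.waiting[resource].append((owner, mode, on_grant))
        return False

    def node_failed(self, owner):
        # Invalidate every lock and request of the failed owner ...
        for resource in list(self.granted):
            self.granted[resource] = [g for g in self.granted[resource] if g[0] != owner]
        # ... then grant any stored request that has become compatible.
        for resource in list(self.waiting):
            pending = [w for w in self.waiting[resource] if w[0] != owner]
            still_waiting = []
            for w in pending:
                if all(compatible(w[1], m) for _, m, _ in self.granted[resource]):
                    self.granted[resource].append(w)
                    if w[2]:
                        w[2](resource)
                else:
                    still_waiting.append(w)
            self.waiting[resource] = still_waiting


def run_demo(parts=("A", "B", "C"), failed="B"):
    lm = ToyLockManager()
    messages = []
    special = None
    for p in parts:
        if lm.request("SPECIAL", p, EXCLUSIVE):  # first request
            special = p
        lm.request("PART_" + p, p, SHARED)       # second request
    for p in parts:
        if p != special:
            # Third request: incompatible with the second request, so it waits.
            lm.request("PART_" + p, special, EXCLUSIVE,
                       on_grant=lambda res: messages.append("failed part: " + res[len("PART_"):]))
    lm.node_failed(failed)
    return special, messages
```

Calling `run_demo()` makes part `A` the one part granted the exclusive request; when the node running `B` fails, the waiting third request on `PART_B` is granted and the callback produces the message identifying `B`. The sketch does not cover what happens when the node running the designated part itself fails.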
Abstract
An automatic failure notification mechanism for use in a computer network wherein several co-operating parts of an application program are each running on a different node of the computer network. The mechanism comprises a set of linked subroutines called by each part of the application program and operating through the use of a distributed lock manager to designate one part as a part to receive failure notification and to link and reverse link the selected part and the other parts of the application program. The failure notification mechanism utilizes the link and reverse link to initiate a failure communication upon the failure of a node executing one part of the application program.
57 Claims
1. A method for providing failure notification in a computer network having a plurality of nodes wherein a predetermined number of the nodes are each processing one part of an application program, which method comprises the steps of:
for each part of the application program, making a first request for an exclusive access privilege to a preselected resource name;
for each part of the application program, making a second request for a preselected access privilege to a corresponding resource name having a name based upon the respective part;
granting the first request for an exclusive access privilege to one and only one part of the application program;
granting the second request to each part of the application program;
making a third request on behalf of the one and only one part of the application program for each resource name having a name based upon a respective part of the application program other than the one and only one part;
said third request being for an access privilege that is incompatible with each second request;
operating the computer network to:
(i) store information on each third request for a resource having a name based upon a respective part of the application program other than the one and only one part, and
(ii) upon a failure of one node of the predetermined number of nodes processing one part of the application program, automatically invalidating the second request of a respective part being processed on the failed node and automatically granting the third request made on behalf of the one and only one part of the application program for the resource name having a name based upon the respective part being processed on the failed node; and
utilizing the grant of the third request for the resource name having a name based upon the respective part being processed on the failed node to cause a message to be generated to identify the respective part being processed on the failed node.
Dependent claims: 2, 3, 4

5. In a computer network including a plurality of nodes, wherein at least certain ones of the plurality of nodes have a CPU and wherein a predetermined number of the CPUs are each processing one part of an application program, a failure notification system which comprises:
a lock manager to receive and process requests for access privileges to resource names and to provide information on the request in response to a query; and
an object library coupled to each one of the CPUs processing one part of the application program, the object library comprising a linked set of subroutines including:
i) a SETUP subroutine that can be called by an application part being processed on a CPU coupled to the object library, the SETUP subroutine comprising a sequence of instructions to issue a first request to the lock manager on behalf of the calling application part for an exclusive access privilege to a preselected resource name and to issue a second request to the lock manager on behalf of the calling application part for a preselected access privilege to a resource name based upon the calling application part;
ii) a SPECIAL_AST subroutine that is an asynchronous system trap called upon grant of the first request to the calling application part, the SPECIAL_AST subroutine comprising a sequence of instructions to query the lock manager for a list of all application parts of the application program that have made a first request through a SETUP subroutine and that have not been granted the first request and to issue a third request to the lock manager on behalf of the calling application part that has been granted the first request for an access privilege to each resource name based upon an application part that has not been granted the first request, each third request being for an access privilege that is not compatible with the predetermined access privilege of a second request; and
iii) a FAIL_AST subroutine that is an asynchronous system trap called upon the grant of any third request to the calling application part that has been granted the first request, the FAIL_AST subroutine comprising a sequence of instructions to query the lock manager for information on the name of the resource based upon the application part for which the third request has been granted.
Dependent claims: 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23

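The division of work among the SETUP, SPECIAL_AST and FAIL_AST subroutines might look roughly like the following. This is a hedged sketch, not the patented object library: the `LockManager` and `ObjectLibrary` classes, their method names, the `EX`/`SH` modes, and the `SPECIAL`/`PART_<name>` resource names are all assumptions. Asynchronous system traps are modelled as callbacks queued by the lock manager and run later by `deliver_asts`, and the resource name is passed to the callback directly rather than being obtained through a separate query.

```python
class LockManager:
    """Hypothetical in-memory stand-in for a distributed lock manager."""

    def __init__(self):
        self.holders = {}       # resource -> list of (owner, mode, ast)
        self.queue = {}         # resource -> list of (owner, mode, ast), FIFO
        self.pending_asts = []  # (ast, resource) pairs awaiting delivery

    def _fits(self, resource, mode):
        held = self.holders.get(resource, [])
        return not held or (mode == "SH" and all(m == "SH" for _, m, _ in held))

    def enqueue(self, resource, owner, mode, ast=None):
        self.queue.setdefault(resource, []).append((owner, mode, ast))
        self._drain(resource)

    def _drain(self, resource):
        q = self.queue.get(resource, [])
        while q and self._fits(resource, q[0][1]):
            entry = q.pop(0)
            self.holders.setdefault(resource, []).append(entry)
            if entry[2]:
                self.pending_asts.append((entry[2], resource))

    def waiters(self, resource):
        # Query: owners whose request on this resource has not been granted.
        return [owner for owner, _, _ in self.queue.get(resource, [])]

    def drop_owner(self, owner):
        # Simulated node failure: invalidate the owner's locks and requests.
        for table in (self.holders, self.queue):
            for resource in table:
                table[resource] = [e for e in table[resource] if e[0] != owner]
        for resource in list(self.queue):
            self._drain(resource)

    def deliver_asts(self):
        while self.pending_asts:
            ast, resource = self.pending_asts.pop(0)
            ast(resource)


class ObjectLibrary:
    def __init__(self, lm, part):
        self.lm, self.part, self.failed = lm, part, []

    def setup(self):
        self.lm.enqueue("SPECIAL", self.part, "EX", ast=self.special_ast)  # first request
        self.lm.enqueue("PART_" + self.part, self.part, "SH")              # second request

    def special_ast(self, _resource):
        for other in self.lm.waiters("SPECIAL"):
            # Third request: exclusive, so incompatible with the shared lock.
            self.lm.enqueue("PART_" + other, self.part, "EX", ast=self.fail_ast)

    def fail_ast(self, resource):
        self.failed.append(resource[len("PART_"):])


def demo():
    lm = LockManager()
    libs = {p: ObjectLibrary(lm, p) for p in ("A", "B", "C")}
    for lib in libs.values():
        lib.setup()
    lm.deliver_asts()    # A was granted SPECIAL, so its special_ast runs
    lm.drop_owner("C")   # the node running part C fails
    lm.deliver_asts()    # fail_ast runs for PART_C
    return libs["A"].failed
```

Deferring the callbacks until `deliver_asts` is a design choice of the sketch: it lets all three parts finish SETUP before the part holding `SPECIAL` asks the lock manager which parts are still waiting.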
24. For use in a CPU adapted to be coupled to a computer network, the computer network including a lock manager to receive and process requests for access privileges to resource names and to provide information on the requests in response to a query and wherein the CPU is adapted to process one part of an application program, an object library, which comprises:
a linked set of subroutines including:
i) a SETUP subroutine that can be called by the application part processed on the CPU, the SETUP subroutine comprising a sequence of instructions to issue a first request to the lock manager on behalf of the calling application part for an exclusive access privilege to a preselected resource name and to issue a second request to the lock manager on behalf of the calling application part for a preselected access privilege to a resource name based upon the calling application part;
ii) a SPECIAL_AST subroutine that is an asynchronous system trap called when the first request has been granted to the calling application part, the SPECIAL_AST subroutine comprising a sequence of instructions to query the lock manager for a list of all application parts of the application program that have made a first request through a SETUP subroutine and that have not been granted the first request and to issue a third request to the lock manager on behalf of the calling application part that has been granted the first request for an access privilege to each resource name based upon an application part that has not been granted the first request, each third request being for an access privilege that is not compatible with the predetermined access privilege of a second request; and
iii) a FAIL_AST subroutine that is an asynchronous system trap called upon the grant of any third request to the calling application part when the application part has been granted the first request and after the SPECIAL_AST has run, the FAIL_AST subroutine comprising a sequence of instructions to query the lock manager for information on the name of the resource based upon the application part for which the third request has been granted.
Dependent claims: 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44

45. A computer network, comprising:
a plurality of nodes, wherein at least certain ones of the plurality of nodes have a CPU and wherein a predetermined number of the CPUs are each processing one part of an application program;
a lock manager to receive and process requests for access privileges to resource names and to provide information on the requests in response to a query; and
a failure notification system comprising an object library coupled to each one of the CPUs processing one part of the application program, each object library comprising a linked set of subroutines including:
i) a SETUP subroutine that can be called by an application part being processed on a respective CPU coupled to the object library, the SETUP subroutine comprising a sequence of instructions to issue a first request to the lock manager on behalf of the calling application part for an exclusive access privilege to a preselected resource name and to issue a second request to the lock manager on behalf of the calling application part for a preselected access privilege to a resource name based upon the calling application part;
ii) a SPECIAL_AST subroutine that is an asynchronous system trap called upon grant of the first request to the calling application part, the SPECIAL_AST subroutine comprising a sequence of instructions to query the lock manager for a list of all application parts of the application program that have made a first request through a SETUP subroutine and that have not been granted the first request and to issue a third request to the lock manager on behalf of the calling application part that has been granted the first request for an access privilege to each resource name based upon an application part that has not been granted the first request, each third request being for an access privilege that is not compatible with the predetermined access privilege of a second request; and
iii) a FAIL_AST subroutine that is an asynchronous system trap called upon the grant of any third request to the calling application part that has been granted the first request, the FAIL_AST subroutine comprising a sequence of instructions to query the lock manager for information on the name of the resource based upon the application part for which the third request has been granted.

46. A method for providing failure notification in a computer network having a plurality of nodes wherein a predetermined number of the nodes are each processing one part of an application program, which method comprises the steps of:
providing a lock manager to receive and process requests for access privileges to resource names and to provide information on the requests in response to a query;
executing a first routine in respect of each part of the application program to issue a first request to the lock manager on behalf of each application part for an exclusive access privilege to a preselected resource name and to issue a second request to the lock manager on behalf of each application part for a preselected access privilege to a resource name based upon a respective application part;
operating the lock manager to grant the first request for an exclusive access privilege to one and only one of the application parts and to grant the second request to each respective application part;
executing a second routine in respect of the application part granted the first request, upon the grant of the first request, to query the lock manager for a list of all application parts of the application program that have made a first request through a first routine and that have not been granted the first request and to issue a third request to the lock manager on behalf of the application part that has been granted the first request for an access privilege to each resource name based upon an application part that has not been granted the first request, each third request being for an access privilege that is not compatible with the predetermined access privilege of a second request;
operating the lock manager to store information on each third request that is not compatible with a granted second request of each respective application part that has not been granted the first request;
operating the lock manager to invalidate a second request when the node processing the respective application part fails and to automatically grant the third request made by the second routine in respect of the resource name based upon that respective application part; and
upon the grant of a third request by the lock manager, executing a third routine in respect of the application part granted the first request to query the lock manager for information on the name of the resource based upon the respective application part for which the third request has been granted.
Dependent claims: 47, 48, 49, 50, 51, 52, 53

54. In a computer network including a plurality of nodes, wherein a predetermined number of the nodes are each processing one part of an application program, a communication system for sending and receiving messages to and from the parts of the application program, which communication system comprises:
a lock manager to receive and process requests for access privileges to resource names; and
an object library coupled to each node processing one part of the application program and comprising a set of subroutines including:
i) a SETUP subroutine called by an application part and comprising a sequence of instructions to issue a first request to the lock manager on behalf of the respective calling application part for a preselected access privilege to a BROADCAST resource, the preselected access privilege being compatible with multiple first requests made on behalf of other application parts;
ii) a CLUSTER_BROADCAST subroutine called by another routine and comprising a sequence of instructions to issue a second request to the lock manager on behalf of the calling routine for an exclusive access privilege to the BROADCAST resource and, upon grant of the second request, to write a message on behalf of the calling routine to the lock manager for storage; and
iii) an MSG_AST subroutine that is an asynchronous system trap called upon the making of a second request on behalf of a calling routine, the second request for an exclusive access privilege being incompatible with the preselected access privilege of the first request, the MSG_AST subroutine comprising a sequence of instructions to issue a third request to the lock manager on behalf of a respective application part, the third request being for a conversion of the preselected access privilege to the BROADCAST resource to a NULL access privilege so that the second request of the calling routine is granted and thereafter, to read the message stored by the lock manager on behalf of the calling routine.
Dependent claims: 55, 56, 57

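The broadcast scheme of claim 54 can be sketched in the same spirit. Everything named here is an assumption made for illustration: the `BroadcastLockManager` and `Part` classes, the `SH`/`EX`/`NL` mode names, and in particular the way each part re-acquires its shared lock before reading, which the claim does not spell out. The sketch runs in one process, so the read step is driven directly by `cluster_broadcast`, and the sender does not read its own message.

```python
class BroadcastLockManager:
    """Hypothetical single-process stand-in managing one BROADCAST resource."""

    def __init__(self):
        self.modes = {}     # owner -> mode held on BROADCAST ("SH", "EX" or "NL")
        self.blocking = {}  # owner -> callback run when an incompatible request arrives
        self.value = None   # message stored with the resource

    def request_shared(self, owner, blocking_ast):
        self.modes[owner] = "SH"
        self.blocking[owner] = blocking_ast

    def convert_to_null(self, owner):
        self.modes[owner] = "NL"

    def request_exclusive(self, owner):
        # Tell every other shared holder that an exclusive request was made.
        for holder, mode in list(self.modes.items()):
            if holder != owner and mode == "SH":
                self.blocking[holder]()
        granted = all(m == "NL" for h, m in self.modes.items() if h != owner)
        if granted:
            self.modes[owner] = "EX"
        return granted

    def release_exclusive(self, owner):
        # Assumes the sender is itself a registered part and returns to shared mode.
        self.modes[owner] = "SH"


class Part:
    def __init__(self, lm, name):
        self.lm, self.name, self.inbox = lm, name, []
        self.must_read = False

    def setup(self):
        self.lm.request_shared(self.name, self.msg_ast)  # first request

    def msg_ast(self):
        self.lm.convert_to_null(self.name)               # third request: convert to NULL
        self.must_read = True

    def read_if_pending(self):
        if self.must_read:
            # Assumed step: take the shared lock again, then read the message.
            self.lm.request_shared(self.name, self.msg_ast)
            self.inbox.append(self.lm.value)
            self.must_read = False


def cluster_broadcast(lm, sender, message, parts):
    if not lm.request_exclusive(sender):                 # second request
        return False
    lm.value = message                                   # write the message for storage
    lm.release_exclusive(sender)
    for part in parts:
        part.read_if_pending()
    return True


def demo_broadcast():
    lm = BroadcastLockManager()
    parts = [Part(lm, n) for n in ("A", "B", "C")]
    for p in parts:
        p.setup()
    ok = cluster_broadcast(lm, "A", "hello", parts)
    return ok, {p.name: p.inbox for p in parts}
```

In `demo_broadcast`, part `A` sends `"hello"`; parts `B` and `C` give up their shared locks when the exclusive request arrives and each ends up with the message in its inbox.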
Specification