Methods and systems for distributed failure detection and recovery using leasing techniques
Abstract
A system for using a lease to detect a failure and to perform failure recovery is provided. In using this system, a client requests a lease from a server to utilize a resource managed by the server for a period of time. Responsive to the request, the server grants the lease, and the client continually requests renewal of the lease. If the client fails to renew the lease, the server detects that an error has occurred at the client. Similarly, if the server fails to respond to a renew request, the client detects that an error has occurred at the server. As part of the lease establishment, the client and server exchange failure-recovery routines that each invokes if the other experiences a failure.
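The exchange described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class names, field names, and the `events` list are hypothetical, and time is passed in explicitly so the example runs without waiting. It only shows that each side ends up holding a recovery routine supplied by the other side, and when each routine would fire.

```python
from dataclasses import dataclass
from typing import Callable, List

events: List[str] = []  # records which recovery routine ran (illustration only)


@dataclass
class LeaseRequest:
    duration: float
    # Routine the client hands to the server; the server runs it if the client fails.
    recover_from_client_failure: Callable[[], None]


@dataclass
class LeaseObject:
    expires_at: float
    # Routine the server hands to the client; the client runs it if the server fails.
    recover_from_server_failure: Callable[[], None]


def grant(request: LeaseRequest, now: float) -> LeaseObject:
    """Server side of the exchange: accept the request, return a lease object."""
    return LeaseObject(
        expires_at=now + request.duration,
        recover_from_server_failure=lambda: events.append(
            "client ran server-failure recovery"
        ),
    )


def server_check(request: LeaseRequest, lease: LeaseObject, now: float) -> None:
    """Server treats a lease that lapsed without renewal as a client failure."""
    if now >= lease.expires_at:
        request.recover_from_client_failure()


def client_check(lease: LeaseObject, renew_acknowledged: bool) -> None:
    """Client treats an unanswered renew request as a server failure."""
    if not renew_acknowledged:
        lease.recover_from_server_failure()


request = LeaseRequest(
    10.0, lambda: events.append("server ran client-failure recovery")
)
lease = grant(request, now=0.0)
server_check(request, lease, now=5.0)           # lease still valid: nothing runs
server_check(request, lease, now=12.0)          # lease lapsed: server-side recovery
client_check(lease, renew_acknowledged=False)   # no reply: client-side recovery
```

Passing `now` as a parameter is a choice made for the sketch; a real system would read a clock and would also need to handle clock skew between client and server.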
29 Claims
1. A method for recovering from failures in a distributed system that includes a client and a server, the method, performed by the server, comprising:
receiving a request from the client for a lease that allows the client to access a resource managed by the server for a period of time;
storing a request object associated with the request in a predetermined memory space;
granting the lease by returning to the client a lease object including methods for managing the lease; and
invoking a recovery method on the request when it is determined that the lease has expired and the client has not sent a request to renew or terminate the lease.
2. The method of claim 1, further comprising deleting the request object from the predetermined memory space after the recovery method has been invoked.
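The server-side steps recited above (store a request object, return a lease object with management methods, invoke recovery on an expired lease, then delete the request object) might look roughly like the following sketch. All names here (`LeaseServer`, `RequestObject`, `sweep`, and the field names) are assumptions made for illustration, and a plain dictionary stands in for the "predetermined memory space".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RequestObject:
    """What the server keeps per lease request (field names are assumptions)."""
    client: str
    resource: str
    expires_at: float
    recover: Callable[["RequestObject"], None]  # recovery method invoked on expiry


class LeaseObject:
    """Returned to the client; carries the methods for managing the lease."""

    def __init__(self, server: "LeaseServer", lease_id: int) -> None:
        self._server = server
        self.lease_id = lease_id

    def renew(self, duration: float, now: float) -> bool:
        return self._server.renew(self.lease_id, duration, now)

    def cancel(self) -> None:
        self._server.cancel(self.lease_id)


class LeaseServer:
    def __init__(self) -> None:
        self._next_id = 1
        # Plain dict standing in for the "predetermined memory space".
        self.requests: Dict[int, RequestObject] = {}

    def request_lease(self, client: str, resource: str, duration: float,
                      recover: Callable[[RequestObject], None],
                      now: float) -> LeaseObject:
        lease_id = self._next_id
        self._next_id += 1
        self.requests[lease_id] = RequestObject(
            client, resource, now + duration, recover
        )
        return LeaseObject(self, lease_id)

    def renew(self, lease_id: int, duration: float, now: float) -> bool:
        request = self.requests.get(lease_id)
        if request is None or now >= request.expires_at:
            return False  # unknown or already expired lease
        request.expires_at = now + duration
        return True

    def cancel(self, lease_id: int) -> None:
        self.requests.pop(lease_id, None)

    def sweep(self, now: float) -> None:
        """Run recovery for every lapsed lease, then drop its request object."""
        for lease_id, request in list(self.requests.items()):
            if now >= request.expires_at:
                request.recover(request)
                del self.requests[lease_id]


recovered: List[str] = []
server = LeaseServer()
a = server.request_lease("client-a", "file-1", 10.0,
                         lambda r: recovered.append(r.client), now=0.0)
b = server.request_lease("client-b", "file-2", 10.0,
                         lambda r: recovered.append(r.client), now=0.0)
a.renew(10.0, now=8.0)   # client-a renews in time
server.sweep(now=12.0)   # client-b never renewed: recovery runs, request deleted
```

A cancelled lease is simply removed from the store, so `sweep` never runs recovery for it; this matches the condition that recovery applies only when the client has neither renewed nor terminated the lease.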
3. A system for recovering from failures, comprising:
a server configured for granting leases for using resources managed by the server for a period of time and for sending lease objects to a client when a lease is granted; and
a client configured for sending lease requests to the server for using the resources for a period of time, and for receiving the lease objects from the server, wherein the lease objects each include a first recovery method and the lease requests each include a second recovery method, wherein the client is further configured to invoke the first recovery method when the client determines that a lease renew request sent to the server has not been acknowledged by the server, and the server is further configured to invoke the second recovery method when the server determines that a lease held by the client has expired and the client has not sent a request to renew or terminate the lease.
4. A method for recovering from failures in a distributed system that includes a client and a server, comprising:
invoking, by the client, a first recovery method on the server when the client determines the server failed to process a lease management request sent by the client; and
invoking, by the server, a second recovery method on the client, when the server determines that the client failed to perform lease management functions before a lease held by the client expired, wherein the lease allows the client to access a resource managed by the server for a period of time.
5. The method of claim 4, wherein the lease management functions include one of sending a request to renew the lease and sending a request to cancel the lease.
6. The method of claim 4, wherein the lease management request includes a request to renew the lease.
11. A method for recovering from failures in a distributed system that includes a client and a server, the method performed by the client, comprising:
sending a request for a lease to the server for using a managed resource for a period of time;
receiving a lease object from the server, wherein the lease object includes a plurality of methods for managing the requested lease and a recovery method for performing a recovery process associated with the managed resource;
determining that the lease is about to expire;
sending a lease renewal request to the server, in response to the determination that the lease is about to expire;
determining that the lease renewal request did not complete successfully; and
invoking the recovery method to perform the recovery process.
12. The method of claim 11, wherein the lease object includes the recovery method invoked by the client.
14. The method of claim 11, wherein the recovery process includes restarting the server.
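The client-side steps recited above (notice that the lease is about to expire, send a renewal, and invoke the recovery method if the renewal does not complete) can be sketched as a single polling step. `ClientLease`, `FakeServer`, the `margin` parameter, and the use of `ConnectionError` to signal an unreachable server are all assumptions made for this example; the restart-on-failure behaviour mirrors the "restarting the server" recovery process.

```python
from typing import Callable, List


class ClientLease:
    """Client-held lease with a renew method and a recovery method."""

    def __init__(self, expires_at: float,
                 renew: Callable[[float], bool],
                 recover: Callable[[], None]) -> None:
        self.expires_at = expires_at
        self._renew = renew
        self._recover = recover

    def tick(self, now: float, margin: float, duration: float) -> str:
        """One polling step: renew when close to expiry, recover on failure."""
        if now < self.expires_at - margin:
            return "idle"  # lease is not yet about to expire
        try:
            ok = self._renew(duration)
        except ConnectionError:
            ok = False  # renewal did not complete successfully
        if ok:
            self.expires_at = now + duration
            return "renewed"
        self._recover()
        return "recovered"


class FakeServer:
    """Stand-in for a remote server, used only to drive the example."""

    def __init__(self) -> None:
        self.up = True
        self.restarts = 0

    def renew(self, duration: float) -> bool:
        if not self.up:
            raise ConnectionError("server unreachable")
        return True

    def restart(self) -> None:
        self.up = True
        self.restarts += 1


server = FakeServer()
lease = ClientLease(expires_at=10.0, renew=server.renew, recover=server.restart)

results: List[str] = []
results.append(lease.tick(now=5.0, margin=2.0, duration=10.0))   # too early
results.append(lease.tick(now=9.0, margin=2.0, duration=10.0))   # renews
server.up = False                                                # server fails
results.append(lease.tick(now=18.0, margin=2.0, duration=10.0))  # recovery runs
```

The `margin` gives the client time to attempt a renewal before the lease actually lapses; how large it should be depends on network latency and is left open here.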
16. A computer-readable medium including instructions for performing a method, when executed by a processor, for recovering from failures in a distributed system that includes a client and a server, the method, performed by the server, comprising:
receiving a request from the client for a lease that allows the client to access a resource managed by the server for a period of time;
storing a request object associated with the request in a predetermined memory space;
granting the lease by returning to the client a lease object including methods for managing the lease; and
invoking a recovery method on the request when it is determined that the lease has expired and the client has not sent a request to renew or terminate the lease.
17. The computer-readable medium of claim 16, wherein the method further comprises deleting the request object from the predetermined memory space after the recovery method has been invoked.
18. A computer-readable medium, including instructions, for performing a method, when executed by a processor, for recovering from failures in a distributed system that includes a client and a server, comprising:
invoking, by the client, a first recovery method on the server when the client determines the server failed to process a lease management request sent by the client; and
invoking, by the server, a second recovery method on the client, when the server determines that the client failed to perform lease management functions before a lease held by the client expired, wherein the lease allows the client to access a resource managed by the server for a period of time.
19. The computer-readable medium of claim 18, wherein the lease management functions include one of sending a request to renew the lease and sending a request to cancel the lease.
20. The computer-readable medium of claim 18, wherein the lease management request includes a request to renew the lease.
25. A computer-readable medium, including instructions, for performing a method, when executed by a processor, for recovering from failures in a distributed system that includes a client and a server, the method performed by the client, comprising:
sending a request for a lease to the server for using a managed resource for a period of time;
receiving a lease object from the server, wherein the lease object includes a plurality of methods for managing the requested lease and a recovery method for performing a recovery process associated with the managed resource;
determining that the lease is about to expire;
sending a lease renewal request to the server, in response to the determination that the lease is about to expire;
determining that the lease renewal request did not complete successfully; and
invoking the recovery method to perform the recovery process.
26. The computer-readable medium of claim 25, wherein the lease object includes the recovery method invoked by the client.
28. The computer-readable medium of claim 25, wherein the recovery process includes restarting the server.
Specification