System and method for preserving state for a cluster of data servers in the presence of load-balancing, failover, and fail-back events
Abstract
A state management system preserves state for a cluster of file servers in a cluster file system in the presence of load-balancing, failover, and fail-back events. The system provides a file and record locking solution for a clustered network attached storage (NAS) system running on top of a cluster file system. The system employs a lock ownership scheme in which ownership identifiers are guaranteed to be unique across the clustered servers and across the various protocols those servers may be exporting. The system supports multi-protocol clustered NAS gateways, NAS gateway server failover and fail-back, and load-balancing architectures. The system further eliminates the need for a lock migration protocol, resulting in improved efficiency and simplicity.
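The lock ownership idea in the abstract can be illustrated with a small sketch. The abstract does not say how ownership identifiers are composed, so the fields of `LockOwner` below (protocol, client identity, and a per-client owner value) are assumptions chosen for illustration, and `LockTable` is a hypothetical stand-in for the locks kept by the cluster file system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LockOwner:
    """Hypothetical cluster-wide lock owner identifier (composition is assumed)."""
    protocol: str     # exporting protocol, e.g. "nfs" or "cifs"
    client: str       # network identity of the client
    local_owner: int  # owner value the client uses within its own protocol


class LockTable:
    """Toy byte-range lock table standing in for the cluster file system."""

    def __init__(self):
        # (path, start, end) -> LockOwner, with half-open ranges [start, end)
        self._locks = {}

    def try_lock(self, path, start, end, owner):
        """Grant the lock unless a different owner holds an overlapping range."""
        for (p, s, e), holder in self._locks.items():
            overlaps = p == path and s < end and start < e
            if overlaps and holder != owner:
                return False
        self._locks[(path, start, end)] = owner
        return True


table = LockTable()
nfs_owner = LockOwner("nfs", "10.0.0.5", 1)
cifs_owner = LockOwner("cifs", "10.0.0.5", 1)

# Same client and same protocol-local owner value, but different protocols:
# the composite identifiers differ, so the two owners never alias each other.
print(nfs_owner != cifs_owner)
print(table.try_lock("/export/f", 0, 100, nfs_owner))
print(table.try_lock("/export/f", 50, 150, cifs_owner))
```

Because the identifier in this sketch does not name the serving node, it stays valid when a client is moved to another server, which is one plausible reading of why no lock migration step would be required.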
20 Claims
1. A method for preserving a state for a cluster of data servers with a shared storage, the method comprising:
interacting with at least one client via a communication channel, to generate state information;
recognizing an event that causes a data server to fail;
establishing at least one of the data servers as a replacement server for the failed data server;
initiating a recovery of the failed data server;
providing the replacement server with an identity of the failed data server and of clients in communication with the failed data server, prior to failure;
providing the replacement server with the state information associated with the clients at the time of the data server failure;
redirecting the clients to the replacement server;
the replacement server preserving the provided state information associated with the redirected clients; and
the replacement server serving the redirected clients.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
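The sequence of steps recited in claim 1 can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the patented implementation: the `DataServer` and `Cluster` classes, the `shared_state` dictionary standing in for the shared storage, and the choice of the first live server as the replacement are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataServer:
    name: str
    alive: bool = True
    # client id -> state generated while serving that client
    client_state: dict = field(default_factory=dict)
    # identities of failed servers this server is standing in for
    serving_for: set = field(default_factory=set)


class Cluster:
    def __init__(self, servers):
        self.servers = {s.name: s for s in servers}
        self.routes = {}        # client id -> name of the server serving it
        self.shared_state = {}  # server name -> {client id: state} (shared storage stand-in)
        self.recovering = set() # names of failed servers under recovery

    def interact(self, client, server_name, state):
        """A client interaction generates state, also recorded on shared storage."""
        self.servers[server_name].client_state[client] = state
        self.shared_state.setdefault(server_name, {})[client] = state
        self.routes[client] = server_name

    def handle_failure(self, failed_name):
        """Walk through the claimed steps for one failed server."""
        self.servers[failed_name].alive = False          # recognize the failure
        # Establish a replacement (assumed policy: first live server;
        # raises StopIteration if no server is left alive).
        replacement = next(s for s in self.servers.values() if s.alive)
        self.recovering.add(failed_name)                 # initiate recovery
        replacement.serving_for.add(failed_name)         # identity of failed server
        saved = dict(self.shared_state.get(failed_name, {}))
        for client, state in saved.items():
            replacement.client_state[client] = state     # state at time of failure
            self.routes[client] = replacement.name       # redirect the client
        # Keep the shared copy consistent with the new assignment.
        self.shared_state.setdefault(replacement.name, {}).update(saved)
        return replacement


a, b = DataServer("a"), DataServer("b")
cluster = Cluster([a, b])
cluster.interact("client-1", "a", {"locks": ["/export/f:0-100"]})
replacement = cluster.handle_failure("a")
print(replacement.name, cluster.routes["client-1"], b.client_state["client-1"])
```

After `handle_failure` returns, the replacement server holds the failed server's identity and its clients' state, and the routing table points those clients at the replacement, mirroring the order of the claimed steps.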
15. A computer program product including a plurality of executable instruction codes on a computer readable medium, for preserving a state for a cluster of data servers with a shared storage, the computer program product comprising:
a first set of instruction codes for interacting with at least one client via a communication channel, to generate state information;
a second set of instruction codes for recognizing an event that causes a data server to fail;
a third set of instruction codes for establishing at least one of the data servers as a replacement server for the failed data server;
a fourth set of instruction codes for initiating a recovery of the failed data server;
a fifth set of instruction codes for providing the replacement server with an identity of the failed data server and of clients in communication with the failed data server, prior to failure;
a sixth set of instruction codes for providing the replacement server with the state information associated with the clients at the time of the data server failure;
a seventh set of instruction codes for redirecting the clients to the replacement server;
wherein the replacement server preserves the provided state information associated with the redirected clients; and
wherein the replacement server serves the redirected clients.
Dependent claims: 16, 17
18. A system for preserving a state for a cluster of data servers with a shared storage, the system comprising:
a data server for interacting with at least one client via a communication channel, to generate state information;
a clustering module for recognizing an event that causes a data server to fail;
the clustering module establishing at least one of the data servers as a replacement server for the failed data server;
a cluster leader initiating a recovery of the failed data server;
the cluster leader providing the replacement server with an identity of the failed data server and of clients in communication with the failed data server, prior to failure;
the cluster leader further providing the replacement server with the state information associated with the clients at the time of the data server failure;
the cluster leader redirecting the clients to the replacement server;
the replacement server preserving the provided state information associated with the redirected clients; and
the replacement server serving the redirected clients.
Dependent claims: 19, 20
Specification