Managing remote procedure calls when a server is unavailable
Abstract
A server node can monitor the status of servers in a server cluster. The node may receive an alert indicating that a server in the server cluster is unavailable. In response to the alert, the node can send instructions that cause pending remote procedure call requests to be canceled and then reissued to another server in the server cluster instead of to the unavailable server.
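The monitoring role described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `ManagementNode`, the callback-based `subscribe` mechanism, and the server IDs are hypothetical and do not come from the patent, which does not specify how instructions are delivered.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set

# Hypothetical callback type: a client-side handler that is given the ID of
# the server reported as unavailable.
Listener = Callable[[str], None]


@dataclass
class ManagementNode:
    """Illustrative monitoring node (assumed design, not the patent's)."""
    servers: Set[str]
    unavailable: Set[str] = field(default_factory=set)
    listeners: List[Listener] = field(default_factory=list)

    def subscribe(self, listener: Listener) -> None:
        """Register a client handler to be told about unavailable servers."""
        self.listeners.append(listener)

    def on_alert(self, server_id: str) -> None:
        """Handle an alert that `server_id` is unavailable."""
        if server_id not in self.servers or server_id in self.unavailable:
            return  # unknown server, or already reported: nothing to do
        self.unavailable.add(server_id)
        for notify in self.listeners:
            # Instruct each client to cancel pending RPCs to this server
            # and reissue them to another server.
            notify(server_id)

    def healthy_servers(self) -> List[str]:
        """Servers that have not been reported as unavailable."""
        return sorted(self.servers - self.unavailable)


node = ManagementNode(servers={"mds-1", "mds-2"})
notified: List[str] = []
node.subscribe(notified.append)
node.on_alert("mds-1")
node.on_alert("mds-1")  # a repeated alert does not notify clients again
```

In this sketch a repeated alert for the same server is ignored, so clients are instructed only once per failure; that deduplication is a design choice of the example, not something the abstract states.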
14 Claims
1. A computer-implemented method for managing a server cluster, said method comprising:
issuing a remote procedure call (RPC) request, via at least one computer processor in a client device, from the client device to a first server for processing said RPC request, wherein said first server is located in said server cluster apart from said client device and wherein said server cluster comprises metadata servers that are coupled to a plurality of data servers;
adding, via said at least one computer processor, an entry for said RPC request to an RPC table on said client device;
receiving, at said client device, a message that said first server is inoperative, wherein said message is received from a central management server responsible for monitoring server health in said server cluster and wherein said message is sent in response to an alert to the central management server, wherein said alert indicates said first server has not responded to a health check message issued by said client device, said health check message having a timeout period that is shorter than a timeout period for said RPC request; and
in response to receiving said message, said client device canceling said RPC request, by clearing said RPC request from said RPC table, and then said client device reissuing, via said at least one computer processor, said RPC request to a second server for processing said RPC request, wherein said second server is located in said server cluster apart from said client device.
Dependent claims: 2, 3, 4
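The client-side steps of claim 1 (issue, record in an RPC table, cancel by clearing the entry, reissue to a second server) can be sketched as below. This is a minimal illustration, not the claimed implementation: the names `Client`, `RpcEntry`, and `on_server_inoperative`, the choice of the first remaining server as the "second server", and the `sent` list standing in for the network are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RpcEntry:
    server_id: str  # server the request was sent to
    payload: str    # opaque request body (hypothetical)


@dataclass
class Client:
    servers: List[str]  # candidate servers, in preference order
    rpc_table: Dict[int, RpcEntry] = field(default_factory=dict)
    sent: List[Tuple[int, str, str]] = field(default_factory=list)  # stand-in for the network
    next_id: int = 0

    def issue(self, server_id: str, payload: str) -> int:
        """Send an RPC request and add an entry for it to the RPC table."""
        rpc_id = self.next_id
        self.next_id += 1
        self.rpc_table[rpc_id] = RpcEntry(server_id, payload)
        self.sent.append((rpc_id, server_id, payload))
        return rpc_id

    def on_server_inoperative(self, failed: str) -> List[int]:
        """React to the management server's message that `failed` is down."""
        alternates = [s for s in self.servers if s != failed]
        if not alternates:
            raise RuntimeError("no other server available for reissue")
        # Snapshot the affected entries before mutating the table.
        pending = [(i, e) for i, e in self.rpc_table.items() if e.server_id == failed]
        new_ids = []
        for rpc_id, entry in pending:
            del self.rpc_table[rpc_id]  # cancel by clearing the table entry
            new_ids.append(self.issue(alternates[0], entry.payload))  # reissue
        return new_ids


client = Client(servers=["mds-1", "mds-2"])
first = client.issue("mds-1", "stat /x")
other = client.issue("mds-2", "stat /y")
reissued = client.on_server_inoperative("mds-1")
```

After the last call, the entry for `first` is gone from `client.rpc_table`, the request to `mds-2` is untouched, and one new entry carries the same payload addressed to `mds-2`.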
5. A non-transitory computer-readable storage device having computer-executable instructions for causing a computer system to perform a method of managing remote procedure calls, said method comprising:
sending, via at least one computer processor of the computer system, a remote procedure call (RPC) request having a first timeout period associated therewith to a first server for processing said RPC request, wherein said first server is located in a server cluster apart from said computer system and wherein said server cluster comprises metadata servers that are coupled to a plurality of data servers;
adding an entry for said RPC request to an RPC table stored on said computer system;
receiving, at said computer system, a message that said first server is inoperative based upon a health check message to said first server, said health check message having a second timeout period associated therewith that is shorter than said first timeout period, wherein said message is received from a central management server responsible for monitoring server health in said server cluster and wherein said message is sent in response to an alert to the central management server, wherein said alert indicates said first server has not responded to said health check message issued by said computer system;
in response to receiving said message, canceling said RPC request to said first server by clearing said RPC request from said RPC table before said first timeout period expires if a response to said health check message is not received within said second timeout period; and
after said canceling, resending said RPC request from said computer system to a second server for processing said RPC request, wherein said second server is located in said server cluster apart from said computer system.
Dependent claims: 6, 7, 8, 9, 10, 11
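The relationship between the two timeout periods in claim 5 can be made concrete with a small timing sketch. It assumes the RPC is sent at time 0, measures all times in seconds from that point, and ignores the delay of the alert and of the management server's message. The names `Timeouts` and `early_cancel_time` and the numeric values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Timeouts:
    rpc_timeout: float           # the claim's "first timeout period"
    health_check_timeout: float  # the claim's "second timeout period"

    def __post_init__(self) -> None:
        # The claim requires the health check timeout to be the shorter one.
        if not self.health_check_timeout < self.rpc_timeout:
            raise ValueError("health check timeout must be shorter than RPC timeout")


def early_cancel_time(
    t: Timeouts, check_sent_at: float, reply_at: Optional[float] = None
) -> Optional[float]:
    """Time at which the RPC is canceled early, or None if it is not.

    Times are seconds since the RPC was sent (assumed to be time 0).
    """
    deadline = check_sent_at + t.health_check_timeout
    if reply_at is not None and reply_at <= deadline:
        return None  # server answered the health check in time
    if deadline >= t.rpc_timeout:
        return None  # the RPC's own timeout expires first
    return deadline


timeouts = Timeouts(rpc_timeout=30.0, health_check_timeout=2.0)
no_reply = early_cancel_time(timeouts, check_sent_at=1.0)
replied = early_cancel_time(timeouts, check_sent_at=1.0, reply_at=1.5)
too_late = early_cancel_time(timeouts, check_sent_at=29.0)
```

With these assumed values, an unanswered health check sent 1 second after the RPC leads to cancellation at 3 seconds, well before the 30-second RPC timeout, which is the benefit the shorter second timeout is meant to provide.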
12. A computer-implemented method for managing a server cluster, said method comprising:
issuing, via at least one computer processor of a client device, a remote procedure call (RPC) request from the client device to a first server for processing said RPC request, wherein said first server is located in said server cluster apart from said client device and wherein said server cluster comprises metadata servers that are coupled to a plurality of data servers;
adding, via said at least one computer processor, an entry for said RPC request to an RPC table on said client device;
said client device determining, via said at least one computer processor, that said first server is inoperable based upon said client device receiving a message that said first server is inoperative, wherein said message is received from a central management server responsible for monitoring server health in said server cluster and wherein said message is sent in response to an alert to the central management server, wherein said alert indicates said first server has not responded to a health check message issued by said client device, said health check message having a timeout period that is shorter than a timeout period for said RPC request; and
in response to said determining, said client device canceling said RPC request by clearing said RPC request from said RPC table, and then said client device reissuing said RPC request to a second server for processing said RPC request instead of to said first server, wherein said second server is located in said server cluster apart from said client device.
Dependent claims: 13, 14
Specification