Managing remote procedure calls when a server is unavailable

US 9,141,449 B2
Filed: 10/30/2009
Issued: 09/22/2015
Est. Priority Date: 10/30/2009
Status: Expired due to Fees


Abstract

A server node can monitor the status of servers in a server cluster. The node may receive an alert indicating that a server in the server cluster is unavailable. In response to the alert, the node can send instructions that cause pending remote procedure call requests to be canceled and then reissued to another server in the server cluster instead of to the first server.
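The monitoring role described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `ManagementNode`, the server names, and the shape of the returned instruction are all assumptions made for illustration, and the "first healthy server" selection rule is a placeholder for whatever policy a real cluster would use.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ManagementNode:
    """Hypothetical central node that tracks server health for a cluster."""
    servers: list                                   # all known server names (assumed)
    down: set = field(default_factory=set)          # servers reported unavailable

    def handle_alert(self, failed: str) -> Optional[dict]:
        """Record an alert and build the instruction that would go to clients."""
        self.down.add(failed)
        healthy = [s for s in self.servers if s not in self.down]
        if not healthy:
            return None  # no server left to fail over to
        # Instruction: cancel pending RPCs to `failed`, reissue them to `reissue_to`.
        return {"inoperative": failed, "reissue_to": healthy[0]}


node = ManagementNode(servers=["mds-1", "mds-2", "mds-3"])
instruction = node.handle_alert("mds-1")
print(instruction)  # {'inoperative': 'mds-1', 'reissue_to': 'mds-2'}
```

In this sketch the node only builds the instruction; delivering it to clients is left out because the abstract does not say how that message is transported.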

14 Claims

1. A computer-implemented method for managing a server cluster, said method comprising:
- issuing a remote procedure call (RPC) request, via at least one computer processor in a client device, from the client device to a first server for processing said RPC request, wherein said first server is located in said server cluster apart from said client device and wherein said server cluster comprises metadata servers that are coupled to a plurality of data servers;
  
  adding, via said at least one computer processor, an entry for said RPC request to an RPC table on said client device;
  
  receiving, at said client device, a message that said first server is inoperative, wherein said message is received from a central management server responsible for monitoring server health in said server cluster and wherein said message is sent in response to an alert to the central management server, wherein said alert indicates said first server has not responded to a health check message issued by said client device, said health check message having a timeout period that is shorter than a timeout period for said RPC request; and
  
  in response to receiving said message, said client device canceling said RPC request, by clearing said RPC request from said RPC table, and then said client device reissuing, via said at least one computer processor, said RPC request to a second server for processing said RPC request, wherein said second server is located in said server cluster apart from said client device.
- 2. The computer-implemented method of claim 1 further comprising:
  - monitoring status of servers in said server cluster; and
  - selecting said second server based on said monitoring.
- 3. The computer-implemented method of claim 1 wherein said alert indicates said first server has stopped sending heartbeat messages to said client device.
- 4. The computer-implemented method of claim 1 wherein said metadata servers are coupled to the plurality of data servers via a local area network.
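
The client-side steps of claim 1 (add an entry to an RPC table, cancel by clearing that entry, then reissue to a second server) might look roughly like the following. This is a sketch under stated assumptions: `RpcClient`, its method names, the `send` callback, and the sample payloads are hypothetical and do not come from the patent.

```python
import itertools


class RpcClient:
    """Hypothetical client that tracks in-flight requests in an RPC table."""

    def __init__(self, send):
        self._send = send              # assumed transport: callable(server, request_id, payload)
        self._ids = itertools.count(1)
        self.rpc_table = {}            # request_id -> (server, payload)

    def issue(self, server, payload):
        request_id = next(self._ids)
        self.rpc_table[request_id] = (server, payload)  # add the table entry
        self._send(server, request_id, payload)
        return request_id

    def complete(self, request_id):
        """Drop the entry once a response arrives."""
        self.rpc_table.pop(request_id, None)

    def on_server_inoperative(self, failed, replacement):
        """React to the management server's message: cancel, then reissue."""
        pending = [(rid, payload)
                   for rid, (server, payload) in self.rpc_table.items()
                   if server == failed]
        reissued = []
        for rid, payload in pending:
            del self.rpc_table[rid]                     # cancel by clearing the entry
            reissued.append(self.issue(replacement, payload))
        return reissued


sent = []
client = RpcClient(send=lambda server, rid, payload: sent.append((server, rid, payload)))
client.issue("mds-1", "stat /a")
client.issue("mds-2", "stat /b")
client.on_server_inoperative(failed="mds-1", replacement="mds-2")
print(client.rpc_table)
```

The sketch gives a reissued request a fresh identifier so that a late response from the failed server cannot match a live table entry; the claims do not specify this detail.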

5. A non-transitory computer-readable storage device having computer-executable instructions for causing a computer system to perform a method of managing remote procedure calls, said method comprising:
- sending, via at least one computer processor of the computer system, a remote procedure call (RPC) request having a first timeout period associated therewith to a first server for processing said RPC request, wherein said first server is located in a server cluster apart from said computer system and wherein said server cluster comprises metadata servers that are coupled to a plurality of data servers;
  
  adding an entry for said RPC request to an RPC table stored on said computer system;
  
  receiving, at said computer system, a message that said first server is inoperative based upon a health check message to said first server, said health check message having a second timeout period associated therewith that is shorter than said first timeout period, wherein said message is received from a central management server responsible for monitoring server health in said server cluster and wherein said message is sent in response to an alert to the central management server, wherein said alert indicates said first server has not responded to said health check message issued by said computer system;
  
  in response to receiving said message, canceling said RPC request to said first server by clearing said RPC request from said RPC table before said first timeout period expires if a response to said health check message is not received within said second timeout period; and
  
  after said canceling, resending said RPC request from said computer system to a second server for processing said RPC request, wherein said second server is located in said server cluster apart from said computer system.
- 6. The non-transitory computer-readable storage device of claim 5 wherein said method further comprises notifying the central management server if said response to said health check message is not received within said second timeout period.
- 7. The non-transitory computer-readable storage device of claim 6 wherein said method further comprises receiving address information for a server in response to said notifying.
- 8. The non-transitory computer-readable storage device of claim 5 wherein said message further comprises address information.
- 9. The non-transitory computer-readable storage device of claim 5 wherein said method further comprises canceling all pending RPC requests to said first server if said response to said health check message is not received within said second timeout period.
- 10. The non-transitory computer-readable storage device medium of claim 5 wherein said method further comprises:
  - accessing a list of a plurality of servers; and
  - selecting said second server from said list.
- 11. The non-transitory computer-readable storage device of claim 5 wherein said metadata servers are coupled to the plurality of data servers via a local area network.
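
The timing relationship in claims 5 and 6 (a health check whose timeout is shorter than the RPC timeout, with a notification when the check goes unanswered) can be sketched with `asyncio.wait_for`. The two timeout values, the `probe` and `notify` callbacks, and the simulated servers are assumptions chosen for illustration; the claims give no concrete numbers.

```python
import asyncio

RPC_TIMEOUT = 2.0            # assumed "first timeout period", in seconds
HEALTH_CHECK_TIMEOUT = 0.1   # assumed "second timeout period"; must be shorter


async def health_check(probe, notify):
    """Return True if the probe is answered in time; otherwise notify and return False."""
    try:
        await asyncio.wait_for(probe(), timeout=HEALTH_CHECK_TIMEOUT)
        return True
    except asyncio.TimeoutError:
        notify()  # hypothetical alert to the central management server
        return False


async def unresponsive():
    await asyncio.sleep(10)  # stands in for a server that does not answer


async def responsive():
    await asyncio.sleep(0)   # stands in for a server that answers immediately


alerts = []
ok = asyncio.run(health_check(unresponsive, lambda: alerts.append("mds-1")))
print(ok, alerts)
```

Because `HEALTH_CHECK_TIMEOUT` is much smaller than `RPC_TIMEOUT`, the failure is reported long before the pending RPC itself would have timed out, which is the point of using two different timeout periods.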

12. A computer-implemented method for managing a server cluster, said method comprising:
- issuing, via at least one computer processor of a client device, a remote procedure call (RPC) request from the client device to a first server for processing said RPC request, wherein said first server is located in said server cluster apart from said client device and wherein said server cluster comprises metadata servers that are coupled to a plurality of data servers;
  
  adding, via said at least one computer processor, an entry for said RPC request to an RPC table on said client device;
  
  said client device determining, via said at least one computer processor, that said first server is inoperable based upon said client device receiving a message that said first server is inoperative, wherein said message is received from a central management server responsible for monitoring server health in said server cluster and wherein said message is sent in response to an alert to the central management server, wherein said alert indicates said first server has not responded to a health check message issued by said client device, said health check message having a timeout period that is shorter than a timeout period for said RPC request; and
  
  in response to said determining, said client device canceling said RPC request by clearing said RPC request from said RPC table, and then said client device reissuing said RPC request to a second server for processing said RPC request instead of to said first server, wherein said second server is located in said server cluster apart from said client device.
- 13. The computer-implemented method of claim 12 wherein said determining comprises determining that said first server has stopped sending heartbeat messages, said heartbeat messages having a detection period that is shorter than the timeout period for said RPC request.
- 14. The computer-implemented method of claim 12 wherein said metadata servers are coupled to the plurality of data servers via a local area network.

Current Assignee: Veritas Technologies, LLC (Whitehouse Group Ltd.)
Original Assignee: Symantec Corporation (NortonLifeLock Inc.)
Inventors: Shyam, Nagaraj; Harmer, Craig; Beck, Ken
Primary Examiner(s): Kraft, Shih-Wei
Application Number: US12/610,049
Publication Number: US 20110107358A1
Time in Patent Office: 2,153 Days
Field of Search: None
US Class Current: 1/1
CPC Class Codes:
G06F 2209/503   Resource availability
G06F 9/5027   the resource being a machin...
G06F 9/547   Remote procedure calls [RPC...
