Method and apparatus for the load balancing of non-identical servers in a network environment
Abstract
A method and apparatus in a distributed data processing system for handling requests. Processing of requests received at a server system is monitored, wherein the server system includes a plurality of servers. A work load is estimated at each of the plurality of servers. The request is forwarded to a server within the plurality of servers having an estimated smallest work load.
141 Citations
22 Claims
-
1. A method in a distributed data processing system for handling requests, the method comprising the computer implemented steps of:
-
monitoring processing of requests received at a server system, wherein the server system includes a plurality of servers;
determining an estimated work load for each of the plurality of servers based on a service rate for each of the plurality of servers, wherein the plurality of servers have different service rates, wherein the different service rates for the plurality of servers are fixed; and
forwarding the request to a server within the plurality of servers having a smallest estimated work load, wherein the step of determining an estimated work load comprises dividing a total of service times for a server by a total of observed inter-arrival times for the server, wherein an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server.
Dependent claims: 2, 3, 4, 5, 6, 7
forwarding means, responsive to a subset of servers within the plurality of servers having a same estimated work load, for forwarding the request to a server within the subset having a smallest number of active requests.
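The estimate and selection steps recited above can be sketched in a few lines of Python. This is only an illustrative reading of the claim language, not the patented implementation: the names `ServerStats`, `estimated_load`, and `pick_server` are hypothetical, and treating a server with no observed inter-arrival gaps as having zero load is an assumption the claims do not address.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServerStats:
    """Hypothetical per-server record kept by the dispatcher."""
    arrival_times: List[float] = field(default_factory=list)  # arrival timestamps (seconds)
    service_times: List[float] = field(default_factory=list)  # service time of each completed request
    active_requests: int = 0  # requests forwarded but not yet completed


def estimated_load(stats: ServerStats) -> float:
    """Total of service times divided by total of observed inter-arrival times."""
    arrivals = stats.arrival_times
    # Inter-arrival time: gap between a request's arrival and the previous request's arrival.
    total_inter_arrival = sum(b - a for a, b in zip(arrivals, arrivals[1:]))
    if total_inter_arrival <= 0:
        return 0.0  # assumption: no observed gaps yet is treated as zero load
    return sum(stats.service_times) / total_inter_arrival


def pick_server(servers: Dict[str, ServerStats]) -> str:
    """Smallest estimated load wins; equal loads fall back to fewest active requests."""
    return min(
        servers,
        key=lambda name: (estimated_load(servers[name]), servers[name].active_requests),
    )
```

Sorting on the pair (estimated load, active requests) is one simple way to express the tie-break: the second element only matters when the first is equal.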
-
-
8. A method in a distributed data processing system for handling requests, the method comprising the computer implemented steps of:
-
monitoring processing of requests received at a server system, wherein the server system includes a plurality of servers;
determining an estimated work load for each of the plurality of servers based on a service rate for each of the plurality of servers, wherein the plurality of servers have different service rates, wherein the different service rates are variable; and
forwarding the request to a server within the plurality of servers having a smallest estimated work load, wherein the step of determining an estimated work load includes using an equation;
[equation not reproduced in the source text]
wherein N is a number of requests completed by the server, A is an actual amount of work completed by the server, I is a total of observed inter-arrival times for the server, in which an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server, S is a total observed service rate for the server.
-
-
9. A server system comprising:
-
a data transfer mechanism;
a plurality of servers coupled to the data transfer system; and
a network dispatcher, wherein the network dispatcher monitors processing of each request received at the server system, calculates an estimated work load for each of the plurality of servers based on a service rate for each of the plurality of servers, and forwards the request to a server within the plurality of servers having a lowest estimated amount of work to process, wherein the plurality of servers have different service rates, wherein the different service rates are fixed, and wherein the service processor calculates an estimated work load for a server within the plurality of servers by dividing a total of service times for the server by a total of observed inter-arrival times for the server, wherein an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server.
Dependent claims: 10, 11
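One way to picture the network dispatcher of this claim is as a small stateful object that updates running totals as requests arrive and complete, rather than recomputing sums from a full history. The sketch below is an assumption-laden illustration: `ServerRecord`, `NetworkDispatcher`, `forward`, and `complete` are hypothetical names, timestamps are passed in explicitly, and a server with no history is assumed to have zero load.

```python
from typing import Dict, Iterable, Optional


class ServerRecord:
    """Hypothetical running totals for one server."""

    def __init__(self) -> None:
        self.total_service_time = 0.0   # sum of service times of completed requests
        self.total_inter_arrival = 0.0  # sum of gaps between consecutive arrivals
        self.last_arrival: Optional[float] = None

    def load(self) -> float:
        if self.total_inter_arrival <= 0:
            return 0.0  # assumption: no observed gaps yet counts as zero load
        return self.total_service_time / self.total_inter_arrival


class NetworkDispatcher:
    """Monitors requests, estimates per-server load, forwards to the lowest."""

    def __init__(self, names: Iterable[str]) -> None:
        self.servers: Dict[str, ServerRecord] = {n: ServerRecord() for n in names}

    def forward(self, now: float) -> str:
        # Choose the server with the lowest estimated load (first one on ties).
        name = min(self.servers, key=lambda n: self.servers[n].load())
        rec = self.servers[name]
        if rec.last_arrival is not None:
            rec.total_inter_arrival += now - rec.last_arrival
        rec.last_arrival = now
        return name

    def complete(self, name: str, service_time: float) -> None:
        # Called when a server reports that a request finished.
        self.servers[name].total_service_time += service_time
```

Keeping only two running totals and the last arrival time per server means each arrival or completion is handled in constant time per update; the selection itself still scans all servers.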
-
-
12. A server system comprising:
-
a data transfer mechanism;
a plurality of servers coupled to the data transfer system; and
a network dispatcher, wherein the network dispatcher monitors processing of each request received at the server system, calculates an estimated work load for each of the plurality of servers based on a service rate for each of the plurality of servers, and forwards the request to a server within the plurality of servers having a lowest estimated amount of work to process, wherein the plurality of servers have different service rates, wherein the different service rates for the plurality of servers are variable, and wherein the service processor calculates an estimated work load for a server within the plurality of servers as follows:
[equation not reproduced in the source text]
wherein N is a number of requests completed by the server, A is an actual amount of work completed by the server, I is a total of observed inter-arrival times for the server, in which an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server, S is a total observed service rate for the server.
-
-
13. A distributed data processing system for handling requests, the distributed data processing system comprising:
-
monitoring means for monitoring processing of requests received at a server system, wherein the server system includes a plurality of servers;
estimating means for determining an estimated work load for each of the plurality of servers based on a service rate for each of the plurality of servers; and
forwarding means for forwarding the request to a server within the plurality of servers having a smallest estimated work load, wherein the plurality of servers have different service rates, wherein the different service rates for the plurality of servers are fixed, and wherein the estimating means comprises;
dividing means for dividing a total of service times for a server by a total of observed inter-arrival times for the server, wherein an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server.
Dependent claims: 14, 15, 16, 17, 18, 19
forwarding means, responsive to a subset of servers within the plurality of servers having a same estimated work load, for forwarding the request to a server within the subset having a smallest number of active requests.
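As a worked illustration of the dividing step, the ratio can be written out with made-up numbers. The symbols are assumptions introduced here, not notation from the patent: \(s_i\) is the service time of the \(i\)-th completed request, \(a_j\) is the arrival time of the \(j\)-th request at the server, and \(\hat{L}\) is the estimated work load. The two servers X and Y and their timings are hypothetical.

```latex
\[
\hat{L} \;=\; \frac{\sum_{i=1}^{m} s_i}{\sum_{j=2}^{n} \left(a_j - a_{j-1}\right)}
\]
% Hypothetical data: both servers see arrivals every 4 s (three gaps observed);
% server X needed 2 s, 3 s and 1 s per request, server Y needed 1 s each.
\[
\hat{L}_X = \frac{2 + 3 + 1}{4 + 4 + 4} = 0.5,
\qquad
\hat{L}_Y = \frac{1 + 1 + 1}{4 + 4 + 4} = 0.25
\]
```

Under these assumed numbers the next request would go to server Y, since it has the smaller estimate; had the two estimates been equal, the tie-break above would choose the server with fewer active requests.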
-
-
20. A distributed data processing system for handling requests, the distributed data processing system comprising:
-
monitoring means for monitoring processing of requests received at a server system, wherein the server system includes a plurality of servers;
estimating means for determining an estimated work load for each of the plurality of servers based on a service rate for each of the plurality of servers; and
forwarding means for forwarding the request to a server within the plurality of servers having a smallest estimated work load, wherein the plurality of servers have different service rates, wherein the different service rates are variable, and wherein the estimating means includes using an equation;
[equation not reproduced in the source text]
wherein N is a number of requests completed by the server, A is an actual amount of work completed by the server, I is a total of observed inter-arrival times for the server, in which an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server, S is a total observed service rate for the server.
-
-
21. A computer program product in a computer readable medium for handling requests, the computer program product comprising:
-
first instructions for monitoring processing of requests received at a server system, wherein the server system includes a plurality of servers;
second instructions for determining an estimated work load at each of the plurality of servers based on a service rate for each of the plurality of servers, wherein the plurality of servers have different service rates, wherein the different service rates for the plurality of servers are fixed; and
third instructions for forwarding the request to a server within the plurality of servers having a smallest estimated work load, wherein the instructions for determining an estimated work load comprises instructions for dividing a total of service times for a server by a total of observed inter-arrival times for the server, wherein an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server.
-
-
22. A computer program product in a computer readable medium for handling requests, the computer program product comprising:
-
first instructions for monitoring processing of requests received at a server system, wherein the server system includes a plurality of servers;
second instructions for determining an estimated work load for each of the plurality of servers based on a service rate for each of the plurality of servers, wherein the plurality of servers have different service rates, wherein the different service rates are variable; and
third instructions for forwarding the request to a server within the plurality of servers having a smallest estimated work load, wherein the instructions for determining an estimated work load includes instructions for using an equation;
[equation not reproduced in the source text]
wherein N is a number of requests completed by the server, A is an actual amount of work completed by the server, I is a total of observed inter-arrival times for the server, in which an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server, S is a total observed service rate for the server.
-
Specification