Method and apparatus for the load balancing of non-identical servers in a network environment

US 6,748,414 B1
Filed: 11/15/1999
Issued: 06/08/2004
Est. Priority Date: 11/15/1999
Status: Expired due to Term

Abstract

A method and apparatus in a distributed data processing system for handling requests. Processing of requests received at a server system is monitored, wherein the server system includes a plurality of servers. A work load is estimated at each of the plurality of servers. The request is forwarded to a server within the plurality of servers having an estimated smallest work load.
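
The monitor, estimate, and forward flow summarized above can be sketched as follows, using the estimate recited in claim 1 (total service time divided by total inter-arrival time). This is a minimal illustration, not the patented implementation: the `ServerStats` structure, the function names, and the rule that a server with no observed inter-arrival time counts as idle are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ServerStats:
    """Running totals a dispatcher might keep for one server (hypothetical structure)."""
    total_service_time: float = 0.0        # sum of observed service times, in seconds
    total_inter_arrival_time: float = 0.0  # sum of observed inter-arrival times, in seconds


def estimated_work_load(stats: ServerStats) -> float:
    """Total service time divided by total inter-arrival time, as recited in claim 1."""
    if stats.total_inter_arrival_time == 0.0:
        # Assumption: with no observed inter-arrival time, treat the server as idle.
        return 0.0
    return stats.total_service_time / stats.total_inter_arrival_time


def pick_server(servers: dict[str, ServerStats]) -> str:
    """Return the name of the server with the smallest estimated work load."""
    return min(servers, key=lambda name: estimated_work_load(servers[name]))


# Two servers that saw the same arrival pattern but needed different service time.
servers = {
    "fast": ServerStats(total_service_time=2.0, total_inter_arrival_time=10.0),
    "slow": ServerStats(total_service_time=6.0, total_inter_arrival_time=10.0),
}
chosen = pick_server(servers)  # the next request would be forwarded here
```

With these illustrative totals the estimates are 0.2 and 0.6, so the next request would go to the server labeled `fast`.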

141 Citations

22 Claims

1. A method in a distributed data processing system for handling requests, the method comprising the computer implemented steps of:
- monitoring processing of requests received at a server system, wherein the server system includes a plurality of servers;
  
  determining an estimated work load for each of the plurality of servers based on a service rate for each of the plurality of servers, wherein the plurality of servers have different service rates, wherein the different service rates for the plurality of servers are fixed; and
  
  forwarding the request to a server within the plurality of servers having a smallest estimated work load, wherein the step of determining an estimated work load comprises dividing a total of service times for a server by a total of observed inter-arrival times for the server, wherein an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The method of claim 1, wherein the step of monitoring processing of requests at the server system includes monitoring arrival time of a request and associating the arrival time with a server within the plurality of servers assigned to process the request.
  - 3. The method of claim 1, wherein the server system includes a network dispatcher and wherein the steps of monitoring, determining, and forwarding are located in the network dispatcher.
  - 4. The method of claim 1, wherein the plurality of servers are a plurality of web servers.
  - 5. The method of claim 1, wherein the distributed data processing system is an Internet.
  - 6. The method of claim 1, wherein the distributed data processing system is an intranet.
  - 7. The method of claim 1, further comprising:

8. A method in a distributed data processing system for handling requests, the method comprising the computer implemented steps of:
- monitoring processing of requests received at a server system, wherein the server system includes a plurality of servers;
  
  determining an estimated work load for each of the plurality of servers based on a service rate for each of the plurality of servers, wherein the plurality of servers have different service rates, wherein the different service rates are variable; and
  
  forwarding the request to a server within the plurality of servers having a smallest estimated work load, wherein the step of determining an estimated work load includes using an equation:
  
  $\frac{(N) \cdot (A)}{(I) \cdot (S)}$ wherein N is a number of requests completed by the server, A is an actual amount of work completed by the server, I is a total of observed inter-arrival times for the server, in which an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server, S is a total observed service rate for the server.
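
The variable-rate estimate in claim 8 can be evaluated directly from the four totals it names. The sketch below is only an illustration under stated assumptions: the function and parameter names are hypothetical, the claim does not specify units for A or S, and returning 0.0 when the denominator is zero is a choice made for this example.

```python
def variable_rate_work_load(
    n_completed: int,             # N: number of requests completed by the server
    actual_work: float,           # A: actual amount of work completed by the server
    total_inter_arrival: float,   # I: total of observed inter-arrival times
    total_service_rate: float,    # S: total observed service rate
) -> float:
    """Evaluate (N * A) / (I * S) for one server."""
    denominator = total_inter_arrival * total_service_rate
    if denominator == 0.0:
        # Assumption: no observations yet, so report an empty work load.
        return 0.0
    return (n_completed * actual_work) / denominator


# Illustrative numbers only; they are not taken from the patent.
load_a = variable_rate_work_load(4, 8.0, 16.0, 2.0)   # (4 * 8) / (16 * 2)
load_b = variable_rate_work_load(2, 3.0, 12.0, 1.0)   # (2 * 3) / (12 * 1)
target = "a" if load_a < load_b else "b"
```

Here `load_a` is 1.0 and `load_b` is 0.5, so a dispatcher following the claim would forward the next request to server `b`.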

9. A server system comprising:
- a data transfer mechanism;
  
  a plurality of servers coupled to the data transfer system; and
  
  a network dispatcher, wherein the network dispatcher monitors processing of each request received at the server system, calculates an estimated work load for each of the plurality of servers based on a service rate for each of the plurality of servers, and forwards the request to a server within the plurality of servers having a lowest estimated amount of work to process, wherein the plurality of servers have different service rates, wherein the different service rates are fixed, and wherein the service processor calculates an estimated work load for a server within the plurality of servers by dividing a total of service times for the server by a total of observed inter-arrival times for the server, wherein an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server.
- Dependent claims (10, 11):
  - 10. The server system of claim 9, wherein the network dispatcher monitors arrival time of a request and associates the arrival time with a server within the plurality of servers assigned to process the request.
  - 11. The server system of claim 9, wherein the plurality of servers is a plurality of web servers.
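
The bookkeeping described in claim 10, where the dispatcher associates each arrival time with the server assigned to the request, could look roughly like the sketch below. The `ArrivalMonitor` class and its method names are hypothetical, and the first request sent to a server is assumed to contribute no inter-arrival time because it has no predecessor.

```python
class ArrivalMonitor:
    """Tracks per-server arrival times and accumulates inter-arrival totals."""

    def __init__(self) -> None:
        self._last_arrival: dict[str, float] = {}        # server -> most recent arrival time
        self.total_inter_arrival: dict[str, float] = {}  # server -> sum of inter-arrival times

    def record_arrival(self, server: str, arrival_time: float) -> None:
        """Associate an arrival time with the server assigned to the request."""
        previous = self._last_arrival.get(server)
        if previous is not None:
            # Inter-arrival time: gap between this request and the previous
            # request sent to the same server.
            gap = arrival_time - previous
            self.total_inter_arrival[server] = (
                self.total_inter_arrival.get(server, 0.0) + gap
            )
        self._last_arrival[server] = arrival_time


monitor = ArrivalMonitor()
for server, t in [("a", 0.0), ("a", 1.5), ("b", 2.0), ("a", 4.0)]:
    monitor.record_arrival(server, t)
```

After these four arrivals, server `a` has a total inter-arrival time of 4.0 (gaps of 1.5 and 2.5), while server `b` has none recorded yet.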

12. A server system comprising:
- a data transfer mechanism;
  
  a plurality of servers coupled to the data transfer system; and
  
  a network dispatcher, wherein the network dispatcher monitors processing of each request received at the server system, calculates an estimated work load for each of the plurality of servers based on a service rate for each of the plurality of servers, and forwards the request to a server within the plurality of servers having a lowest estimated amount of work to process, wherein the plurality of servers have different service rates, wherein the different service rates for the plurality of servers are variable, and wherein the service processor calculates an estimated work load for a server within the plurality of servers as follows:
  
  $\frac{(N) \cdot (A)}{(I) \cdot (S)}$ wherein N is a number of requests completed by the server, A is an actual amount of work completed by the server, I is a total of observed inter-arrival times for the server, in which an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server, S is a total observed service rate for the server.

13. A distributed data processing system for handling requests, the distributed data processing system comprising:
- monitoring means for monitoring processing of requests received at a server system, wherein the server system includes a plurality of servers;
  
  estimating means for determining an estimated work load for each of the plurality of servers based on a service rate for each of the plurality of servers; and
  
  forwarding means for forwarding the request to a server within the plurality of servers having a smallest estimated work load, wherein the plurality of servers have different service rates, wherein the different service rates for the plurality of servers are fixed, and wherein the estimating means comprises:
  
  dividing means for dividing a total of service times for a server by a total of observed inter-arrival times for the server, wherein an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server.
- Dependent claims (14, 15, 16, 17, 18, 19):
  - 14. The distributed data processing system of claim 13, wherein the monitoring means for monitoring processing of requests at the server system includes monitoring arrival time of a request and associating the arrival time with a server within the plurality of servers assigned to process the request.
  - 15. The distributed data processing system of claim 13, wherein the server system includes a network dispatcher and wherein the monitoring means, estimating means, and forwarding means are located in the network dispatcher.
  - 16. The distributed data processing system of claim 13, wherein the plurality of servers are a plurality of web servers.
  - 17. The distributed data processing system of claim 13, wherein the distributed data processing system is an Internet.
  - 18. The distributed data processing system of claim 13, wherein the distributed data processing system is an intranet.
  - 19. The distributed data processing system of claim 13, further comprising:

20. A distributed data processing system for handling requests, the distributed data processing system comprising:
- monitoring means for monitoring processing of requests received at a server system, wherein the server system includes a plurality of servers;
  
  estimating means for determining an estimated work load for each of the plurality of servers based on a service rate for each of the plurality of servers; and
  
  forwarding means for forwarding the request to a server within the plurality of servers having a smallest estimated work load, wherein the plurality of servers have different service rates, wherein the different service rates are variable, and wherein the estimating means includes using an equation:
  
  $\frac{(N) \cdot (A)}{(I) \cdot (S)}$ wherein N is a number of requests completed by the server, A is an actual amount of work completed by the server, I is a total of observed inter-arrival times for the server, in which an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server, S is a total observed service rate for the server.

21. A computer program product in a computer readable medium for handling requests, the computer program product comprising:
- first instructions for monitoring processing of requests received at a server system, wherein the server system includes a plurality of servers;
  
  second instructions for determining an estimated work load at each of the plurality of servers based on a service rate for each of the plurality of servers, wherein the plurality of servers have different service rates, wherein the different service rates for the plurality of servers are fixed; and
  
  third instructions for forwarding the request to a server within the plurality of servers having a smallest estimated work load, wherein the instructions for determining an estimated work load comprises instructions for dividing a total of service times for a server by a total of observed inter-arrival times for the server, wherein an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server.
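
One way to read the quotient recited in claims 1, 9, 13, and 21 is offered here as an interpretation, not as something the claims state. Let $s_k$ be the service time and $t_k$ the inter-arrival time of the $k$-th request sent to a server (both symbols are assumptions introduced for this note), and assume the same number $n$ of observations in each total. Then the estimated work load $W$ is

$$W = \frac{\sum_{k=1}^{n} s_k}{\sum_{k=1}^{n} t_k} = \frac{\bar{s}}{\bar{t}}$$

that is, the mean service time divided by the mean inter-arrival time. Under this reading, a value near 0 suggests the server is mostly idle, and a value of 1 or more suggests requests arrive at least as fast as the server completes them. For example, totals of 6 seconds of service time and 10 seconds of inter-arrival time give $W = 6/10 = 0.6$.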

22. A computer program product in a computer readable medium for handling requests, the computer program product comprising:
- first instructions for monitoring processing of requests received at a server system, wherein the server system includes a plurality of servers;
  
  second instructions for determining an estimated work load for each of the plurality of servers based on a service rate for each of the plurality of servers, wherein the plurality of servers have different service rates, wherein the different service rates are variable; and
  
  third instructions for forwarding the request to a server within the plurality of servers having a smallest estimated work load, wherein the instructions for determining an estimated work load includes instructions for using an equation:
  
  $\frac{(N) \cdot (A)}{(I) \cdot (S)}$ wherein N is a number of requests completed by the server, A is an actual amount of work completed by the server, I is a total of observed inter-arrival times for the server, in which an inter-arrival time is a time period between an arrival of a request sent to the server and an arrival time of a previous request sent to the server, S is a total observed service rate for the server.

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Bournas, Redha M.
Primary Examiner(s): Etienne, Ario
Assistant Examiner(s): JACOBS, LASHONDA T
Application Number: US09/440,227
Time in Patent Office: 1,667 Days
Field of Search: 709/224, 709/105, 709/200, 709/201, 709/203, 709/217, 709/218, 709/219, 709/223
US Class Current: 718/105
CPC Class Codes: G06F 2209/5019 (Workload prediction); G06F 9/505 (considering the load)
