System for balance distribution of requests across multiple servers using dynamic metrics

US 20060036743A1
Filed: 08/12/2005
Published: 02/16/2006
Est. Priority Date: 01/18/2000
Status: Active Grant

Abstract

A system for distributing incoming client requests across multiple servers in a networked client-server computer environment processes, as a set, all requests that occur within a given time interval. It collects information on the attributes of the requests and the resource capability of the servers, and dynamically allocates the requests in the set to the appropriate servers upon completion of the time interval. Preferably, a request table collects at least two requests incoming within a predetermined time interval; a request examiner routine analyzes each collected request with respect to at least one attribute; a system status monitor collects resource capability information of each server in a resource table; and an optimization and allocation process distributes the collected requests in the request table across the multiple servers upon completion of said time interval, based on an optimization of potential pairings of the requests in the request table with servers in the resource table.
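
The interval-based collection described in the abstract can be illustrated with a minimal Python sketch. The function name `batch_by_interval`, the `(timestamp, request)` tuple layout, and the one-second interval are illustrative assumptions, not details taken from the patent.

```python
from collections import defaultdict


def batch_by_interval(arrivals, interval):
    """Group (timestamp, request) pairs into consecutive time windows.

    Hypothetical helper: every request whose timestamp falls in the same
    window of `interval` seconds ends up in the same batch, so the whole
    batch can be allocated together once the window closes.
    """
    batches = defaultdict(list)
    for timestamp, request in arrivals:
        window_index = int(timestamp // interval)
        batches[window_index].append(request)
    # Return the batches in chronological order of their windows.
    return [batches[index] for index in sorted(batches)]


# Made-up arrival times in seconds.
arrivals = [(0.1, "r1"), (0.4, "r2"), (1.2, "r3"), (2.7, "r4"), (2.9, "r5")]
batches = batch_by_interval(arrivals, 1.0)
print(batches)
```

Each inner list would be handed to the optimization step as one set, rather than each request being routed the moment it arrives.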

50 Claims

1. A method for allocating a server selected from a plurality of servers to client requests originating over a predefined time interval at a plurality of user accounts, the method comprising:
- collecting a plurality of client requests that arrive within the predefined time interval wherein at least two of said client requests are serviceable by the server and wherein a first of said at least two of said client requests originates at a first user account and a second of said at least two of said client requests originates at a second user account;
  
  determining a first value of a cost metric for a first set of client request-server pairings wherein said first set includes at least one client request-server pair with said server being paired with either said first or said second of said at least two client requests;
  
  determining a second value of a cost metric for a second set of client request-server pairings wherein said second set includes at least one client request-server pair with said server being paired with both said first and said second of said at least two client requests; and
  
  at the end of said time interval distributing said client requests according to one of said first and said second set of client request-server pairings based on said first and second values of said cost metric.
- - 2. The method of claim 1 wherein the step of determining the value of a cost metric for a set of client request-server pairings comprises the steps of:
    - at the commencement of said predefined time interval, initializing a cumulative value to zero;
      
      for each client request-server pair in the set of client request-server pairings, a) creating a requirement vector corresponding to said client request;
      
      b) creating a capability vector corresponding to said server;
      
      c) calculating an inner product of said requirement vector and said capability vector and adding said inner product to the cumulative value and repeating steps a), b) and c) for all client request-server pairs in the set of client request-server pairings whereupon said cumulative value represents the value of the cost metric.
  - 3. The method of claim 1 wherein the step of distributing said client requests further comprises distributing said client requests according to said first set of client requests-server pairings if said first value of the cost metric is lower than the second value of the cost metric otherwise distributing said client requests according to said second set of client requests-server pairings.
  - 4. The method of claim 1 wherein the step of determining a value of a cost metric for a set of client request-server pairings further comprises the steps of:
    - initializing a set of client request-server pairings at a commencement of the predefined time interval;
      
      a) selecting a client request-server pair to satisfy a selection criteria;
      
      b) creating a requirement vector corresponding to said client request;
      
      c) creating a capability vector corresponding to said server;
      
      d) calculating a distance between the requirement vector and the capability vector and adding said distance to the cumulative value when said distance exceeds a match threshold value and repeating steps a), b), c) and d); and
      
      e) adding said client request-server pair to said set of client request-server pairings when said distance exceeds a match threshold, said cumulative value is less than a cost threshold and said client request has arrived within said predefined time interval.
  - 5. The method of claim 4 wherein said selection criteria comprises matching a client request with a server to generate at least one client request-server pairing belonging to one of said first set and said second set.
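
As a rough illustration of claims 1 to 3, the sketch below accumulates inner products of requirement and capability vectors into a cost value for each candidate set of pairings and keeps the cheaper set. All identifiers, the dictionary layout, and the vector values are hypothetical; the claims do not say what the vector elements represent.

```python
def inner_product(requirement, capability):
    """Inner product of a requirement vector and a capability vector."""
    return sum(r * c for r, c in zip(requirement, capability))


def cost_metric(pairings, requirements, capabilities):
    """Cumulative cost of a set of (request, server) pairings (claim 2)."""
    cumulative = 0.0  # initialized to zero at the start of the interval
    for request_id, server_id in pairings:
        cumulative += inner_product(requirements[request_id], capabilities[server_id])
    return cumulative


def choose_pairings(first_set, second_set, requirements, capabilities):
    """Keep the first set if its cost is lower, otherwise the second (claim 3)."""
    first_value = cost_metric(first_set, requirements, capabilities)
    second_value = cost_metric(second_set, requirements, capabilities)
    return first_set if first_value < second_value else second_set


# Made-up vectors; req_a and req_b stand for requests from two user accounts.
requirements = {"req_a": [2, 1], "req_b": [1, 3]}
capabilities = {"s1": [1, 1], "s2": [2, 0.5]}

first_set = [("req_a", "s1"), ("req_b", "s2")]   # s1 serves only one of the two
second_set = [("req_a", "s1"), ("req_b", "s1")]  # s1 serves both requests

chosen = choose_pairings(first_set, second_set, requirements, capabilities)
print(chosen)
```

With these numbers the first set costs 6.5 and the second 7.0, so the first set is the one distributed at the end of the interval.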

6. A method for distributing client requests across a plurality of servers in a client-server networked system, the method comprising:
- selecting a time window;
  
  collecting client requests arriving within said time window wherein said client requests include at least a first plurality of said client requests that originate at a first user account and at least a second plurality of client requests that originate at a second user account;
  
  determining a first cost metric corresponding to a first set of client request-server pairing wherein at least one server is paired with at least one of said first plurality of said client requests and at least one of said second plurality of client requests;
  
  determining a second cost metric corresponding to a second set of client request-server pairings wherein said second set is characterized by first and second disjoint subsets with all pairings that include client requests originating at the first user account belonging to the first subset and all pairings that include client requests originating at the second user account belonging to the second subset; and
  
  selecting one of said first set of client request-server pairs and said second set of client request-server pairs based on a differential between said first cost metric and said second cost metric.
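
One possible reading of claim 6 is sketched below: the first candidate set lets a server handle requests from both accounts, while the second keeps each account on its own servers, and the choice depends on the difference between their costs. Interpreting the "disjoint subsets" as "no server shared between accounts", the `margin` parameter, and all names are assumptions made for illustration.

```python
from itertools import combinations


def servers_by_account(pairings, account_of):
    """Map each account to the set of servers its requests are paired with."""
    used = {}
    for request_id, server_id in pairings:
        used.setdefault(account_of[request_id], set()).add(server_id)
    return used


def is_account_segregated(pairings, account_of):
    """True when no server is paired with requests from two different accounts."""
    used = servers_by_account(pairings, account_of)
    return all(used[a].isdisjoint(used[b]) for a, b in combinations(used, 2))


def select_by_differential(shared_cost, segregated_cost, margin=0.0):
    """Pick the shared set only if it is cheaper by more than `margin` (assumed rule)."""
    differential = segregated_cost - shared_cost
    return "shared" if differential > margin else "segregated"


account_of = {"a1": "acct1", "a2": "acct1", "b1": "acct2"}
shared = [("a1", "s1"), ("b1", "s1"), ("a2", "s2")]
segregated = [("a1", "s1"), ("a2", "s1"), ("b1", "s2")]

print(is_account_segregated(shared, account_of))
print(is_account_segregated(segregated, account_of))
print(select_by_differential(5.0, 7.0))
```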

7. A system for distributing load within a client-server network, comprising:
- a plurality of interconnected servers wherein each server is associated with a capability vector having at least one element associated with a resource expected to be requested by at least one of a plurality of incoming client requests;
  
  a dynamic capability vector determining module adapted to generate a dynamic capability vector for each server of said plurality of interconnected servers, said dynamic capability vector representing an update to said capability vector such that the at least one element of the capability vector corresponds to an unused portion of the resource associated with the at least one element and measured at the commencement of one of a sequence of predefined time intervals;
  
  a requirement vector determining module configured to generate a requirement vector for each incoming client request during the one of a sequence of predefined time intervals; and
  
  a load balancing module for selectively pairing said plurality of interconnected servers with one or more of said plurality of incoming client requests so as to minimize a cost metric computed during the one predefined time interval in said sequence of predefined time intervals wherein said cost metric is a function of vector distances between said dynamic capability vectors and said requirement vectors associated with said servers and said client request pairs in said server-client request pairing.
- - 8. The system of claim 7 wherein said load balancing module further comprises a plurality of instances of load balancing modules resident on an appropriate plurality of servers disposed at intermediate nodes forming a connectivity hierarchy of layers throughout said client-server network such that said cost metric is computed and minimized for at least one layer of server nodes corresponding to the same connectivity hierarchy whereby each incoming client request is satisfied by a plurality of servers and transmission paths.
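
The modules of claim 7 can be sketched as follows, assuming the cost metric is the sum of Euclidean distances between requirement vectors and dynamic capability vectors, and assuming an exhaustive search over pairings. Both choices, along with every name and number, are illustrative; the claim only says the cost is a function of vector distances.

```python
import itertools
import math


def dynamic_capability(total, used):
    """Unused portion of each resource, measured at the start of an interval."""
    return [t - u for t, u in zip(total, used)]


def best_pairing(requirements, capabilities):
    """Try every request-to-server assignment and keep the cheapest one.

    Simplification: capability vectors are not reduced as requests are
    assigned, and the search is exponential in the number of requests.
    """
    request_ids = sorted(requirements)
    server_ids = sorted(capabilities)
    best, best_cost = None, math.inf
    for choice in itertools.product(server_ids, repeat=len(request_ids)):
        cost = sum(
            math.dist(requirements[r], capabilities[s])
            for r, s in zip(request_ids, choice)
        )
        if cost < best_cost:
            best, best_cost = dict(zip(request_ids, choice)), cost
    return best, best_cost


# Made-up two-element vectors, e.g. [cpu cores, memory in GB].
capabilities = {
    "s1": dynamic_capability([8, 16], [6, 4]),   # -> [2, 12]
    "s2": dynamic_capability([8, 16], [1, 13]),  # -> [7, 3]
}
requirements = {"r1": [2, 10], "r2": [6, 3]}

pairing, cost = best_pairing(requirements, capabilities)
print(pairing, cost)
```

A production balancer would need a scalable assignment method; the brute-force loop is only meant to make the minimization explicit.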

9. A method for creating a fast lookup table to determine server nodes within a cluster of nodes for servicing a plurality of client requests incoming within a predetermined time interval, the method comprising:
- a) creating an adaptive request table populated with a set of patterns wherein each pattern is associated with a generic request type that is most likely to be received by said server nodes;
  
  b) upon receiving a client request within said predetermined time interval, finding a match-pattern in said set of patterns that best matches the client request;
  
  c) using the match-pattern to generate a requirements vector for said client request;
  
  d) associating each server with a capability vector that is refreshed with resources available on said each server during said predefined time interval;
  
  e) computing a score metricizing a vector distance between said requirements vector and said capability vector;
  
  f) looking up a server node in the cluster of nodes to service the client request based upon the score, and g) distributing the client request to at least one server node based upon said score.
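
Steps a) through g) of claim 9 might look like the following sketch. The pattern table contents, the use of the longest matching URL prefix as the "best match", and the Euclidean distance as the score are all assumptions chosen for illustration.

```python
import math

# Hypothetical request table: pattern -> requirement vector [cpu, bandwidth].
REQUEST_PATTERNS = {
    "/static/": [0.1, 0.2],
    "/search": [0.7, 0.1],
    "/video/": [0.3, 0.9],
}


def match_pattern(path):
    """Return the longest pattern that prefixes `path`, or None (assumed rule)."""
    candidates = [pattern for pattern in REQUEST_PATTERNS if path.startswith(pattern)]
    return max(candidates, key=len) if candidates else None


def pick_server(path, capabilities):
    """Score each server by vector distance and return the closest one."""
    pattern = match_pattern(path)
    if pattern is None:
        return None
    requirement = REQUEST_PATTERNS[pattern]
    return min(capabilities, key=lambda server: math.dist(requirement, capabilities[server]))


# Made-up capability vectors refreshed for the current interval.
capabilities = {"s1": [0.8, 0.2], "s2": [0.2, 0.9]}

print(pick_server("/video/clip.mp4", capabilities))
print(pick_server("/search?q=x", capabilities))
```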

10. A method for allocating hosting-service resources to clients in at least one shared server, said method comprising:
- discovering utilization patterns of said clients;
  
  monitoring said clients to discover said utilization patterns;
  
  providing bounds specifying minimum and maximum hosting-service resources for each of said clients;
  
  modeling dimensions for client user measures and said utilization patterns; and
  
  allocating said resources to said clients dependent on said utilization patterns.
- - 11. The method according to claim 10, further including packing said clients using stochastic vectors.
  - 12. The method according to claim 11, wherein said packing utilizes at least one of a Roof Avoidance process, a Minimized Variance process, a Maximized Minima process, and a Largest Combination process.
  - 13. The method according to claim 10, wherein said hosting-service resources relate to at least one hosting service comprising one of collaborative hosting services, commerce hosting services, and e-business hosting services.
  - 14. The method according to claim 10, wherein said allocating affects a Quality of Service (QoS) guarantee.
  - 15. The method according to claim 10, wherein said utilization patterns are dependent upon access rates of one or more websites, said access rates have periodicity on multiple time scales.
  - 16. The method according to claim 15, wherein two or more clients are selected from a plurality of clients on the basis of complementarity, wherein said hosting-service resources are allocated to said selected two or more clients as a combination.
  - 17. The method according to claim 16, wherein said allocating comprises selecting said two or more clients to be allocated to a server, said two or more selected clients each having a peak load that is substantially disjoint in time in relation to a peak load of the remaining other selected clients.
  - 18. The method according to claim 16, wherein said allocated hosting-service resources include resources allocated exclusively to each of said selected two or more clients and shared resources allocated to said combination for use by said selected two or more clients.
  - 19. The method according to claim 17, wherein N clients are selected and allocated to a server, N being an integer greater than or equal to two, said server being partitioned into N virtual servers, each client being exclusively allocated a corresponding one of said N virtual servers, excess capacity of said server beyond the capacity required to provide said N virtual servers is shared by said N clients.
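
The idea of selecting clients "on the basis of complementarity" (claims 16 and 17) can be illustrated by scoring how much peak capacity is saved when two clients share a server. The scoring rule, the function names, and the load profiles below are hypothetical and are not the packing processes named in claim 12.

```python
from itertools import combinations


def peak_savings(load_a, load_b):
    """Capacity saved by co-locating two clients whose peaks do not coincide."""
    combined_peak = max(a + b for a, b in zip(load_a, load_b))
    return max(load_a) + max(load_b) - combined_peak


def most_complementary_pair(loads):
    """Return the pair of clients with the largest peak savings."""
    return max(
        combinations(sorted(loads), 2),
        key=lambda pair: peak_savings(loads[pair[0]], loads[pair[1]]),
    )


# Made-up load profiles over four time slots.
loads = {
    "day_shop": [1, 8, 7, 1],
    "night_game": [7, 1, 1, 8],
    "news": [2, 7, 8, 2],
}

print(most_complementary_pair(loads))
```

Here `day_shop` and `night_game` peak at different times, so hosting them together needs a combined peak of 9 rather than the 16 implied by their separate peaks.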

20. An apparatus for allocating hosting-service resources to clients in at least one shared server, said apparatus including:
- means for discovering utilization patterns of said clients;
  
  means for monitoring said clients to discover said utilization patterns;
  
  means for providing bounds specifying minimum and maximum hosting-service resources for each of said clients;
  
  means for modeling dimensions for client user measures and said utilization patterns; and
  
  means for allocating said resources to said clients dependent on said utilization patterns.
- - 21. The apparatus according to claim 20, wherein said hosting-service resources relate to at least one hosting service comprising one of collaborative hosting services, commerce hosting services, and e-business hosting services.
  - 22. The apparatus according to claim 20, wherein said allocating means affects a Quality of Service (QoS) guarantee.
  - 23. The apparatus according to claim 20, wherein said utilization patterns are dependent upon access rates of one or more websites, said access rates have periodicity on multiple time scales.
  - 24. The apparatus according to claim 23, wherein two or more clients are selected from a plurality of clients on the basis of complementarity, wherein said hosting-service resources are allocated to said selected two or more clients as a combination.
  - 25. The apparatus according to claim 24, wherein said allocating means includes means for selecting said two or more clients to be allocated to a server, said two or more selected clients each having a peak load that is substantially disjoint in time in relation to a peak load of the remaining other selected clients.
  - 26. The apparatus according to claim 25, wherein N clients are selected and allocated to a server, N being an integer greater than or equal to two, said server being partitioned into N virtual servers, each client being exclusively allocated a corresponding one of said N virtual servers, excess capacity of said server beyond the capacity required to provide said N virtual servers is shared by said N clients.
  - 27. The apparatus according to claim 24, wherein said allocated hosting-service resources include resources allocated exclusively to each of said selected two or more clients and shared resources allocated to said combination for use by said selected two or more clients.

28. A computer program product having a computer readable medium having a computer program recorded therein for allocating hosting-service resources to clients in at least one shared server, said computer program product including:
- computer program code means for discovering utilization patterns of said clients;
  
  computer program code means for monitoring said clients to discover said utilization patterns;
  
  computer program code means for providing bounds specifying minimum and maximum hosting-service resources for each of said clients;
  
  computer program code means for modeling dimensions for client user measures and said utilization patterns; and
  
  computer program code means for allocating said resources to said clients dependent on said utilization patterns.

29. A decision support system for allocating and planning resources in hosting computing services, said decision support system including:
- means for modeling utilization of resources of one or more servers by clients in response to at least one of utilization patterns of said clients and specified rules regarding quality of service;
  
  means for monitoring said clients to discover said utilization patterns;
  
  means for providing bounds specifying minimum and maximum hosting-service resources for each of said clients;
  
  means for modeling dimensions for client user measures and said utilization patterns; and
  
  means for determining a minimum number of servers for accommodating said clients to ensure a specified minimum quality of service.
- - 30. The decision support system according to claim 29, wherein said determining means utilizes stochastic vector packing.
  - 31. The decision support system according to claim 29, wherein said system facilitates optimal management of resources in said hosting computing services.
  - 32. The decision support system according to claim 29, wherein said hosting computing services include hosting computing resources, computing applications, computing-related services and network bandwidth.
  - 33. The decision support system according to claim 29, including means for generating for a service provider a set of suggestions for optimal resource planning and allocation.
  - 34. The decision support system according to claim 29, wherein said system provides an optimization service for use in a business model hosting optimization applications.

35. A decision support method for allocating and planning resources in hosting computing services, said method comprising:
- modeling utilization of resources of one or more servers by clients in response to at least one of utilization patterns of said clients and specified rules regarding quality of service;
  
  monitoring said clients to discover said utilization patterns;
  
  providing bounds specifying minimum and maximum hosting-service resources for each of said clients;
  
  modeling dimensions for client user measures and utilization patterns; and
  
  determining a minimum number of servers for accommodating said clients to ensure a specified minimum quality of service.
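
To make the "minimum number of servers" step concrete, the sketch below substitutes a plain first-fit decreasing heuristic on a single scalar demand per client. This is not the stochastic vector packing named in the dependent claims, and it yields a heuristic server count rather than a guaranteed minimum. The client names, demands, and capacity are made up.

```python
def first_fit_decreasing(demands, capacity):
    """Pack clients onto servers of equal capacity, largest demand first.

    Swapped-in textbook heuristic: each client goes to the first server
    with enough room; a new server is opened when none fits.
    """
    servers = []    # client names placed on each server
    remaining = []  # free capacity left on each server
    ordered = sorted(demands.items(), key=lambda item: (-item[1], item[0]))
    for client, demand in ordered:
        if demand > capacity:
            raise ValueError(f"{client} needs more than one server's capacity")
        for index, free in enumerate(remaining):
            if demand <= free:
                remaining[index] -= demand
                servers[index].append(client)
                break
        else:
            # No existing server had room, so open a new one.
            remaining.append(capacity - demand)
            servers.append([client])
    return servers


demands = {"a": 6, "b": 5, "c": 4, "d": 3, "e": 2}
placement = first_fit_decreasing(demands, capacity=10)
print(len(placement), placement)
```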

36. A computer program product having a computer readable medium having a computer program recorded therein for providing decision support to allocate and plan resources in hosting computing services, said computer program product including:
- computer program code means for modeling utilization of resources of one or more servers by clients in response to at least one of utilization patterns of said clients and specified rules regarding quality of service;
  
  computer program code means for monitoring said clients to discover said utilization patterns;
  
  computer program code means for providing bounds specifying minimum and maximum hosting-service resources for each of said clients;
  
  computer program code means for modeling dimensions for client user measures and said utilization patterns; and
  
  computer program code means for determining a minimum number of servers for accommodating said clients to ensure a specified minimum quality of service.
- - 37. The computer program product according to claim 36, wherein said computer program code means for determining utilizes stochastic vector packing.
  - 38. The computer program product according to claim 36, wherein said computer program product facilitates optimal management of resources in said hosting computing services.
  - 39. The computer program product according to claim 36, wherein said hosting computing services include hosting computing resources, computing applications, computing-related services, and network bandwidth.

40. A method of improving load balancing operations in a computing network using cost metrics, comprising steps of:
- obtaining cost metrics representing a cost of generating document content;
  
  receiving a request for particular document content;
  
  determining a particular one of a plurality of servers which most recently served the requested document content; and
  
  routing the request to a selected one of the plurality of servers, further comprising the steps of:
  
  determining which other one of the plurality of servers is (1) capable of serving the requested document content and (2) most lightly loaded;
  
  using the obtained cost metrics to compare a first cost of routing the request to the determined one to a second cost of routing the request to the particular one; and
  
  selecting the determined one if the first cost is less than the second cost and selecting the particular one otherwise.
- - 41. The method according to claim 40, wherein the first cost and the second cost include a current load on the determined one and the particular one, respectively.
  - 42. The method according to claim 40, wherein:
    - the obtaining step further comprises the step of receiving meta-data which conveys the cost metrics for the document content; and
      
      the using step further comprises the step of using the cost metrics from the received meta-data.
  - 43. The method according to claim 42, wherein the received meta-data comprises a HyperText Transfer Protocol (“HTTP”) response header.
  - 44. The method according to claim 42, wherein the received meta-data comprises a plurality of HyperText Transfer Protocol (“HTTP”) response headers, each of the headers conveying an element of the cost metric for a particular document content.
  - 45. The method according to claim 42, wherein the syntax comprises a specially denoted comment.
  - 46. The method according to claim 40, wherein the markup language is XML (“Extensible Markup Language”).
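
The routing decision of claims 40 and 41 can be sketched as a comparison between the server that most recently served the content and the most lightly loaded alternative. The cost model used here (current load, plus a content generation cost only for the server that has not recently served the content) is an assumption, as are all names and numbers.

```python
def route(recent_server, loads, capable, generation_cost):
    """Choose between the recent server and the least loaded capable alternative."""
    others = [server for server in capable if server != recent_server]
    if not others:
        return recent_server
    # The "determined one": capable of serving the content and most lightly loaded.
    determined = min(others, key=lambda server: loads[server])
    # Assumed cost model: the determined server must regenerate the content,
    # while the recent server is assumed to pay only its current load.
    first_cost = loads[determined] + generation_cost
    second_cost = loads[recent_server]
    return determined if first_cost < second_cost else recent_server


loads = {"s1": 9, "s2": 2, "s3": 5}
capable = ["s1", "s2", "s3"]

print(route("s1", loads, capable, generation_cost=3))
print(route("s1", loads, capable, generation_cost=8))
```

When content is cheap to generate, the lightly loaded server wins; when it is expensive, the request stays with the server that served it last.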

47. A system for improving load balancing operations in a computing network using cost metrics, comprising:
- means for obtaining cost metrics representing a cost of generating document content;
  
  means for receiving a request for particular document content;
  
  means for responding to the request using cached content, if available; and
  
  means for routing the request to a selected one of a plurality of servers, when cached content is not available, further comprising:
  
  means for determining a particular one of the plurality of servers that most recently served the requested document content;
  
  means for determining which other one of the plurality of servers is (1) capable of serving the requested document content and (2) most lightly loaded;
  
  means for using the obtained cost metrics to compare a first cost of routing the request to the determined one to a second cost of routing the request to the particular one; and
  
  means for selecting the determined one if the first cost is less than the second cost and selecting the particular one otherwise.

48. A computer program product for improving load balancing operations in a computing network using cost metrics, the computer program product embodied on one or more computer-readable media and comprising:
- computer-readable program code means for obtaining cost metrics representing a cost of generating document content;
  
  computer-readable program code means for receiving a request for particular document content;
  
  computer-readable program code means for responding to the request using cached content, if available; and
  
  computer-readable program code means for routing the request to a selected one of a plurality of servers, when cached content is not available, further comprising:
  
  computer-readable program code means for determining a particular one of the plurality of servers that most recently served the requested document content;
  
  computer-readable program code means for determining which other one of the plurality of servers is (1) capable of serving the requested document content and (2) most lightly loaded;
  
  computer-readable program code means for using the obtained cost metrics to compare a first cost of routing the request to the determined one to a second cost of routing the request to the particular one; and
  
  computer-readable program code means for selecting the determined one if the first cost is less than the second cost and selecting the particular one otherwise.

49. A method of using cost metrics when load balancing incoming content requests in a networking environment, comprising steps of:
- gathering cost metric information representing a cost of generating document content; and
  
  creating meta-data to convey the cost metric information to a load balancer;
  
  sending the created meta-data to the load balancer;
  
  receiving the sent cost metric information at the load balancer;
  
  upon receiving a request for the document content at the load balancer, using the received cost metric information to route the request to a server selected from a plurality of servers, further comprising the steps of:
  
  using the received cost metric information to determine a first cost of serving the requested document content from a particular one of the plurality of servers that most recently served the requested document content;
  
  using the received cost metric information to determine a second cost of serving the requested document content from a different one of the plurality of servers, wherein the received cost metric information indicates that the different one of the plurality of servers is capable of serving the requested document content at least cost; and
  
  selecting the particular one if the first cost is less than the second cost and selecting the different one otherwise.
- - 50. The method according to claim 49, wherein the gathered cost metric information comprises at least one of:
    - (1) processing time at one of the plurality of servers which is an origin server;
      
      (2) network costs from the origin server to one or more ones of the plurality of servers which are backend enterprise servers;
      
      (3) processing time at the backend enterprise servers; and
      
      (4) a cost of delivering the generated document content to a proxy or cache.
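
Claims 49 and 50 describe conveying cost components to a load balancer as meta-data. Assuming the meta-data arrives as HTTP response headers, one header per component, a load balancer could total them as sketched below. The header names are invented for this example and are not defined by the patent or by any HTTP specification.

```python
# Hypothetical header names, one per cost component listed in claim 50.
COST_HEADERS = (
    "X-Cost-Origin-Processing",
    "X-Cost-Backend-Network",
    "X-Cost-Backend-Processing",
    "X-Cost-Delivery",
)


def total_cost(headers):
    """Sum the cost components present in a header mapping (case-insensitive)."""
    normalized = {name.lower(): value for name, value in headers.items()}
    return sum(
        float(normalized[name.lower()])
        for name in COST_HEADERS
        if name.lower() in normalized
    )


def cheaper_server(recent, different, headers_by_server):
    """Select the recent server if its cost is lower, otherwise the different one."""
    first_cost = total_cost(headers_by_server[recent])
    second_cost = total_cost(headers_by_server[different])
    return recent if first_cost < second_cost else different


# Made-up meta-data previously received from two servers.
headers_by_server = {
    "s1": {"X-Cost-Origin-Processing": "12", "x-cost-delivery": "3"},
    "s2": {"X-Cost-Origin-Processing": "20"},
}

print(cheaper_server("s1", "s2", headers_by_server))
```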

Current Assignee: RPX Corporation
Original Assignee: Galactic Computing Corp.
Inventors: Engel, Stephen J.; Deng, Yuefan; O'Brien, Thomas; Giustozzi, Joseph

Granted Patent: US 8,302,100 B2
US Class Current: 709/227
CPC Class Codes

G06F 11/008   Reliability or availability...

G06F 11/3442   for planning or managing th...

G06F 2209/501   Performance criteria

G06F 2209/503   Resource availability

G06F 9/5044   considering hardware capabi...

G06F 9/505   considering the load

G06F 9/5083   Techniques for rebalancing ...

H04L 67/1001   for accessing one among a p...

H04L 67/10015   Access to distributed or re...

H04L 67/1008   based on parameters of serv...

H04L 67/101   based on network conditions

H04L 67/1021   based on client or server l...
