System for balance distribution of requests across multiple servers using dynamic metrics
First Claim
1. A method for allocating a server selected from a plurality of servers to client requests originating over a predefined time interval at a plurality of user accounts, the method comprising:
- collecting a plurality of client requests that arrive within the predefined time interval wherein at least two of said client requests are serviceable by the server and wherein a first of said at least two of said client requests originates at a first user account and a second of said at least two of said client requests originates at a second user account;
determining a first value of a cost metric for a first set of client request-server pairings wherein said first set includes at least one client request-server pair with said server being paired with either said first or said second of said at least two client requests;
determining a second value of a cost metric for a second set of client request-server pairings wherein said second set includes at least one client request-server pair with said server being paired with both said first and said second of said at least two client requests; and
at the end of said time interval distributing said client requests according to one of said first and said second set of client request-server pairings based on said first and second values of said cost metric.
11 Assignments
0 Petitions
Accused Products
Abstract
A system for distributing incoming client requests across multiple servers in a networked client-server computer environment processes all requests as a set that occur within a given time interval and collects information on the attributes of the requests and the resource capability of the servers to dynamically allocate requests in a set to the appropriate servers upon completion of the time interval. Preferably, a request table collects at least two requests incoming within a predetermined time interval, a request examiner routine analyzes each collected request with respect to at least one attribute, a system status monitor collects resource capability information of each server in a resource table and an optimization and allocation process distributes collected requests in the request table across the multiple servers upon completion of said time interval based on an optimization of potential pairings of the requests in the request table with servers in the resource table.
-
Citations
50 Claims
-
1. A method for allocating a server selected from a plurality of servers to client requests originating over a predefined time interval at a plurality of user accounts, the method comprising:
-
collecting a plurality of client requests that arrive within the predefined time interval wherein at least two of said client requests are serviceable by the server and wherein a first of said at least two of said client requests originates at a first user account and a second of said at least two of said client requests originates at a second user account;
determining a first value of a cost metric for a first set of client request-server pairings wherein said first set includes at least one client request-server pair with said server being paired with either said first or said second of said at least two client requests;
determining a second value of a cost metric for a second set of client request-server pairings wherein said second set includes at least one client request-server pair with said server being paired with both said first and said second of said at least two client requests; and
at the end of said time interval distributing said client requests according to one of said first and said second set of client request-server pairings based on said first and second values of said cost metric. - View Dependent Claims (2, 3, 4, 5)
-
-
6. A method for distributing client requests across a plurality of servers in a client-server networked system, the method comprising:
-
selecting a time window;
collecting client requests arriving within said time window wherein said client requests include at least a first plurality of said client requests that originate at a first user account and at least a second plurality of client requests that originate at a second user account;
determining a first cost metric corresponding to a first set of client request-server pairing wherein at least one server is paired with at least one of said first plurality of said client requests and at least one of said second plurality of client requests;
determining a second cost metric corresponding to a second set of client request-server pairings wherein said second set is characterized by first and second disjoint subsets with all pairings that include client requests originating at the first user account belonging to the first subset and all pairings that include client requests originating at the second user account belonging to the second subset; and
selecting one of said first set of client request-server pairs and said second set of client request-server pairs based on a differential between said first cost metric and said second cost metric.
-
-
7. A system for distributing load within a client-server network, comprising:
-
a plurality of interconnected servers wherein each server is associated with a capability vector having at least one element associated with a resource expected to be requested by at least one of a plurality of incoming client requests;
a dynamic capability vector determining module adapted to generate a dynamic capability vector for each server of said plurality of interconnected servers, said dynamic capability vector representing an update to said capability vector such that the at least one element of the capability vector corresponds to an unused portion of the resource associated with the at least one element and measured at the commencement of one of a sequence of predefined time intervals;
a requirement vector determining module configured to generate a requirement vector for each incoming client request during the one of a sequence of predefined time intervals; and
a load balancing module for selectively pairing said plurality of interconnected servers with one or more of said plurality of incoming client requests so as to minimize a cost metric computed during the one predefined time interval in said sequence of predefined time intervals wherein said cost metric is a function of vector distances between said dynamic capability vectors and said requirement vectors associated with said servers and said client request pairs in said server-client request pairing. - View Dependent Claims (8)
-
-
9. A method for creating a fast lookup table to determine server nodes within a cluster of nodes for servicing a plurality of client requests incoming within a predetermined time interval, the method comprising:
-
a) creating an adaptive request table populated with a set of patterns wherein each pattern is associated with a generic request type that is most likely to be received by said server nodes;
b) upon receiving a client request within said predetermined time interval, finding a match-pattern in said set of patterns that best matches the client request;
c) using the match-pattern to generate a requirements vector for said client request;
d) associating each server with a capability vector that is refreshed with resources available on said each server during said predefined time interval;
e) computing a score metricizing a vector distance between said requirements vector and said capability vector;
f) looking up a server node in the cluster of nodes to service the client request based upon the score, and g) distributing the client request to at least one server node based upon said score.
-
-
10. A method for allocating hosting-service resources to clients in at least one shared server, said method comprising:
-
discovering utilization patterns of said clients;
monitoring said clients to discover said utilization patterns;
providing bounds specifying minimum and maximum hosting-service resources for each of said clients;
modeling dimensions for client user measures and said utilization patterns; and
allocating said resources to said clients dependent on said utilization patterns. - View Dependent Claims (11, 12, 13, 14, 15, 16, 17, 18, 19)
-
-
20. An apparatus for allocating hosting-service resources to clients in at least one shared server, said apparatus including:
-
means for discovering utilization patterns of said clients;
means for monitoring said clients to discover said utilization patterns;
means for providing bounds specifying minimum and maximum hosting-service resources for each of said clients;
means for modeling dimensions for client user measures and said utilization patterns; and
means for allocating said resources to said clients dependent on said utilization patterns. - View Dependent Claims (21, 22, 23, 24, 25, 26, 27)
-
-
28. A computer program product having a computer readable medium having a computer program recorded therein for allocating hosting-service resources to clients in at least one shared server, said computer program product including:
-
computer program code means for discovering utilization patterns of said clients; and
computer program code means for monitoring said clients to discover said utilization patterns;
computer program code means for providing bounds specifying minimum and maximum hosting-service resources for each of said clients;
computer program code means for modeling dimensions for client user measures and said utilization patterns; and
computer program code means for allocating said resources to said clients dependent on said utilization patterns.
-
-
29. A decision support system for allocating and planning resources in hosting computing services, said decision support system including:
-
means for modeling utilization of resources of one or more servers by clients in response to at least one of utilization patterns of said clients and specified rules regarding quality of service;
means for monitoring said clients to discover said utilization patterns;
means for providing bounds specifying minimum and maximum hosting-service resources for each of said clients;
means for modeling dimensions for client user measures and said utilization patterns; and
means for determining a minimum number of servers for accommodating said clients to ensure a specified minimum quality of service. - View Dependent Claims (30, 31, 32, 33, 34)
-
-
35. A decision support method for allocating and planning resources in hosting computing services, said method comprising:
-
modeling utilization of resources of one or more servers by clients in response to at least one of utilization patterns of said clients and specified rules regarding quality of service;
monitoring said clients to discover said utilization patterns;
providing bounds specifying minimum and maximum hosting-service resources for each of said clients;
modeling dimensions for client user measures and utilization patterns; and
determining a minimum number of servers for accommodating said clients to ensure a specified minimum quality of service.
-
-
36. A computer program product having a computer readable medium having a computer program recorded therein for providing decision support to allocate and plan resources in hosting computing services, said computer program product including:
-
computer program code means for modeling utilization of resources of one or more servers by client in response to at least one of utilization patterns of said clients and specified rules regarding quality of service;
computer program code means for monitoring said clients to discover said utilization patterns;
computer program code means for providing bounds specifying minimum and maximum hosting-service resources for each of said clients;
computer program code means for modeling dimensions for client user measures and said utilization patterns; and
computer program code means for determining a minimum number of servers for accommodating said clients to ensure a specified minimum quality of service. - View Dependent Claims (37, 38, 39)
-
-
40. A method of improving load balancing operations in a computing network using cost metrics, comprising steps of:
-
obtaining cost metrics representing a cost of generating document content;
receiving a request for particular document content;
determining a particular one of a plurality servers which most recently served the requested document content; and
routing the request to a selected one of the plurality of servers, further comprising the steps of;
determining which other one of the plurality of servers is (1) capable of serving the requested document content and (2) most laghtly loaded;
using the obtained cost metrics to compare a first cost of routing the request to the determined one to a second cost of routine the request to the particular one; and
selecting the determined one if the first cost is less than the second cost and selecting the particular one otherwise. - View Dependent Claims (41, 42, 43, 44, 45, 46)
-
-
47. A system for improving load balancing operations in a computing network using cost metrics, comprising:
-
means for obtaining cost metrics representing a cost of generating document content;
means for receiving a request for particular document content;
means for responding to the request using cached content, if available; and
means for routing the request to a selected one of a plurality of server;
when cached content is not available, further comprising;
means for determinig a particular one of the plurality of servers that most recently served the requested document content;
means for determinig which other one of the plurality of servers is (1) capable of serving the requested document content and (2) most lightly loaded;
means for using the obtained cost metrics to compare a first cost of routing the request to the determined one to a second cost of routing the request to the particular one and means for selecting the determined one if the first cost is less than the second cost and selecting the particular one otherwise.
-
-
48. A computer program product for improving load balancing operations in a computing network using cost metrics, the computer program product embodied on one or more computer-readable media and comprising:
-
computer-readable program code means for obtaining cost metrics representing a cost of generating document content;
computer-readable program code means for receiving a request for particular document content;
computer-readable program code means for responding to the request using cached content, if available; and
computer-readable program code means for routing the request to a selected one of a plurality of servers, when cached content is not available, further comprising;
computer-readable program code means for determining a particular one of the plurality of servers that most recently served the requested document content;
computer-readable program code means for determining which other one of the plurality of servers is (1) capable of serving the requested document content and (2) most lightly loaded;
computer-readable program code means for using the obtained cost metrics to compare a first cost of routing the request to the determined one to a second cost of routing the request to the particular one; and
computer-readable program code means for selecting the determined one if the first cost is less than the second cost and selecting the particular one otherwise.
-
-
49. A method of using cost metrics when load balancing incoming content requests in a networking environment, comprising steps of:
-
gathering cost metric information representing a cost of generating document content; and
creating meta-data to convey the cost metric information to a load balancer;
sending the created meta-data to the load balancer;
receiving the sent cost metric information at the load balancer;
upon receiving a request for the document content at the load balancer, using the received cost metric information to route the request to a server selected from a plurality of servers, further compromising the steps of;
using the received cost metric information to determine a first cost of serving the requested document content from a particular one of the plurality of servers that most recently served the requested document content;
using the received cost metric information to determine a second cost of serving the requested document content from a different one of the plurality of servers, wherein the received cost metric information indicates that the different one of the plurality of servers which is capable of serving the request document content at least cost; and
selecting the particular one if the first cost is less than the second cost and selecting the different one otherwise. - View Dependent Claims (50)
-
Specification