Runtime load balancing of work across a clustered computing system using current service performance levels
First Claim
1. A computer-implemented method for determining how much work to route to computing nodes in a computing system that comprises a plurality of nodes that each hosts a server instance that provides a service that performs work, the method comprising:
based on a current moving average of a performance metric, from each of a plurality of server instances that provide a particular service, that is associated with the particular service, computing a performance grade for each of the plurality of server instances; and
computing, based on the respective performance grades, a percentage of work to route to each of the plurality of server instances.
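The two steps of the claim can be illustrated with a small sketch. It is not the patented implementation: the exponential form of the moving average, the smoothing factor `alpha`, the choice of response time as the metric, the inverse-response-time grade, and all function and instance names are assumptions made for illustration only.

```python
from typing import Dict, Iterable


def moving_average(samples: Iterable[float], alpha: float = 0.5) -> float:
    """Exponential moving average of metric samples (assumed form; needs >= 1 sample)."""
    it = iter(samples)
    avg = next(it)
    for sample in it:
        avg = alpha * sample + (1.0 - alpha) * avg
    return avg


def performance_grades(avg_response_ms: Dict[str, float]) -> Dict[str, float]:
    """Assumed policy: a lower average response time yields a higher grade."""
    return {name: 1.0 / avg for name, avg in avg_response_ms.items()}


def work_percentages(grades: Dict[str, float]) -> Dict[str, float]:
    """Normalize the grades so the shares of work sum to 100 percent."""
    total = sum(grades.values())
    return {name: 100.0 * grade / total for name, grade in grades.items()}


# Hypothetical response-time samples (milliseconds) for two server instances.
samples = {"inst1": [10.0, 10.0, 10.0], "inst2": [30.0, 30.0, 30.0]}
averages = {name: moving_average(values) for name, values in samples.items()}
grades = performance_grades(averages)
percentages = work_percentages(grades)
```

Under these assumptions the instance that responds three times faster receives roughly three times the share of work. A different policy, such as estimated bandwidth or spare capacity, would change only `performance_grades`.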
Abstract
Runtime load balancing of work across a clustered computing system involves servers calculating, and clients utilizing, current service performance grades of each instance in the system. A performance grade for an instance is based on performance metrics for that instance, where the computation used may vary by policy. Examples of possible policies include: (a) using estimated bandwidth as a performance grade, (b) using spare capacity as a performance grade, or (c) using response time as a performance grade. Clients distribute work requests across servers in the system as the requests arrive. Work requests can be distributed according to performance grades, and/or flags associated with the performance grades. Automatically and intelligently directing work requests to the best server instances, based on real-time service performance metrics, minimizes the need to manually relocate work within the clustered system.
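The client-side distribution described in the abstract might look like the following sketch. It assumes that each instance advertises a grade together with a flag, that the flag values are `"ok"` and `"blocked"`, and that clients pick an instance at random in proportion to its grade. None of these details is stated in the text above, and every name in the code is hypothetical.

```python
import random
from collections import Counter
from typing import Dict, Tuple


def pick_instance(advisories: Dict[str, Tuple[float, str]], rng: random.Random) -> str:
    """Choose an instance for one work request, weighted by performance grade.

    `advisories` maps an instance name to (grade, flag). Instances whose flag
    is not "ok" (an assumed flag value) are skipped entirely.
    """
    eligible = {
        name: grade
        for name, (grade, flag) in advisories.items()
        if flag == "ok" and grade > 0
    }
    if not eligible:
        raise RuntimeError("no eligible server instance")
    names = list(eligible)
    weights = [eligible[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical advisories: grades expressed as shares of work, plus a flag.
advisories = {
    "inst1": (75.0, "ok"),
    "inst2": (25.0, "ok"),
    "inst3": (50.0, "blocked"),
}
rng = random.Random(0)  # seeded only to make the sketch repeatable
routed = Counter(pick_instance(advisories, rng) for _ in range(1000))
```

Weighted random selection keeps each routing decision independent as requests arrive, so clients need no shared state. A deterministic scheme such as weighted round-robin would be an equally plausible reading.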
105 Citations
18 Claims
Claim 1 is reproduced above under First Claim. Claims 2 through 18 are listed as dependent on claim 1.
Specification