Resource allocation for multiple applications
Abstract
Method and apparatus for allocating resources to a plurality of applications. In various embodiments instrumentation data may be gathered for work requests processed by the applications. An associated workload level may be determined for work requests processed by the applications. For each application an application resource requirement may be determined as a function of the workload levels and a service level metric associated with the application. For each application an assigned subset of resources may be determined as a function of the application resource requirement, a minimization of communication delays between resources, and a bandwidth capacity requirement of the application. The resources may be automatically reconfigured consistent with the assigned subset of resources for each application.
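The assignment step described above (choosing a subset of resources that keeps communication delay low while meeting a bandwidth requirement) can be illustrated with a small exhaustive search. This is only a sketch under stated assumptions, not the method of the patent: the function name `assign_servers`, the pairwise `delay` table, and the reading of the bandwidth requirement as a per-server minimum are all hypothetical choices made for illustration.

```python
from itertools import combinations


def assign_servers(delay, bandwidth, count, min_bandwidth):
    """Pick `count` servers with the lowest total pairwise delay.

    delay:         dict mapping frozenset({a, b}) to the delay between a and b
    bandwidth:     dict mapping server name to its available bandwidth
    min_bandwidth: assumed per-server bandwidth needed by the application
    """
    # Keep only servers that satisfy the (assumed) bandwidth requirement.
    eligible = sorted(s for s, bw in bandwidth.items() if bw >= min_bandwidth)
    best_subset, best_cost = None, float("inf")
    # Exhaustive search: fine for a handful of servers, too slow for many.
    for subset in combinations(eligible, count):
        cost = sum(delay[frozenset(pair)] for pair in combinations(subset, 2))
        if cost < best_cost:
            best_subset, best_cost = subset, cost
    return best_subset, best_cost


# Hypothetical data: four servers, one of which lacks bandwidth.
bandwidth = {"a": 10.0, "b": 10.0, "c": 10.0, "d": 1.0}
delay = {
    frozenset({"a", "b"}): 1.0,
    frozenset({"a", "c"}): 5.0,
    frozenset({"b", "c"}): 2.0,
    frozenset({"a", "d"}): 0.1,
    frozenset({"b", "d"}): 0.1,
    frozenset({"c", "d"}): 0.1,
}
chosen, cost = assign_servers(delay, bandwidth, count=2, min_bandwidth=5.0)
print(chosen, cost)
```

Server `d` has the lowest delays but is excluded by the bandwidth filter, so the search is confined to `a`, `b`, and `c`. If fewer than `count` servers are eligible, the function returns `None` with an infinite cost.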
21 Claims
1. A processor-implemented method for allocating resources to a plurality of applications, wherein the resources include a plurality of servers and at least one of the applications uses a tiered arrangement of servers, comprising:
gathering instrumentation data for work requests processed by the applications;
determining an associated workload level for work requests processed by the applications;
determining for each application a first application resource requirement as a function of the workload levels and a service level metric associated with the application;
representing each server as a processor-sharing queue having at least one critical resource;
determining respective average response times of each of the tiers, each respective average response time being a function of a number of servers in the tier, an arrival rate of work requests, and an average utilization rate of the critical resource;
determining a total average response time as a sum of the respective average response times of each of the tiers;
determining a minimum total number of servers required in each tier for the total average response time of the application to satisfy the service level metric;
determining for each application an assigned subset of resources as a function of the first application resource requirement, wherein the function minimizes communication delays between resources, and satisfies a bandwidth capacity requirement of the application; and
automatically reconfiguring the resources consistent with the assigned subset of resources for each application.
Dependent claims: 2, 3, 4, 5

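The queueing steps of claim 1 (per-tier response time, the sum over tiers, and the minimum server count per tier) can be sketched as follows. The claim does not state a formula, so this sketch assumes the standard M/G/1 processor-sharing result, in which a server with mean service demand S and utilization u has mean response time S / (1 - u), and it assumes requests are spread evenly across the servers of a tier. The names `Tier`, `tier_response_time`, and `min_servers_per_tier`, and the greedy search, are hypothetical and chosen for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    arrival_rate: float    # work requests per second reaching the tier
    service_demand: float  # seconds of critical-resource time per request


def tier_response_time(tier, servers):
    """Average response time of one tier (assumed M/G/1 processor-sharing form)."""
    # With even load balancing, each server runs at this utilization.
    utilization = tier.arrival_rate * tier.service_demand / servers
    if utilization >= 1.0:
        return math.inf  # the tier is saturated
    return tier.service_demand / (1.0 - utilization)


def total_response_time(tiers, counts):
    """Total average response time: the sum of the per-tier averages."""
    return sum(tier_response_time(t, counts[t.name]) for t in tiers)


def min_servers_per_tier(tiers, target):
    """Greedy search for server counts whose total response time meets `target`."""
    # No number of servers can beat the sum of the raw service demands.
    if target <= sum(t.service_demand for t in tiers):
        raise ValueError("target is not reachable with any number of servers")
    # Start each tier at the smallest count that keeps utilization below 1.
    counts = {t.name: math.floor(t.arrival_rate * t.service_demand) + 1
              for t in tiers}
    while total_response_time(tiers, counts) > target:
        # Add one server to the tier where it reduces response time the most.
        best = max(tiers, key=lambda t: tier_response_time(t, counts[t.name])
                   - tier_response_time(t, counts[t.name] + 1))
        counts[best.name] += 1
    return counts


# Hypothetical two-tier application with a 0.2 second response-time target.
tiers = [Tier("web", 100.0, 0.02), Tier("db", 100.0, 0.05)]
counts = min_servers_per_tier(tiers, target=0.2)
print(counts)
```

For this hypothetical input the sketch settles on 3 web servers and 8 database servers, giving a total average response time of about 0.193 seconds. The greedy step relies on each tier's response time showing diminishing returns as servers are added, which holds for the assumed formula.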
6. A processor-implemented method for allocating resources to a plurality of applications, wherein the resources include a plurality of servers and at least one of the applications uses a tiered arrangement of servers, comprising:
storing work-request identifier data when a work request is initiated;
determining an identity of a completed work request from the work-request identifier data when a work request is complete and storing instrumentation data for identified work requests processed by the applications;
classifying the work requests by type of requester and type of work;
determining an associated requester-load level for each type of requester;
determining an associated workload level for each type of work for work requests processed by the applications;
adjusting a load balancing policy as a function of the workload levels and requester-load level, wherein work requests are assigned to the resources according to the load balancing policy;
generating for each application a first application resource requirement as a function of the workload levels and a service level metric associated with the application;
representing each server as a processor-sharing queue having at least one critical resource;
determining respective average response times of each of the tiers, each respective average response time being a function of a number of servers in the tier, an arrival rate of work requests, and an average utilization rate of the critical resource;
determining a total average response time as a sum of the respective average response times of each of the tiers;
determining a minimum total number of servers required in each tier for the total average response time of the application to satisfy the service level metric;
determining for each application an assigned subset of resources as a function of the first application resource requirement, wherein the function minimizes communication delays between resources, and satisfies a bandwidth capacity requirement of the application; and
automatically reconfiguring the resources consistent with the assigned subset of resources for each application.
Dependent claims: 7, 8, 9

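The instrumentation steps that open claim 6 (recording an identifier when a request starts, matching it at completion, and classifying by requester type and work type) can be sketched as below. This is an illustrative assumption about one possible data structure, not the implementation described in the patent: the class `RequestTracker`, its method names, and the choice to express load levels as completed requests per second over a caller-supplied window are all hypothetical.

```python
from collections import Counter


class RequestTracker:
    """Tracks work requests from initiation to completion (hypothetical sketch)."""

    def __init__(self):
        self._open = {}    # request id -> (requester type, work type, start time)
        self.records = []  # instrumentation data for completed requests

    def start(self, request_id, requester_type, work_type, started_at):
        # Store identifier data when the work request is initiated.
        self._open[request_id] = (requester_type, work_type, started_at)

    def complete(self, request_id, finished_at):
        # Identify the completed request from the stored identifier data.
        requester_type, work_type, started_at = self._open.pop(request_id)
        self.records.append({
            "requester_type": requester_type,
            "work_type": work_type,
            "response_time": finished_at - started_at,
        })

    def _levels(self, key, window):
        # Completed requests per second for each class, over `window` seconds.
        counts = Counter(record[key] for record in self.records)
        return {name: n / window for name, n in counts.items()}

    def workload_levels(self, window):
        return self._levels("work_type", window)

    def requester_load_levels(self, window):
        return self._levels("requester_type", window)


# Hypothetical requests observed over a 10 second window.
tracker = RequestTracker()
tracker.start(1, "member", "search", started_at=0.0)
tracker.start(2, "guest", "search", started_at=1.0)
tracker.start(3, "member", "checkout", started_at=2.0)
for request_id, finished_at in [(1, 0.5), (2, 1.5), (3, 4.0)]:
    tracker.complete(request_id, finished_at)
workload = tracker.workload_levels(window=10.0)
requester_load = tracker.requester_load_levels(window=10.0)
print(workload, requester_load)
```

The two dictionaries correspond to the workload level per type of work and the requester-load level per type of requester; a load balancing policy could then be adjusted from them, which this sketch leaves out.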
10. An apparatus for allocating resources to a plurality of applications, wherein the resources include a plurality of servers and at least one of the applications uses a tiered arrangement of servers, comprising:
means for gathering instrumentation data for work requests processed by the applications;
means for determining an associated workload level for work requests processed by the applications;
means for generating for each application a first application resource requirement as a function of the workload levels and a service level metric associated with the application;
means for representing each server as a processor-sharing queue having at least one critical resource;
means for determining respective average response times of each of the tiers, each respective average response time being a function of a number of servers in the tier, an arrival rate of work requests, and an average utilization rate of the critical resource;
means for determining a total average response time as a sum of the respective average response times of each of the tiers;
means for determining a minimum total number of servers required in each tier for the total average response time of the application to satisfy the service level metric;
means for determining for each application an assigned subset of resources as a function of the first application resource requirement, wherein the function minimizes communication delays between resources, and satisfies a bandwidth capacity requirement of the application; and
means for automatically reconfiguring the resources consistent with the assigned subset of resources for each application.
Dependent claims: 11, 12

13. An article of manufacture for allocating resources to a plurality of applications, wherein the resources include a plurality of servers and at least one of the applications uses a tiered arrangement of servers, comprising:
a computer-readable medium configured with instructions for causing a processor-based system to perform the steps of:
gathering instrumentation data for work requests processed by the applications;
determining an associated workload level for work requests processed by the applications;
generating for each application a first application resource requirement as a function of the workload levels and a service level metric associated with the application;
representing each server as a processor-sharing queue having at least one critical resource;
determining respective average response times of each of the tiers, each respective average response time being a function of a number of servers in the tier, an arrival rate of work requests, and an average utilization rate of the critical resource;
determining a total average response time as a sum of the respective average response times of each of the tiers;
determining a minimum total number of servers required in each tier for the total average response time of the application to satisfy the service level metric;
determining for each application an assigned subset of resources as a function of the first application resource requirement, wherein the function minimizes communication delays between resources, and satisfies a bandwidth capacity requirement of the application; and
automatically reconfiguring the resources consistent with the assigned subset of resources for each application.
Dependent claims: 14, 15, 16, 17

18. An article of manufacture for allocating resources to a plurality of applications, wherein the resources include a plurality of servers and at least one of the applications uses a tiered arrangement of servers, comprising:
a computer-readable medium configured with instructions for causing a processor-based system to perform the steps of:
storing work-request identifier data when a work request is initiated;
determining an identity of a completed work request from the work-request identifier data when a work request is complete and storing instrumentation data for identified work requests processed by the applications;
classifying the work requests by type of requester and type of work;
determining an associated requester-load level for each type of requester;
determining an associated workload level for each type of work for work requests processed by the applications;
adjusting a load balancing policy as a function of the workload levels and requester-load level, wherein work requests are assigned to the resources according to the load balancing policy;
generating for each application a first application resource requirement as a function of the workload levels and a service level metric associated with the application;
representing each server as a processor-sharing queue having at least one critical resource;
determining respective average response times of each of the tiers, each respective average response time being a function of a number of servers in the tier, an arrival rate of work requests, and an average utilization rate of the critical resource;
determining a total average response time as a sum of the respective average response times of each of the tiers;
determining a minimum total number of servers required in each tier for the total average response time of the application to satisfy the service level metric;
determining for each application an assigned subset of resources as a function of the first application resource requirement, wherein the function minimizes communication delays between resources, and satisfies a bandwidth capacity requirement of the application; and
automatically reconfiguring the resources consistent with the assigned subset of resources for each application.
Dependent claims: 19, 20, 21

Specification