Multi-tenant throttling approaches
Abstract
An opportunistic throttling approach can be used for customers of shared resources in a multi-tenant environment. Each customer can have a respective token bucket with a guaranteed fill rate. When a request is received for an amount of work to be performed by a resource, the corresponding number of tokens are obtained from, or charged against, a global token bucket. If the global bucket has enough tokens, and if the customer has not exceeded a maximum work rate or other such metric, the customer can charge less than the full number of tokens against the customer's token bucket, in order to reduce the number of tokens that need to be taken from the customer bucket. Such an approach can enable the customer to do more work and enable the customer's bucket to fill more quickly as fewer tokens are charged against the customer bucket for the same amount of work.
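The flow described in the abstract can be illustrated with a minimal sketch. The class and function names, the `discount` parameter, and the choice to fall back to a full customer charge when the global bucket is short are all assumptions made for illustration; the abstract does not specify them.

```python
import time
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """A basic token bucket that refills continuously up to a fixed capacity."""
    capacity: float
    fill_rate: float  # tokens added per second
    tokens: float = 0.0
    last_refill: float = field(default_factory=time.monotonic)

    def refill(self, now: float) -> None:
        # Add tokens for the elapsed time, never exceeding capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.fill_rate)
        self.last_refill = now


def handle_request(global_bucket: TokenBucket, customer_bucket: TokenBucket,
                   cost: float, discount: float, now: float) -> bool:
    """Return True if the request may proceed.

    `discount` (between 0 and 1) is a hypothetical knob: the share of the
    cost the customer is spared when the global bucket has spare tokens.
    """
    global_bucket.refill(now)
    customer_bucket.refill(now)

    # Assumption: "enough tokens" means the global bucket covers the full cost.
    has_spare = global_bucket.tokens >= cost
    customer_cost = cost * (1.0 - discount) if has_spare else cost

    if customer_bucket.tokens < customer_cost:
        return False  # nothing is deducted when the request is refused

    customer_bucket.tokens -= customer_cost
    if has_spare:
        global_bucket.tokens -= cost
    return True
```

With a full global bucket and `discount=0.5`, a request costing 10 tokens removes 10 tokens from the global bucket but only 5 from the customer bucket, which is the "more work for the same tokens" effect the abstract describes.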
27 Claims
1. A computer-implemented method of managing access to shared resources in a multi-tenant environment, comprising:
receiving, by one or more computer systems comprising at least a processor and a non-transitory storage medium storing code executable by the processor, a request associated with a customer, the request requesting an amount of usage of at least one shared resource;
determining, by the processor of the one or more computer systems, a threshold amount based at least in part on an available amount of the at least one shared resource;
comparing the amount of usage in the request to the threshold;
when the amount of usage in the request exceeds the threshold, delaying processing of the request;
when the amount of usage in the request is below the threshold, charging a number of tokens for the amount of usage against a global token bucket stored in the one or more computer systems, the tokens in the global token bucket associated with a unit of usage by a plurality of users in the multi-tenant environment;
determining, by the one or more computer systems, a customer fill rate based at least in part upon a current fill level of the global token bucket, the customer fill rate indicating a rate associated with the customer at which a customer token bucket is filled with tokens, the customer fill rate being set to a maximum rate value when the current fill level is at or above a global maximum threshold and being set to a minimum rate value when the current fill level is at or below a global minimum threshold;
determining a portion of the number of tokens for the request to be charged against the customer token bucket based at least in part upon the customer fill rate and a token utilization rate, the token utilization rate indicating a rate at which the customer utilizes the tokens for usage of the at least one shared resource, wherein the portion of the number of tokens is an improvement over the token utilization rate;
charging the portion of the number of tokens against the customer token bucket; and
allowing the usage of the at least one shared resource for the request when the number of tokens at least meeting the portion of the number of tokens for the request is available in the customer token bucket.
Dependent claims: 2, 3, 4, 5, 6
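The customer fill rate rule in claim 1 fixes only the two end points: the maximum rate at or above the global maximum threshold, and the minimum rate at or below the global minimum threshold. The sketch below shows one way such a rule might be written. The function name, the parameter names, and the linear interpolation between the two thresholds are assumptions; the claim does not say how the rate behaves between them.

```python
def customer_fill_rate(global_level: float, global_min: float, global_max: float,
                       min_rate: float, max_rate: float) -> float:
    """Map the global bucket's fill level to a customer fill rate."""
    if global_max <= global_min:
        raise ValueError("global_max must be greater than global_min")

    # End points as stated in claim 1.
    if global_level >= global_max:
        return max_rate
    if global_level <= global_min:
        return min_rate

    # Assumption: interpolate linearly between the two thresholds.
    fraction = (global_level - global_min) / (global_max - global_min)
    return min_rate + fraction * (max_rate - min_rate)
```

For example, with thresholds of 200 and 800 tokens and rates of 10 and 50 tokens per second, a global fill level of 500 sits halfway between the thresholds, so this sketch returns 30 tokens per second.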
7. A computer-implemented method, comprising:
receiving, by a computing system comprising a processor and a non-transitory storage medium storing code executable by the processor, a request associated with a customer, the request requesting usage of at least one resource;
determining, by the processor of the computing system, a threshold amount based at least in part on an available amount of the at least one resource;
comparing the usage in the request to the threshold;
when the usage in the request exceeds the threshold, delaying processing of the request;
when the usage in the request is below the threshold, determining, by the processor of the computing system, a number of tokens needed to process the request, the tokens associated with an amount of usage of a resource in a multi-tenant environment;
charging the number of tokens against a global token bucket associated with a plurality of users for the resource, the global token bucket being stored in storage;
determining, via a processor, a portion of the number of tokens to be charged against a customer token bucket associated with the customer for the request based at least in part upon a fill level of the global token bucket and a token utilization rate, the token utilization rate indicating a rate at which the customer utilizes the tokens for usage of the at least one resource; and
allowing the usage of the at least one shared resource for the request when at least the portion of the number of tokens are available in the customer token bucket.
Dependent claims: 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
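Claim 7 determines the customer's portion from two inputs: the fill level of the global token bucket and the customer's token utilization rate. A minimal sketch of one possible formula follows. The function name, the `max_work_rate` cap (borrowed from the abstract's "maximum work rate"), the `min_fraction` floor, and the linear shape are all assumptions, not values taken from the claims.

```python
def customer_portion(cost: float, global_level: float, global_capacity: float,
                     utilization_rate: float, max_work_rate: float,
                     min_fraction: float = 0.25) -> float:
    """Return how many of `cost` tokens to charge against the customer bucket."""
    if global_capacity <= 0:
        raise ValueError("global_capacity must be positive")

    # Assumption: a customer already at or above the work-rate cap pays in full.
    if utilization_rate >= max_work_rate:
        return cost

    # Fill level of the global bucket as a fraction, clamped to [0, 1].
    fill = min(1.0, max(0.0, global_level / global_capacity))

    # Assumption: the fuller the global bucket, the smaller the customer's
    # share, down to `min_fraction` of the cost when the bucket is full.
    fraction = 1.0 - fill * (1.0 - min_fraction)
    return cost * fraction
```

Under these assumptions, a request costing 100 tokens is charged 25 tokens against the customer bucket when the global bucket is full, 62.5 tokens when it is half full, and the full 100 tokens when it is empty or when the customer is at the work-rate cap.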
20. A computing system, comprising:
at least one processor; and
at least one memory device including instructions that, when executed by the at least one processor, cause the computing system to:
receive a request associated with a customer, the request requesting usage of at least one resource;
determine a threshold amount based at least in part on an available amount of the at least one resource;
compare the usage in the request to the threshold;
when usage in the request exceeds the threshold, delay processing of the request;
when the usage in the request is below a maximum threshold, determine a number of tokens needed to process a task on behalf of a customer, the tokens associated with an amount of usage of at least one resource;
charge the number of tokens against a global token bucket associated with the at least one resource based at least in part on a customer fill rate indicating a rate associated with the customer at which a customer token bucket is filled with tokens, the global token bucket stored in a storage of the computing system;
determine, by the at least one processor, a portion of the number of tokens to be charged against the customer token bucket associated with the customer based at least in part upon a fill level of the global token bucket and a token utilization rate, token utilization rate indicating a rate at which the customer utilizes the tokens for the amount of usage of the at least one resource; and
perform the task using the at least one shared resource when at least the portion of the number of tokens is available in the customer token bucket.
Dependent claims: 21, 22, 23
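Each independent claim opens with the same gate: derive a threshold from the available amount of the resource, then delay any request whose usage exceeds it. The sketch below illustrates that gate only. The `share` parameter, the queue used to hold delayed requests, and the treatment of a request exactly equal to the threshold (admitted here) are assumptions; the claims cover only the "exceeds" and "below" cases.

```python
from collections import deque


def admit_or_delay(requested: float, available: float, delayed: deque,
                   share: float = 0.5) -> bool:
    """Return True if the request may go on to token charging.

    `share` is a hypothetical policy value: the fraction of the currently
    available resource that a single request may ask for.
    """
    threshold = available * share
    if requested > threshold:
        # Hold the request for later processing instead of rejecting it.
        delayed.append(requested)
        return False
    return True
```

A caller would typically retry the entries in `delayed` once more of the resource becomes available, at which point the threshold is recomputed.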
24. A non-transitory computer-readable storage medium storing instructions that, when executed by at least one processor of a computing system, cause the computing system to:
receive a request associated with a customer, the request requesting usage of at least one resource;
determine a threshold amount based at least in part on an available amount of the at least one resource;
compare the usage in the request to the threshold;
when the usage in the request exceeds the threshold, delay processing of the request;
when the usage in the request is below the threshold, determine a number of tokens needed to process a task on behalf of the customer, the tokens associated with an amount of usage of at least one resource;
charge the number of tokens against a global token bucket associated with the at least one resource based at least in part on a customer fill rate indicating a rate associated with the customer at which a customer token bucket is filled with tokens, the global token bucket stored in a storage of the computing system;
determine, by the at least one processor, a portion of the number of tokens to be charged against the customer token bucket associated with the customer based at least in part upon a fill level of the global token bucket and a token utilization rate of the customer, the token utilization rate indicating a rate at which the customer utilizes the tokens for the amount of usage of the at least one resource; and
perform the task using the at least one shared resource when at least the portion of the number of tokens is available in the customer token bucket.
Dependent claims: 25, 26, 27
Specification