Multi-tenant throttling approaches

US 9,413,680 B1
Filed: 09/26/2012
Issued: 08/09/2016
Est. Priority Date: 09/26/2012
Status: Active Grant


Abstract

An opportunistic throttling approach can be used for customers of shared resources in a multi-tenant environment. Each customer can have a respective token bucket with a guaranteed fill rate. When a request is received for an amount of work to be performed by a resource, the corresponding number of tokens are obtained from, or charged against, a global token bucket. If the global bucket has enough tokens, and if the customer has not exceeded a maximum work rate or other such metric, the customer can charge less than the full number of tokens against the customer's token bucket, in order to reduce the number of tokens that need to be taken from the customer bucket. Such an approach can enable the customer to do more work and enable the customer's bucket to fill more quickly as fewer tokens are charged against the customer bucket for the same amount of work.
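
The two-bucket idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `TokenBucket` and `handle_request`, the fixed `discount` fraction, and the choice to drain the global bucket no lower than zero are all assumptions made for illustration, since the abstract does not say how much less than the full cost the customer is charged.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float   # maximum number of tokens the bucket can hold
    fill_rate: float  # tokens added per second
    tokens: float     # current fill level

    def refill(self, elapsed_seconds: float) -> None:
        """Add tokens for the elapsed time, never exceeding capacity."""
        self.tokens = min(self.capacity,
                          self.tokens + self.fill_rate * elapsed_seconds)

    def try_charge(self, amount: float) -> bool:
        """Remove `amount` tokens if available; report whether it succeeded."""
        if amount > self.tokens:
            return False
        self.tokens -= amount
        return True


def handle_request(cost: float,
                   global_bucket: TokenBucket,
                   customer_bucket: TokenBucket,
                   discount: float = 0.5) -> bool:
    """Return True if the work is allowed, False if the customer is throttled.

    `discount` is a hypothetical fraction of the cost charged to the customer
    when the global bucket can cover the full cost.
    """
    spare_capacity = global_bucket.tokens >= cost
    # Opportunistic case: charge the customer less than the full cost.
    customer_cost = cost * discount if spare_capacity else cost
    if not customer_bucket.try_charge(customer_cost):
        return False  # throttled; the global bucket is left untouched
    # Assumption: the global bucket is drained to zero at most.
    global_bucket.tokens = max(0.0, global_bucket.tokens - cost)
    return True
```

With a full global bucket, a request costing 8 tokens takes only 4 from the customer bucket in this sketch, so the same customer bucket covers twice as much work; with an empty global bucket the customer pays the full 8.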

27 Claims

1. A computer-implemented method of managing access to shared resources in a multi-tenant environment, comprising:
- receiving, by one or more computer systems comprising at least a processor and a non-transitory storage medium storing code executable by the processor, a request associated with a customer, the request requesting an amount of usage of at least one shared resource;
  
  determining, by the processor of the one or more computer systems, a threshold amount based at least in part on an available amount of the at least one shared resource;
  
  comparing the amount of usage in the request to the threshold;
  
  when the amount of usage in the request exceeds the threshold, delaying processing of the request;
  
  when the amount of usage in the request is below the threshold, charging a number of tokens for the amount of usage against a global token bucket stored in the one or more computer systems, the tokens in the global token bucket associated with a unit of usage by a plurality of users in the multi-tenant environment;
  
  determining, by the one or more computer systems, a customer fill rate based at least in part upon a current fill level of the global token bucket, the customer fill rate indicating a rate associated with the customer at which a customer token bucket is filled with tokens, the customer fill rate being set to a maximum rate value when the current fill level is at or above a global maximum threshold and being set to a minimum rate value when the current fill level is at or below a global minimum threshold;
  
  determining a portion of the number of tokens for the request to be charged against the customer token bucket based at least in part upon the customer fill rate and a token utilization rate, the token utilization rate indicating a rate at which the customer utilizes the tokens for usage of the at least one shared resource, wherein the portion of the number of tokens is an improvement over the token utilization rate;
  
  charging the portion of the number of tokens against the customer token bucket; and
  
  allowing the usage of the at least one shared resource for the request when the number of tokens at least meeting the portion of the number of tokens for the request is available in the customer token bucket.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The computer-implemented method of claim 1, further comprising, when the current fill level is below the global maximum threshold but above the global minimum threshold, setting the customer fill rate to a rate value between the maximum rate value and the minimum rate value.
  - 3. The computer-implemented method of claim 2, wherein the rate value between the maximum value and the minimum value is a function of the current fill level between the global minimum threshold and the global maximum threshold.
  - 4. The computer-implemented method of claim 1, further comprising:
    - causing the global token bucket to be refilled at a global fill rate and the customer bucket to be refilled at a token bucket refill rate.
  - 5. The computer-implemented method of claim 1, further comprising:
    - throttling the request while the number of tokens sufficient for the amount of usage are unable to be charged against the customer bucket.
  - 6. The computer-implemented method of claim 1, wherein the global minimum threshold is zero.
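
The customer fill rate described in claims 1 to 3 behaves like a clamped function of the global bucket's fill level. The Python sketch below is one possible reading: the function and parameter names are hypothetical, and the linear interpolation between the two thresholds is an assumption, because claim 3 only requires the rate to be some function of the current fill level.

```python
def customer_fill_rate(fill_level: float,
                       global_min: float,
                       global_max: float,
                       min_rate: float,
                       max_rate: float) -> float:
    """Map the global bucket's current fill level to a customer fill rate."""
    if global_max <= global_min:
        raise ValueError("global_max must be greater than global_min")
    if fill_level >= global_max:
        return max_rate   # claim 1: at or above the global maximum threshold
    if fill_level <= global_min:
        return min_rate   # claim 1: at or below the global minimum threshold
    # Claims 2 and 3: a value between the two rates. Linear interpolation
    # is an assumption of this sketch, not something the claims require.
    fraction = (fill_level - global_min) / (global_max - global_min)
    return min_rate + fraction * (max_rate - min_rate)
```

For example, with thresholds of 0 and 500 tokens and rates of 10 and 100 tokens per second, a fill level of 250 gives a rate of 55 under this linear assumption. Setting `global_min` to zero corresponds to claim 6.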

7. A computer-implemented method, comprising:
- receiving, by a computing system comprising a processor and a non-transitory storage medium storing code executable by the processor, a request associated with a customer, the request requesting usage of at least one resource;
  
  determining, by the processor of the computing system, a threshold amount based at least in part on an available amount of the at least one resource;
  
  comparing the usage in the request to the threshold;
  
  when the usage in the request exceeds the threshold, delaying processing of the request;
  
  when the usage in the request is below the threshold, determining, by the processor of the computing system, a number of tokens needed to process the request, the tokens associated with an amount of usage of a resource in a multi-tenant environment;
  
  charging the number of tokens against a global token bucket associated with a plurality of users for the resource, the global token bucket being stored in storage;
  
  determining, via a processor, a portion of the number of tokens to be charged against a customer token bucket associated with the customer for the request based at least in part upon a fill level of the global token bucket and a token utilization rate, the token utilization rate indicating a rate at which the customer utilizes the tokens for usage of the at least one resource; and
  
  allowing the usage of the at least one shared resource for the request when at least the portion of the number of tokens are available in the customer token bucket.
- Dependent claims (8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19):
  - 8. The computer-implemented method of claim 7, wherein tokens are able to be obtained up to a maximum rate for the customer when the fill level of the global bucket at least meets a global fill threshold.
  - 9. The computer-implemented method of claim 8, wherein tokens are able to be obtained up to an intermediate rate for the customer when the fill level of the global bucket is less than the global fill threshold but above a global minimum threshold.
  - 10. The computer-implemented method of claim 9, further comprising:
    - determining the intermediate rate using a function of a fill rate of the global bucket between the global fill threshold and the global minimum threshold.
  - 11. The computer-implemented method of claim 10, wherein the global fill threshold is 50% of a capacity of the global bucket, and wherein the function of the fill rate is a linear function.
  - 12. The computer-implemented method of claim 7, wherein the resource is capable of being accessed by the plurality of users, the plurality of users capable of having a respective customer bucket.
  - 13. The computer-implemented method of claim 12, further comprising:
    - assigning a capacity and a fill rate for the respective customer bucket and the global bucket.
  - 14. The computer-implemented method of claim 13, wherein the capacity of the global bucket is at least as great as the capacity of the customer buckets associated with the resource.
  - 15. The computer-implemented method of claim 7, wherein at least a portion of a capacity of the global bucket is dedicated to resource management traffic.
  - 16. The computer-implemented method of claim 15, wherein tokens are further able to be obtained from a background-opportunistic bucket.
  - 17. The computer-implemented method of claim 7, wherein the request is received to an application programming interface (API) for a type of resource for processing the request.
  - 18. The computer-implemented method of claim 7, wherein the global bucket is one of a plurality of global buckets, the plurality of global buckets associated with a respective set of resources in the multi-tenant environment.
  - 19. The computer-implemented method of claim 18, wherein a plurality of customer buckets are allocated to the customer, the plurality of customer buckets associated with a respective set of resources and one of the plurality of global buckets in the multi-tenant environment.
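
The sequence of steps in claim 7, combined with the 50% threshold and linear function of claim 11, can be sketched in Python as follows. Several details are assumptions, since the claims leave them open: the threshold is taken to equal the available amount of the resource, one token stands for one unit of usage, usage exactly equal to the threshold is processed, the "fill rate" in claim 10 is read as the fill level, the customer's rate falls back to a guaranteed rate at the global minimum threshold, the portion is scaled by `guaranteed_rate / rate`, and the global charge is undone when the customer is throttled. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Buckets:
    global_tokens: float
    global_capacity: float
    customer_tokens: float


def allowed_rate(level, capacity, guaranteed_rate, max_rate, global_min=0.0):
    """Claims 8 to 11: maximum rate at or above 50% of capacity, linear below."""
    fill_threshold = 0.5 * capacity
    if level >= fill_threshold:
        return max_rate
    if level <= global_min:
        return guaranteed_rate  # assumption: fall back to the guaranteed rate
    fraction = (level - global_min) / (fill_threshold - global_min)
    return guaranteed_rate + fraction * (max_rate - guaranteed_rate)


def customer_portion(cost, level, capacity, guaranteed_rate, max_rate,
                     utilization_rate):
    """Tokens to charge against the customer bucket for a request."""
    rate = allowed_rate(level, capacity, guaranteed_rate, max_rate)
    if utilization_rate >= rate:
        return cost  # customer already works at the allowed rate: full charge
    # Assumption: scale the charge so the guaranteed fill sustains `rate`.
    return cost * guaranteed_rate / rate


def process_request(usage, available, buckets, guaranteed_rate, max_rate,
                    utilization_rate):
    threshold = available          # assumption: threshold = available amount
    if usage > threshold:
        return "delayed"
    cost = float(usage)            # assumption: one token per unit of usage
    previous_global = buckets.global_tokens
    buckets.global_tokens = max(0.0, previous_global - cost)
    portion = customer_portion(cost, buckets.global_tokens,
                               buckets.global_capacity, guaranteed_rate,
                               max_rate, utilization_rate)
    if buckets.customer_tokens < portion:
        buckets.global_tokens = previous_global  # assumption: undo the charge
        return "throttled"
    buckets.customer_tokens -= portion
    return "allowed"
```

In this sketch a customer with a guaranteed rate of 10 and a maximum rate of 40 pays only a quarter of the cost while the global bucket is at least half full, and pays the full cost once the global bucket is empty.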

20. A computing system, comprising:
- at least one processor; and
  
  at least one memory device including instructions that, when executed by the at least one processor, cause the computing system to:
  
  receive a request associated with a customer, the request requesting usage of at least one resource;
  
  determine a threshold amount based at least in part on an available amount of the at least one resource;
  
  compare the usage in the request to the threshold;
  
  when usage in the request exceeds the threshold, delay processing of the request;
  
  when the usage in the request is below a maximum threshold, determine a number of tokens needed to process a task on behalf of a customer, the tokens associated with an amount of usage of at least one resource;
  
  charge the number of tokens against a global token bucket associated with the at least one resource based at least in part on a customer fill rate indicating a rate associated with the customer at which a customer token bucket is filled with tokens, the global token bucket stored in a storage of the computing system;
  
  determine, by the at least one processor, a portion of the number of tokens to be charged against the customer token bucket associated with the customer based at least in part upon a fill level of the global token bucket and a token utilization rate, token utilization rate indicating a rate at which the customer utilizes the tokens for the amount of usage of the at least one resource; and
  
  perform the task using the at least one shared resource when at least the portion of the number of tokens is available in the customer token bucket.
- Dependent claims (21, 22, 23):
  - 21. The computing system of claim 20, wherein tokens are able to be obtained at the customer fill rate up to a maximum rate for the customer when the fill level of the global bucket at least meets a global fill threshold.
  - 22. The computing system of claim 20, wherein tokens are able to be obtained at the customer fill rate up to an intermediate rate for the customer when the fill level of the global bucket is less than a global fill threshold but above a global minimum threshold.
  - 23. The computing system of claim 20, wherein the at least one resource is capable of being accessed by the plurality of customers, the plurality of customers capable of having a respective customer bucket, and wherein the instructions when executed further cause the computing system to:
    - assign a capacity and a fill rate for the respective customer bucket and the global bucket.

24. A non-transitory computer-readable storage medium storing instructions that, when executed by at least one processor of a computing system, cause the computing system to:
- receive a request associated with a customer, the request requesting usage of at least one resource;
  
  determine a threshold amount based at least in part on an available amount of the at least one resource;
  
  compare the usage in the request to the threshold;
  
  when the usage in the request exceeds the threshold, delay processing of the request;
  
  when the usage in the request is below the threshold, determine a number of tokens needed to process a task on behalf of the customer, the tokens associated with an amount of usage of at least one resource;
  
  charge the number of tokens against a global token bucket associated with the at least one resource based at least in part on a customer fill rate indicating a rate associated with the customer at which a customer token bucket is filled with tokens, the global token bucket stored in a storage of the computing system;
  
  determine, by the at least one processor, a portion of the number of tokens to be charged against the customer token bucket associated with the customer based at least in part upon a fill level of the global token bucket and a token utilization rate of the customer, the token utilization rate indicating a rate at which the customer utilizes the tokens for the amount of usage of the at least one resource; and
  
  perform the task using the at least one shared resource when at least the portion of the number of tokens is available in the customer token bucket.
- Dependent claims (25, 26, 27):
  - 25. The non-transitory computer-readable storage medium of claim 24, wherein tokens are able to be obtained at the customer fill rate up to a maximum rate for the customer when the fill level of the global bucket at least meets a global fill threshold, and wherein tokens are able to be obtained at the customer fill rate up to an intermediate rate for the customer when the fill level of the global bucket is less than the global fill threshold but above a global minimum threshold.
  - 26. The non-transitory computer-readable storage medium of claim 25, wherein the at least one resource is capable of being accessed by the multiple customers, the multiple customers capable of having a respective customer bucket.
  - 27. The non-transitory computer-readable storage medium of claim 26, wherein the instructions when executed further cause the computing system to:
    - assign a capacity and a fill rate for each customer bucket and the global bucket.

Current Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors: Pisolkar, Raghav Vijay; Kusters, Norbert P.; Lee, Kerry Q.; Certain, Tate Andrew
Primary Examiner(s): Daftuar, Saket K
Application Number: US13/627,278
Time in Patent Office: 1,413 Days
Field of Search: 709/223, 709/224, 709/225, 709/226, 709/227, 709/228, 709/229, 370/230, 370/232, 370/233, 370/234, 370/235, 370/235.1, 370/252, 370/310, 370/389, 370/392, 370/395.1, 370/395.2, 370/395.21, 370/395.4, 370/395.41, 370/395.42, 370/395.43, 370/468, 713/171, 726/9, 726/10, 726/5, 726/2
US Class Current: 1/1

CPC Class Codes

H04L 12/1403   Architecture for metering, ...

H04L 41/08   Configuration management of...

H04L 47/762   triggered by the network

H04L 63/10   for controlling access to d...

H04L 67/146   Markers for unambiguous ide...

H04L 69/321   Interlayer communication pr...

H04L 69/325   in the network layer [OSI l...

H04M 15/765   Linked or grouped accounts,...

H04M 15/7652   shared by users

H04M 15/7655   shared by technologies

H04M 15/83   Notification aspects

H04M 15/85   characterised by the type o...

H04M 15/853   Calculate maximum communica...

H04M 15/854   Available credit
