System and method for throttling service requests having non-uniform workloads
Abstract
A system that provides services to clients may receive and service requests, various ones of which may require different amounts of work. The system may determine whether it is operating in an overloaded or underloaded state based on a current work throughput rate, a target work throughput rate, a maximum request rate, or an actual request rate, and may dynamically adjust the maximum request rate in response. For example, if the maximum request rate is being exceeded, the maximum request rate may be raised or lowered, dependent on the current work throughput rate. If the target or committed work throughput rate is being exceeded, but the maximum request rate is not being exceeded, a lower maximum request rate may be proposed. Adjustments to the maximum request rate may be made using multiple incremental adjustments. Service request tokens may be added to a leaky token bucket at the maximum request rate.
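The token-bucket admission described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it uses a plain token bucket whose refill rate is the current maximum request rate, and the class name `TokenBucket`, the `capacity` cap, and the explicit `now` timestamps are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Hypothetical bucket that admits requests at up to max_request_rate per second."""
    max_request_rate: float   # tokens added per second; may be adjusted at any time
    capacity: float           # assumed cap on stored tokens (limits burst size)
    tokens: float = 0.0
    last_refill: float = 0.0  # timestamp of the previous refill, in seconds

    def refill(self, now: float) -> None:
        # Add tokens for the elapsed time at the current maximum request rate.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.max_request_rate)
        self.last_refill = now

    def try_admit(self, now: float) -> bool:
        # A request is serviced only if a whole token is available.
        self.refill(now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(max_request_rate=10.0, capacity=10.0)
admitted_first = sum(bucket.try_admit(1.0) for _ in range(25))

# Lowering the maximum request rate changes how many later requests are admitted.
bucket.max_request_rate = 5.0
admitted_second = sum(bucket.try_admit(2.0) for _ in range(25))
```

Because the refill rate is read on every call, an adjusted maximum request rate takes effect on the next refill without rebuilding the bucket.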
35 Claims
1. A method, comprising:
performing by a computer system that provides storage services to clients:
receiving a plurality of requests to read or write data on behalf of one or more clients;
servicing a portion of the requests, wherein the portion of the plurality of service requests that is serviced is dependent on a maximum request rate;
determining that the rate at which the plurality of requests was received exceeds the maximum request rate;
in response to said determining, adjusting the maximum request rate, wherein said adjusting is dependent on an observed data transfer rate required to satisfy the portion of the requests, and wherein the amount of data transferred in servicing each of the requests in the portion of the requests is non-uniform; and
subsequent to said adjusting:
receiving one or more additional requests to read or write data on behalf of the one or more clients; and
servicing a portion of the additional requests, wherein the portion of the additional requests that is serviced is dependent on the adjusted maximum request rate.
Dependent claims: 2, 3, 4, 5
6. A system, comprising:
one or more processors; and
a memory coupled to the one or more processors and storing program instructions that when executed by the one or more processors cause the one or more processors to:
receive a plurality of service requests;
service a portion of the service requests, wherein the portion of the plurality of service requests that is serviced is dependent on a maximum request rate;
determine that the rate at which the plurality of requests was received exceeds the maximum request rate;
in response to said determination, adjust the maximum request rate, wherein said adjustment is dependent on an observed rate at which work was performed to service the portion of the service requests, and wherein the amount of work performed to service each of the service requests in the portion of the service requests is non-uniform; and
subsequent to said adjust:
receive one or more additional service requests; and
service a portion of the additional service requests, wherein the portion of the additional service requests that is serviced is dependent on the adjusted maximum request rate.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
19. A method, comprising:
performing by a computer system:
receiving a plurality of service requests, wherein the amount of work required to satisfy each of the service requests is non-uniform;
servicing at least a portion of the service requests, wherein service requests are accepted for servicing at a rate no higher than a maximum request rate;
determining that the rate at which work was performed when servicing the at least a portion of the service requests exceeds a target rate for performing work;
in response to said determining, adjusting the maximum request rate, wherein said adjusting is dependent on the rate at which work was performed in servicing the at least a portion of the service requests; and
subsequent to said adjusting:
receiving one or more additional service requests; and
servicing at least a portion of the additional service requests, dependent on the adjusted maximum request rate.
Dependent claims: 20, 21, 22, 23, 24, 25, 26
27. A non-transitory, computer-readable storage medium storing program instructions that when executed on one or more computers cause the one or more computers to perform:
receiving and servicing a plurality of service requests, wherein the service requests are received at a rate no higher than a maximum request rate, wherein the amount of work required to satisfy each of the plurality of service requests is non-uniform, and wherein the rate at which work is performed in servicing the plurality of service requests is no higher than a target rate for performing work in servicing service requests;
receiving and servicing one or more additional service requests;
determining that the rate at which work was performed when servicing the one or more additional service requests is different from the rate at which work is performed in servicing the plurality of service requests; and
in response to said determining, adjusting the maximum request rate, wherein said adjusting is dependent on the rate at which work was performed in servicing the one or more additional service requests.
Dependent claims: 28, 29, 30, 31, 32, 33, 34, 35
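The independent claims above all adjust the maximum request rate based on an observed rate of work. One way this could look is sketched below, under stated assumptions: the proposal formula (target work rate divided by observed work per request), the function names `propose_max_request_rate` and `step_toward`, and the 25% step fraction are illustrative choices made here, not details taken from the claims.

```python
def propose_max_request_rate(max_rate: float, request_rate: float,
                             work_rate: float, target_work_rate: float) -> float:
    """Return a proposed maximum request rate, in requests per second.

    Assumption: average work per serviced request is estimated as
    work_rate / serviced_rate, and the proposal is the request rate that
    would meet target_work_rate at that average.
    """
    serviced_rate = min(request_rate, max_rate)
    if serviced_rate <= 0 or work_rate <= 0:
        return max_rate  # nothing observed, so keep the current limit
    work_per_request = work_rate / serviced_rate
    ideal = target_work_rate / work_per_request
    if request_rate > max_rate:
        # Request limit exceeded: raise or lower, depending on observed work.
        return ideal
    if work_rate > target_work_rate:
        # Work target exceeded while under the request limit: only lower.
        return min(max_rate, ideal)
    return max_rate


def step_toward(current: float, proposed: float, fraction: float = 0.25) -> float:
    """Move part of the way toward the proposal (one incremental adjustment)."""
    return current + fraction * (proposed - current)


# Small requests: the limit is exceeded but little work is done, so raise it.
raised = propose_max_request_rate(100.0, 150.0, 50.0, 100.0)
# Large requests: the limit is exceeded and work is over target, so lower it.
lowered = propose_max_request_rate(100.0, 150.0, 200.0, 100.0)
next_rate = step_toward(100.0, raised)
```

Applying `step_toward` repeatedly approaches the proposed rate gradually, which damps oscillation when the mix of request sizes shifts between measurement intervals.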
Specification