Load shedding techniques for distributed services with persistent client connections to ensure quality of service
Abstract
An access node of a distributed service collects workload data pertaining to at least one peer group of access nodes established for handling client requests. During a particular load shedding analysis, the access node uses the collected metrics to detect that a triggering condition for load shedding with respect to a set of persistent client connections has been met. Each persistent client connection is set up to be usable for a plurality of client requests. The access node initiates a phased termination of at least one selected persistent client connection. The phased termination comprises allowing completion of in-flight requests on the connection and rejecting new requests on the connection.
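The phased termination described in the abstract can be pictured as a small connection state machine. The following is a minimal sketch under stated assumptions, not the patented implementation: the names `ConnState`, `PersistentConnection`, `begin_request`, `complete_request`, and `start_phased_termination` are hypothetical, and requests are identified by plain integers for illustration.

```python
from enum import Enum, auto


class ConnState(Enum):
    ACTIVE = auto()    # accepts new requests
    DRAINING = auto()  # phased termination started, not yet completed
    CLOSED = auto()    # phased termination completed


class PersistentConnection:
    """Hypothetical model of one persistent client connection."""

    def __init__(self):
        self.state = ConnState.ACTIVE
        self.in_flight = set()  # ids of requests accepted but not yet finished

    def begin_request(self, request_id):
        """Accept a request only while the connection is ACTIVE."""
        if self.state is not ConnState.ACTIVE:
            return False  # new requests are rejected once draining begins
        self.in_flight.add(request_id)
        return True

    def complete_request(self, request_id):
        """Finish an in-flight request; close a draining connection once idle."""
        self.in_flight.discard(request_id)
        self._close_if_drained()

    def start_phased_termination(self):
        """Stop taking new work but let in-flight requests run to completion."""
        if self.state is ConnState.ACTIVE:
            self.state = ConnState.DRAINING
            self._close_if_drained()

    def _close_if_drained(self):
        if self.state is ConnState.DRAINING and not self.in_flight:
            self.state = ConnState.CLOSED
```

In this sketch a connection with one outstanding request stays in `DRAINING` after `start_phased_termination()`, refuses any further `begin_request` call, and moves to `CLOSED` only when the outstanding request completes.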
22 Claims
1. A system, comprising:
a plurality of access nodes (ANs) of a file storage service implemented at a provider network, including a first AN in a first AN peer group (APG) collectively responsible for processing received client requests directed to at least a first file system instance;
wherein the first AN implements a plurality of load shedding analysis iterations (LSAIs), wherein a particular LSAI of the plurality of LSAIs comprises:
determining that a workload level associated with one or more persistent client connections (PCCs) meets a triggering condition for initiating load shedding, wherein each PCC of the one or more PCCs is established to process a plurality of client requests;
selecting at least a first PCC of the one or more PCCs as a candidate for termination;
examining a workload metric cache associated with the first APG, wherein the cache is populated based at least in part on one or more updates received from a workload information distributor, and wherein the cache comprises one or more metrics including at least a thread pool utilization metric of a different AN of the first APG;
determining, based at least in part on said examining, that one or more ANs of the APG meet an available-capacity criterion; and
initiating, based at least in part on said determining that one or more ANs of the APG meet the available-capacity criterion, a phased termination of the first PCC, wherein the phased termination comprises allowing completion of in-flight requests on the first PCC, and rejecting new requests on the first PCC before the phased termination of the first PCC is completed.
Dependent claims: 2, 3, 4, 5
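The steps of a single load shedding analysis iteration in claim 1 can be sketched as one decision function. This is an illustrative sketch only: the claim does not specify threshold values, the workload measure, or the candidate-selection policy, so `TRIGGER_REQUESTS_PER_SEC`, `PEER_MAX_THREAD_POOL_UTILIZATION`, the `Connection` type, the `run_lsai` function, and the "busiest connection" policy are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumed values; the claim does not state concrete thresholds.
TRIGGER_REQUESTS_PER_SEC = 1000.0
PEER_MAX_THREAD_POOL_UTILIZATION = 0.7


@dataclass
class Connection:
    conn_id: str
    requests_per_sec: float  # assumed workload measure for one PCC


def run_lsai(local_conns: List[Connection],
             peer_thread_pool_util: Dict[str, float]) -> Optional[str]:
    """Return the id of the PCC to terminate in phases, or None."""
    # Step 1: triggering condition on the workload of the local PCCs.
    total = sum(c.requests_per_sec for c in local_conns)
    if total < TRIGGER_REQUESTS_PER_SEC:
        return None
    # Step 2: select a candidate (assumed policy: the busiest connection).
    candidate = max(local_conns, key=lambda c: c.requests_per_sec)
    # Step 3: examine cached thread pool utilization of other ANs in the APG.
    has_capacity = any(util <= PEER_MAX_THREAD_POOL_UTILIZATION
                       for util in peer_thread_pool_util.values())
    # Step 4: shed only if some peer appears able to absorb the client.
    return candidate.conn_id if has_capacity else None
```

The point of checking the peer metrics before shedding is that terminating a connection only helps if the reconnecting client can land on an access node with spare capacity; when every cached peer looks busy, the sketch leaves the connection alone.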
6. A method, comprising:
performing, by a first access node of a file storage service, wherein the first access node is a member of a first group comprising one or more access nodes responsible for processing received client requests directed to at least a first file system instance, a load shedding analysis, wherein the load shedding analysis comprises:
detecting that a workload level associated with one or more client connections meets a triggering condition for initiating load shedding, wherein the client connections are established to process a plurality of client requests;
determining, based at least in part on examining a workload metric cache comprising one or more metrics associated with other access nodes of the first group, wherein the one or more metrics include a thread pool utilization metric, that the first group meets an available-capacity criterion; and
initiating, based at least in part on said determining that the first group meets an available-capacity criterion, a phased termination of at least a particular client connection of the one or more client connections, wherein the phased termination comprises allowing completion of in-flight requests on the particular client connection, and rejecting new requests on the particular client connection before the phased termination of at least the particular client connection is completed.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A non-transitory computer-accessible storage medium storing program instructions that when executed on one or more processors implement a first access node of a distributed service, wherein the first access node is a member of a first group comprising one or more access nodes responsible for processing received client requests, wherein the first access node is configured to perform a load shedding analysis, wherein the load shedding analysis comprises:
collecting one or more workload metrics pertaining to one or more members of one or more groups, including the first group;
detecting, based at least in part on an analysis of the one or more workload metrics, that a triggering condition for initiating load shedding with respect to one or more client connections has been met, wherein the one or more client connections are established to process a plurality of client requests; and
initiating, based at least in part on said detecting that a triggering condition for initiating load shedding with respect to one or more client connections has been met, a phased termination of at least a particular client connection of the one or more client connections, wherein the phased termination comprises allowing completion of in-flight requests on the particular client connection, and rejecting new requests on the particular client connection before the phased termination of at least the particular client connection is completed.
Dependent claims: 17, 18, 19, 20, 21, 22
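The "collecting" and "detecting" steps of claim 16 might be modeled as a per-node cache of peer metrics plus a check against it. The sketch below is hypothetical throughout: the update tuple format, the `WorkloadMetricCache` class, the `trigger_met` function, and the 0.9 threshold are assumptions for illustration, and the claim itself does not limit the analysis to a node's own thread pool utilization.

```python
from typing import Dict, Iterable, Tuple

# Assumed update format: (group_id, node_id, thread_pool_utilization).
Update = Tuple[str, str, float]


class WorkloadMetricCache:
    """Hypothetical cache of workload metrics, keyed by group then node."""

    def __init__(self) -> None:
        self._by_group: Dict[str, Dict[str, float]] = {}

    def apply_updates(self, updates: Iterable[Update]) -> None:
        """Record collected metrics; a later update for a node replaces the earlier one."""
        for group_id, node_id, util in updates:
            self._by_group.setdefault(group_id, {})[node_id] = util

    def group_metrics(self, group_id: str) -> Dict[str, float]:
        """Return a copy of the metrics known for one group."""
        return dict(self._by_group.get(group_id, {}))


def trigger_met(cache: WorkloadMetricCache, group_id: str, self_id: str,
                local_threshold: float = 0.9) -> bool:
    """Assumed triggering rule: this node's own utilization is at or above a threshold."""
    util = cache.group_metrics(group_id).get(self_id)
    return util is not None and util >= local_threshold
```

A node with no recorded metric never triggers in this sketch, which errs on the side of keeping connections open when the collected data is incomplete.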
Specification