Throughput-based fan-out control in scalable distributed data stores
First Claim
1. A method, comprising:
- determining a current incoming queries per second (QPS) to one or more components for processing queries of a graph database, wherein the graph database is replicated across multiple clusters and distributed among a set of storage nodes in each of the clusters;
using the current incoming QPS to estimate, for the one or more components, an expected QPS associated with fanning out of the queries to the clusters; and
when a query of the graph database is received, processing the query on a computer system by;
selecting a number of clusters in the multiple clusters for fanning out of the query, based on the expected QPS and one or more throughput limits for the one or more components; and
transmitting the query to one or more storage nodes in the selected number of clusters.
2 Assignments
0 Petitions
Accused Products
Abstract
The disclosed embodiments provide a system for processing data. During operation, the system determines a current incoming queries per second (QPS) to one or more components for processing queries of a graph database, wherein the graph database is replicated across multiple clusters and distributed among a set of storage nodes in each of the clusters. Next, the system uses the current incoming QPS to estimate, for the one or more components, an expected QPS associated with fanning out of the queries to the clusters. The system then selects a number of clusters in the multiple clusters for fanning out of a query based on the expected QPS and one or more throughput limits for the one or more components. Finally, the system transmits the query to one or more of the storage nodes in the selected number of clusters.
13 Citations
20 Claims
-
1. A method, comprising:
-
determining a current incoming queries per second (QPS) to one or more components for processing queries of a graph database, wherein the graph database is replicated across multiple clusters and distributed among a set of storage nodes in each of the clusters; using the current incoming QPS to estimate, for the one or more components, an expected QPS associated with fanning out of the queries to the clusters; and when a query of the graph database is received, processing the query on a computer system by; selecting a number of clusters in the multiple clusters for fanning out of the query, based on the expected QPS and one or more throughput limits for the one or more components; and transmitting the query to one or more storage nodes in the selected number of clusters. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
-
-
12. An apparatus, comprising:
-
one or more processors; and memory storing instructions that, when executed by the one or more processors, cause the apparatus to; determine a current incoming queries per second (QPS) to one or more components for processing queries of a graph database, wherein the graph database is replicated across multiple clusters and distributed among a set of storage nodes in each of the clusters; use the current incoming QPS to estimate, for the one or more components, an expected QPS associated with fanning out of the queries to the clusters; select a number of clusters in the multiple clusters for fanning out of a query based on the expected QPS and one or more throughput limits for the one or more components; and transmit the query to one or more of the storage nodes in the selected number of clusters. - View Dependent Claims (13, 14, 15, 16, 17, 18)
-
-
19. A system, comprising:
-
a measurement mechanism comprising a non-transitory computer-readable medium comprising instructions that, when executed, cause the system to determine a current incoming queries per second (QPS) to one or more components for processing queries of a graph database, wherein the graph database is replicated across multiple clusters and distributed among a set of storage nodes in each of the clusters; and a client node comprising a non-transitory computer-readable medium comprising instructions that, when executed, cause the system to; use the current incoming QPS to estimate, for the one or more components, an expected QPS associated with fanning out of the queries to the clusters; select a number of clusters in the multiple clusters for fanning out of a query based on the expected QPS and one or more throughput limits for the one or more components; and transmit the query to one or more of the storage nodes in the selected number of clusters. - View Dependent Claims (20)
-
Specification