Distributed storage system with replica location selection
First Claim
Patent Images
1. A system comprising:
- a plurality of computing clusters each comprising computer memory and a computer processor;
a distributed database running on at least a subset of the plurality of the computing clusters and that interacts with a client application running on a client computer, the distributed database configured to;
store data of the distributed database in shards distributed among computing clusters of the distributed database; and
use each computing cluster of the computing clusters of the distributed database according to a respective role assigned to the computing cluster that identifies functions of the computing cluster; and
an activity monitor service configured to;
monitor interactions between the client application and the distributed database;
generate, from the monitoring of the interactions between the client application and the distributed database, historical workload data describing historical interactions that have occurred between the client application and the distributed database; and
a task assigning service configured to;
receive an indication that a first number (N) of the computing clusters are to be assigned to a replica role of the distributed database;
receive an indication that a second number (M) of the replica-role assigned computing clusters are to be assigned as a subset of the first number (N) of the computer clusters to a voting role of the distributed database;
select, using the historical workload data describing interactions that have occurred between the client application and the distributed database, N computing clusters to be included in the distributed database;
assign the N selected computing clusters to a replica role within the distributed database; and
assign M of the selected subset of computing clusters to a voting role within the distributed database.
2 Assignments
0 Petitions
Accused Products
Abstract
Replicas are selected in a large distributed network, and the roles for these replicas are identified. In one example, an indication that a number N of clusters are to be assigned a replica role and a second number M of the replica-role assigned clusters are to be assigned to a voting role. N computing clusters are selected using workload data, and M of the clusters are assigned to a voting role.
68 Citations
20 Claims
-
1. A system comprising:
-
a plurality of computing clusters each comprising computer memory and a computer processor; a distributed database running on at least a subset of the plurality of the computing clusters and that interacts with a client application running on a client computer, the distributed database configured to; store data of the distributed database in shards distributed among computing clusters of the distributed database; and use each computing cluster of the computing clusters of the distributed database according to a respective role assigned to the computing cluster that identifies functions of the computing cluster; and an activity monitor service configured to; monitor interactions between the client application and the distributed database; generate, from the monitoring of the interactions between the client application and the distributed database, historical workload data describing historical interactions that have occurred between the client application and the distributed database; and a task assigning service configured to; receive an indication that a first number (N) of the computing clusters are to be assigned to a replica role of the distributed database; receive an indication that a second number (M) of the replica-role assigned computing clusters are to be assigned as a subset of the first number (N) of the computer clusters to a voting role of the distributed database; select, using the historical workload data describing interactions that have occurred between the client application and the distributed database, N computing clusters to be included in the distributed database; assign the N selected computing clusters to a replica role within the distributed database; and assign M of the selected subset of computing clusters to a voting role within the distributed database. - View Dependent Claims (2, 3, 4, 5)
-
-
6. A method comprising:
-
monitoring interactions between a client application and a distributed database; generating from the monitoring of the interactions between the client application and the distributed database, historical workload data describing historical interactions that have occurred between the client application and the distributed database; receive an indication that a first number (N) of the computing clusters are to be assigned to a replica role of the distributed database; receive an indication that a second number (M) of the replica-role assigned computing clusters are to be assigned as a subset of the first number (N) of the computer clusters to a voting role of the distributed database; select, using the historical workload data describing interactions that have occurred between the client application and the distributed database, N computing clusters to be included in the distributed database; assign the N selected computing clusters to a replica role within the distributed database; and assign M of the selected subset of computing clusters to a voting role within the distributed database. - View Dependent Claims (7, 8, 9, 10)
-
-
11. A system comprising:
-
a plurality of computing clusters each comprising computer memory and a computer processor; a distributed database running on at least a subset of the plurality of the computing clusters and that interacts with a client application running on a client computer, the distributed database configured to; store data of the distributed database in shards distributed among computing clusters of the distributed database; and use each computing cluster of the computing clusters of the distributed database according to a respective role assigned to the computing cluster that identifies functions of the computing cluster; and an activity monitor service configured to; monitor interactions between the client application and the distributed database; generate, from the monitoring of the interactions between the client application and the distributed database, workload data describing the interactions between the client application and the distributed database; and a task assigning service configured to; receive an indication that a number (M) of the computing clusters are to be assigned to a voting role of the distributed database; for each particular computer cluster of at least some of the computer clusters; consider the particular computing cluster as a candidate leader; identify, using the workload data, (M+1)/2 computer clusters having (M+1)/2 lowest latencies with the particular computing cluster as voters corresponding to the candidate leader; identify M−
(M+1)/2 unidentified computing clusters as voters corresponding to the candidate leader;identify a number (N) of unidentified computing clusters as replicas corresponding to the candidate leader; select the candidate leader computing cluster, corresponding voters, and corresponding replicas having a best score on a metric; assigning, to the selected candidate computer cluster, a leader role within the distributed database; assigning, to the selected M computer clusters, the voting role within the distributed database; and assigning, to the selected N computer clusters, the replica role within the distributed database. - View Dependent Claims (12, 13, 14, 15)
-
-
16. A method comprising:
-
monitoring interactions between a client application and a distributed database; generating from the monitoring of the interactions between the client application and the distributed database, workload data describing the interactions between the client application and the distributed database; receiving an indication that a number (M) of the computing clusters are to be assigned to a voting role of the distributed database; for each particular computer cluster of at least some of the computer clusters; considering the particular computing cluster as a candidate leader; identifying, using the workload data, (M+1)/2 computer clusters having (M+1)/2 lowest latencies with the particular computing cluster as voters corresponding to the candidate leader; identifying M−
(M+1)/2 unidentified computing clusters as voters corresponding to the candidate leader;identifying a number (N) of unidentified computing clusters as replicas corresponding to the candidate leader; selecting the candidate leader computing cluster, corresponding voters, and corresponding replicas having a best score on a metric; assigning, to the selected candidate computer cluster, a leader role within the distributed database; assigning, to the selected M computer clusters, the voting role within the distributed database; and assigning, to the selected N computer clusters, the replica role within the distributed database. - View Dependent Claims (17, 18, 19, 20)
-
Specification