PARTITIONING AND REPLICATING DATA IN SCALABLE DISTRIBUTED DATA STORES
First Claim
1. A method, comprising:
generating a first distribution of a set of partitions comprising a graph database across a first set of storage nodes in a first cluster;
replicating the graph database by generating a second distribution of the set of partitions across a second set of storage nodes in a second cluster, wherein the second distribution is different from the first distribution; and
when a query of the graph database is received, processing the query on a computer system by:
identifying one or more partitions storing data associated with the query;
using a set of mappings comprising the set of partitions, the first and second sets of storage nodes, and the first and second clusters to select one or more storage nodes containing the one or more partitions; and
transmitting one or more portions of the query to the selected storage nodes.
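The first two steps of the claim (two distributions of the same partitions that differ from each other) can be illustrated with a minimal sketch. The claim does not say how the distributions are computed or in what sense they differ, so everything here is an assumption for illustration: the function names, the node names, the modulo and block assignment schemes, and the choice to compare distributions by which partitions end up co-located on a node.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set


def distribute_modulo(partitions: List[int], nodes: List[str]) -> Dict[int, str]:
    """Hypothetical scheme 1: partition p goes to node p mod N."""
    return {p: nodes[p % len(nodes)] for p in partitions}


def distribute_blocks(partitions: List[int], nodes: List[str]) -> Dict[int, str]:
    """Hypothetical scheme 2: consecutive partitions share a node."""
    block = -(-len(partitions) // len(nodes))  # ceiling division
    return {p: nodes[i // block] for i, p in enumerate(partitions)}


def colocation(distribution: Dict[int, str]) -> Set[FrozenSet[int]]:
    """Return the groups of partitions that share a storage node."""
    by_node = defaultdict(set)
    for partition, node in distribution.items():
        by_node[node].add(partition)
    return {frozenset(group) for group in by_node.values()}


partitions = list(range(9))

# First distribution: first set of storage nodes in a first cluster.
first = distribute_modulo(partitions, ["c1-n0", "c1-n1", "c1-n2"])

# Second distribution (the replica): second set of nodes in a second cluster.
second = distribute_blocks(partitions, ["c2-n0", "c2-n1", "c2-n2"])

# Both clusters hold every partition, but the co-location groups differ.
print(colocation(first) != colocation(second))
```

In this sketch each cluster stores the full set of partitions, so either cluster can serve any partition, while the partitions that share a node in one cluster are spread over several nodes in the other.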
Abstract
The disclosed embodiments provide a system for processing data. During operation, the system generates a first distribution of a set of partitions comprising a graph database across a first set of storage nodes in a first cluster. Next, the system replicates the graph database by generating a second, different distribution of the set of partitions across a second set of storage nodes in a second cluster. When a query of the graph database is received, the system identifies one or more partitions storing data associated with the query and uses a set of mappings comprising the set of partitions, the first and second sets of storage nodes, and the first and second clusters to select one or more storage nodes containing the one or more partitions. Finally, the system transmits one or more portions of the query to the selected storage nodes.
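The query-processing steps in the abstract (identify partitions, consult the mappings, select storage nodes, send portions of the query) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not describe how partitions are identified or how the mappings are stored, so the CRC-based `partition_of` function, the `MAPPINGS` table, the node and cluster names, and the `route` helper are all hypothetical, and actual transmission is left out.

```python
import zlib
from collections import defaultdict
from typing import Dict, List

NUM_PARTITIONS = 4

# Hypothetical mappings: cluster -> partition -> storage node.
# The two clusters hold the same partitions in different arrangements.
MAPPINGS: Dict[str, Dict[int, str]] = {
    "cluster-1": {0: "c1-n0", 1: "c1-n1", 2: "c1-n0", 3: "c1-n1"},
    "cluster-2": {0: "c2-n0", 1: "c2-n0", 2: "c2-n1", 3: "c2-n1"},
}


def partition_of(key: str) -> int:
    """Assumed rule: a stable hash of the key picks the partition."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def route(keys: List[str], cluster: str) -> Dict[str, List[str]]:
    """Split a query over `keys` into per-node portions for one cluster."""
    portions: Dict[str, List[str]] = defaultdict(list)
    for key in keys:
        partition = partition_of(key)        # identify the partition
        node = MAPPINGS[cluster][partition]  # select the storage node
        portions[node].append(key)           # portion destined for that node
    return dict(portions)


# Each entry of the plan is the portion of the query one node would receive.
plan = route(["alice", "bob", "carol"], "cluster-2")
for node, keys in sorted(plan.items()):
    print(node, keys)
```

Because the same keys can be routed against either cluster, a caller could pick whichever cluster's arrangement touches fewer nodes for a given query; that selection policy is not stated in the text above and is left out of the sketch.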
20 Claims
1. A method, comprising:
generating a first distribution of a set of partitions comprising a graph database across a first set of storage nodes in a first cluster;
replicating the graph database by generating a second distribution of the set of partitions across a second set of storage nodes in a second cluster, wherein the second distribution is different from the first distribution; and
when a query of the graph database is received, processing the query on a computer system by:
identifying one or more partitions storing data associated with the query;
using a set of mappings comprising the set of partitions, the first and second sets of storage nodes, and the first and second clusters to select one or more storage nodes containing the one or more partitions; and
transmitting one or more portions of the query to the selected storage nodes.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. An apparatus, comprising:
one or more processors; and
memory storing instructions that, when executed by the one or more processors, cause the apparatus to:
generate a first distribution of a set of partitions comprising a graph database across a first set of storage nodes in a first cluster;
replicate the graph database by generating a second distribution of the set of partitions across a second set of storage nodes in a second cluster, wherein the second distribution is different from the first distribution; and
when a query of the graph database is received, process the query by:
identifying one or more partitions storing data associated with the query;
using a set of mappings comprising the set of partitions, the first and second sets of storage nodes, and the first and second clusters to select one or more storage nodes containing the one or more partitions; and
transmitting one or more portions of the query to the selected storage nodes.
Dependent claims: 13, 14, 15, 16, 17
18. A system, comprising:
a distribution mechanism comprising a non-transitory computer-readable medium comprising instructions that, when executed, cause the system to:
generate a first distribution of a set of partitions comprising a graph database across a first set of storage nodes in a first cluster; and
replicate the graph database by generating a second distribution of the set of partitions across a second set of storage nodes in a second cluster, wherein the second distribution is different from the first distribution; and
a query processor comprising a non-transitory computer-readable medium comprising instructions that, when executed, cause the system to process a query of the graph database by:
identifying one or more partitions storing data associated with the query;
using a set of mappings comprising the set of partitions, the first and second sets of storage nodes, and the first and second clusters to select one or more storage nodes containing the one or more partitions; and
transmitting one or more portions of the query to the selected storage nodes.
Dependent claims: 19, 20
Specification