CONFIGURATION MANAGEMENT IN DISTRIBUTED DATA SYSTEMS
First Claim
1. A method of obtaining configuration information defining a current configuration of a plurality of data nodes storing replicas of a partition of a database, the method comprising:
operating at least one processor to perform acts comprising:
receiving a plurality of messages, each message generated by a data node of the plurality of data nodes and indicating a version of the configuration of the database for which the data node is configured and a set of data nodes configured in accordance with the indicated configuration to replicate the partition stored on the data node;
identifying, based on the received messages, a selected set of data nodes, the selected set of data nodes being a set identified in at least one of the plurality of messages for which a quorum of the data nodes in the set each generated a message indicating the same configuration version and the selected set of data nodes; and
storing as a portion of the configuration information an indication that each data node of the selected set is a data node storing a replica of the partition.
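The selection step recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Report` type, the `select_replica_set` function, the rule that a quorum is a strict majority of the set's members, and the tie-break that prefers the highest version are all assumptions made for illustration; the claim itself only requires that a quorum of the nodes in a reported set each report the same version and the same set.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Report:
    """One message from a data node (hypothetical shape, not taken from the patent)."""
    sender: str                  # node that generated the message
    version: int                 # configuration version the node is configured for
    replica_set: FrozenSet[str]  # nodes the sender lists as replicating the partition


def select_replica_set(
    reports: Iterable[Report],
) -> Optional[Tuple[int, FrozenSet[str]]]:
    """Return (version, replica set) backed by a quorum of its own members, or None."""
    # Group senders by the exact (version, replica set) pair they reported.
    supporters = defaultdict(set)
    for r in reports:
        # Only members of a set count toward that set's quorum.
        if r.sender in r.replica_set:
            supporters[(r.version, r.replica_set)].add(r.sender)

    best = None
    for (version, members), senders in supporters.items():
        # Assumption: a quorum is a strict majority of the set's members.
        if len(senders) > len(members) // 2:
            # Assumption: if several sets qualify, keep the highest version.
            if best is None or version > best[0]:
                best = (version, members)
    return best


if __name__ == "__main__":
    abc = frozenset({"a", "b", "c"})
    msgs = [Report("a", 3, abc), Report("b", 3, abc)]
    print(select_replica_set(msgs))
```

The returned pair corresponds to what the final step stores: each node in the selected set is recorded as holding a replica of the partition.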
Abstract
Systems and methods for managing configurations of data nodes in a distributed environment. A configuration manager is implemented as a set of distributed master nodes that may use quorum-based processing to enable reliable identification of master nodes storing current configuration information, even if some of the master nodes fail. If a quorum of master nodes cannot be achieved, or some other event occurs that precludes identification of current configuration information, the configuration manager may be rebuilt by analyzing reports from read/write quorums of nodes associated with a configuration, allowing automatic recovery of data partitions.
20 Claims
1. A method of obtaining configuration information defining a current configuration of a plurality of data nodes storing replicas of a partition of a database, the method comprising:
operating at least one processor to perform acts comprising:
receiving a plurality of messages, each message generated by a data node of the plurality of data nodes and indicating a version of the configuration of the database for which the data node is configured and a set of data nodes configured in accordance with the indicated configuration to replicate the partition stored on the data node;
identifying, based on the received messages, a selected set of data nodes, the selected set of data nodes being a set identified in at least one of the plurality of messages for which a quorum of the data nodes in the set each generated a message indicating the same configuration version and the selected set of data nodes; and
storing as a portion of the configuration information an indication that each data node of the selected set is a data node storing a replica of the partition.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A database system storing a database comprising a plurality of partitions, the system comprising:
a plurality of computing nodes; and
a network communicably interconnecting the plurality of computing nodes,
wherein the plurality of computing nodes comprise:
a plurality of data nodes organized as a plurality of sets, each set comprising nodes of the plurality of data nodes storing a replication of a partition of the plurality of partitions; and
a plurality of master nodes, each master node storing a replication of configuration information, the configuration information identifying the data nodes in each of the plurality of sets and a partition of the plurality of partitions replicated on the nodes of each of the plurality of sets.
Dependent claims: 10, 11, 12, 13, 14, 15
16. A computer-readable storage medium comprising computer-executable instructions that, when executed by a computer system, perform a method, the method comprising:
identifying the computer system as a primary node for a master partition;
deleting any existing data for a global partition map;
receiving a plurality of messages from at least a subset of a federation of nodes, each message generated by a node among the subset and indicating for said node a partition replicated on said node, a configuration version of the partition, and data nodes for the partition;
identifying a quorum of data nodes for a first partition, the data nodes forming said quorum each having a same configuration version for said first partition; and
updating the global partition map to indicate, for said first partition, the configuration version of the first partition and the data nodes for said first partition.
Dependent claims: 17, 18, 19, 20
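The rebuild flow of claim 16 can be sketched in Python as follows. This is an illustrative sketch only, not the claimed implementation. The `NodeReport` type and `rebuild_partition_map` function are hypothetical names, the quorum is assumed to be a strict majority of the reported data nodes, and the sketch additionally assumes that quorum members agree on the node list as well as the version. Identification of the primary node for the master partition is outside the sketch.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class NodeReport:
    """Per-partition report from one node (hypothetical shape for illustration)."""
    sender: str
    partition: str
    version: int
    data_nodes: FrozenSet[str]


def rebuild_partition_map(
    reports: Iterable[NodeReport],
) -> Dict[str, Tuple[int, FrozenSet[str]]]:
    """Rebuild a global partition map from scratch using node reports."""
    # Start from an empty map, mirroring the step that deletes existing data.
    gpm: Dict[str, Tuple[int, FrozenSet[str]]] = {}

    # Collect, per (partition, version, node set), the member nodes that reported it.
    votes = defaultdict(set)
    for r in reports:
        if r.sender in r.data_nodes:
            votes[(r.partition, r.version, r.data_nodes)].add(r.sender)

    for (partition, version, nodes), senders in votes.items():
        # Assumption: quorum means a strict majority of the partition's data nodes.
        if len(senders) > len(nodes) // 2:
            current = gpm.get(partition)
            # Assumption: the highest quorum-backed version wins for a partition.
            if current is None or version > current[0]:
                gpm[partition] = (version, nodes)
    return gpm


if __name__ == "__main__":
    abc = frozenset({"a", "b", "c"})
    msgs = [NodeReport("a", "p1", 5, abc), NodeReport("b", "p1", 5, abc)]
    print(rebuild_partition_map(msgs))
```

Partitions for which no quorum reports arrive are simply absent from the rebuilt map in this sketch; how such partitions are handled is not specified in the claim text above.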
Specification