Distributed file system using consensus nodes
First Claim
1. A cluster of nodes comprising computing devices configured to implement a distributed file system, comprising:
- a plurality of data nodes, each data node configured to store data blocks of client files;
at least two active namenodes, each active namenode coupled to the plurality of data nodes and said each active namenode configured to store a state of a same namespace of the cluster, the namespace including at least a location within said each data node of the plurality of data nodes of each data block of the client files across the cluster, said each active namenode of the at least two active namenodes being configured to respond to any request from any client of the distributed file system to generate a new data block or enable a new data block to be stored on any of the data nodes while at least one other of the active namenodes is responding to any other request from the same or any other client to generate another new data block or enable another new data block to be stored on any of the data nodes, only one of the active namenodes being configured to periodically issue a block replicator heartbeat message and to respond to requests from any client to replicate or delete data blocks from the data nodes;
persistent storage memory, coupled to the at least two active namenodes, configured to store at least one journal containing updates to the namespace of the cluster; and
a coordination engine coupled to said each active namenode of the active namenodes, the coordination engine being configured to receive proposals from the active namenodes to change the state of the namespace by at least one of replicating, deleting and adding data blocks in at least one of the plurality of data nodes and to generate, in response, an ordered set of agreements that specifies an order in which the active namenodes are to change the state of the namespace, wherein the active namenodes are configured to delay making changes to the state of the namespace and updating the journal in the persistent storage memory until the ordered set of agreements is received from the coordination engine,wherein failure of the active namenode configured to enable replication and deletion of data blocks to periodically issue the block replicator heartbeat triggers an election process to elect a new active namenode that is to be solely configured to enable replication and deletion of data blocks.
3 Assignments
0 Petitions
Accused Products
Abstract
A cluster of nodes in a distributed file system may include; at least two namenodes, each coupled to a plurality of data nodes and each configured to store a state of a namespace of the cluster and each being configured to respond to a request from a client while other(s) of the namenodes are responding to other requests from other clients; and a coordination engine coupled to each of the namenodes. The coordination engine may be configured to receive proposals from the namenodes to change the state of the namespace by replicating, deleting and/or adding data blocks stored in the data nodes and to generate, in response, an ordered set of agreements that specifies an order in which the namenodes are to change the state of the namespace. The namenodes are configured to delay making changes thereto until after the ordered set of agreements is received from the coordination engine.
83 Citations
21 Claims
-
1. A cluster of nodes comprising computing devices configured to implement a distributed file system, comprising:
-
a plurality of data nodes, each data node configured to store data blocks of client files; at least two active namenodes, each active namenode coupled to the plurality of data nodes and said each active namenode configured to store a state of a same namespace of the cluster, the namespace including at least a location within said each data node of the plurality of data nodes of each data block of the client files across the cluster, said each active namenode of the at least two active namenodes being configured to respond to any request from any client of the distributed file system to generate a new data block or enable a new data block to be stored on any of the data nodes while at least one other of the active namenodes is responding to any other request from the same or any other client to generate another new data block or enable another new data block to be stored on any of the data nodes, only one of the active namenodes being configured to periodically issue a block replicator heartbeat message and to respond to requests from any client to replicate or delete data blocks from the data nodes; persistent storage memory, coupled to the at least two active namenodes, configured to store at least one journal containing updates to the namespace of the cluster; and a coordination engine coupled to said each active namenode of the active namenodes, the coordination engine being configured to receive proposals from the active namenodes to change the state of the namespace by at least one of replicating, deleting and adding data blocks in at least one of the plurality of data nodes and to generate, in response, an ordered set of agreements that specifies an order in which the active namenodes are to change the state of the namespace, wherein the active namenodes are configured to delay making changes to the state of the namespace and updating the journal in the persistent storage memory until the ordered set of agreements is received from the coordination engine, wherein failure of the active namenode configured to enable replication and deletion of data blocks to periodically issue the block replicator heartbeat triggers an election process to elect a new active namenode that is to be solely configured to enable replication and deletion of data blocks. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 21)
-
-
20. A computing system configured as a part of a distributed file system, comprising:
-
a first persistent storage memory and a second persistent storage; a first active namenode coupled to a plurality of data nodes, each data node of the plurality of data nodes is configured to store data blocks of client files, the first active namenode being configured to store a state of a same namespace of a cluster of computing nodes and to store updates to the namespace in the first persistent storage memory, the namespace including at least a location within said each data node of the plurality of data nodes of each data block of the client files across the cluster, the first active namenode being configured to be coupled to a second active namenode that is also coupled to the plurality of data nodes and that is also configured to store the state of the namespace and to store updates to the namespace in the second persistent storage, the first active namenode being further configured to respond to any request from any client of the distributed file system to generate a new data block or enable a new data block to be stored on any of the data nodes while the second active namenode is responding to any other request from the same or any other client to generate another new data block or enable another new data block to be stored on any of the data nodes, only one of the first and second active namenodes being configured to periodically issue a block replicator heartbeat message and to respond to requests from any client to replicate or delete data blocks from the data nodes; and a first coordination engine coupled to the first active namenode, the coordination engine being configured to receive proposals from the first active namenode to change the state of the namespace and to generate, in response, an ordered set of agreements that specifies an order in which the first and the second active namenodes are to change the state of the namespace, wherein the first active namenode is configured to delay making changes to the state of the namespace until the ordered set of agreements is received from the coordination engine, and wherein failure of the active namenode configured to respond to requests to replicate or delete data blocks to periodically issue the block replicator heartbeat triggers an election process to elect a new active namenode that is to be solely configured to enable replication and deletion of data blocks.
-
Specification