Map-Reduce Ready Distributed File System
Abstract
A map-reduce compatible distributed file system consists of successive component layers, each providing the basis on which the next layer is built, and provides transactional read-write-update semantics with file chunk replication and very high file-create rates. Containers provide the fundamental basis for data replication, relocation, and transactional updates. A container location database allows containers to be found among all file servers and defines precedence among replicas of containers to organize transactional updates of container contents. Volumes facilitate control of data placement, creation of snapshots and mirrors, and retention of a variety of control and policy information. Also addressed are the use of distributed transactions in a map-reduce system; the use of local and distributed snapshots; replication, including techniques for reconciling the divergence of replicated data after a crash; and mirroring.
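As a rough illustration of the container location database described in the abstract, the following minimal Python sketch maps a container identifier to an ordered list of nodes, where list order stands in for replica precedence. The class name `ContainerLocationDB`, its methods, the node names, and the container id are all hypothetical; the abstract does not specify any data layout or API, so this is a sketch under those assumptions, not the patented implementation.

```python
class ContainerLocationDB:
    """Hypothetical in-memory stand-in for a container location database."""

    def __init__(self):
        # container id -> node names, ordered by replica precedence
        # (assumption: the first entry has the highest precedence)
        self._locations = {}

    def register(self, container_id, nodes):
        """Record which nodes hold replicas of a container, in precedence order."""
        if len(set(nodes)) != len(nodes):
            raise ValueError("each replica must live on a different node")
        self._locations[container_id] = list(nodes)

    def locate(self, container_id):
        """Return every node holding the container, highest precedence first."""
        return list(self._locations[container_id])

    def update_target(self, container_id):
        """Return the node that should receive transactional updates first."""
        return self._locations[container_id][0]


cldb = ContainerLocationDB()
cldb.register(2049, ["node-a", "node-b", "node-c"])
print(cldb.locate(2049))
print(cldb.update_target(2049))
```

Keeping precedence as plain list order is one simple design choice; it lets a single lookup answer both "where is this container?" and "which replica organizes updates?".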
20 Claims
1. A system comprising:
a plurality of containers comprising file system objects, the plurality of containers stored in a plurality of cluster nodes;
a plurality of replicated containers, each replicated container in the plurality of replicated containers comprising a copy of a container in the plurality of containers, the replicated container stored on a first cluster node in the plurality of cluster nodes different from a second cluster node in the plurality of cluster nodes storing the container; and
a replication chain associated with each container in the plurality of containers, the replication chain comprising a master container and a slave container, the master container being an initial container in the replication chain, wherein the replication chain for the container is changed if a cluster node holding the replicated container is taken out of service, or if the cluster node holding the replicated container returns to service.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A method comprising:
configuring a plurality of containers to include file system objects, the plurality of containers stored in a plurality of cluster nodes;
configuring a plurality of replicated containers, wherein each replicated container in the plurality of replicated containers includes a copy of a container in the plurality of containers, the replicated container stored on a first cluster node in the plurality of cluster nodes different from a second cluster node in the plurality of cluster nodes storing the container; and
configuring a replication chain associated with each container in the plurality of containers to include a master container and a slave container, the master container being an initial container in the replication chain, wherein the replication chain for the container is changed if a cluster node holding the replicated container is taken out of service, or if the cluster node holding the replicated container returns to service.
Dependent claims: 15, 16, 17, 18, 19, 20
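The replication chain recited in the independent claims can be pictured with a small Python sketch. The claims state only that the chain "is changed" when a node holding a replica leaves or returns to service; how it is changed is not specified here. The sketch therefore assumes two illustrative rules: a departing node is dropped from the chain (so the next node becomes master if the master left), and a returning node rejoins at the tail. The class name `ReplicationChain` and the node names are hypothetical.

```python
class ReplicationChain:
    """Illustrative replication chain: the first active node holds the master."""

    def __init__(self, nodes):
        if len(nodes) < 2:
            raise ValueError("a chain needs a master and at least one slave")
        self.active = list(nodes)   # chain order; index 0 is the master
        self.offline = set()        # chain members currently out of service

    @property
    def master(self):
        # Master is the initial container in the chain, or None if it is empty.
        return self.active[0] if self.active else None

    @property
    def slaves(self):
        return self.active[1:]

    def take_out_of_service(self, node):
        # Assumed rule: remove the node; later nodes move up the chain.
        if node in self.active:
            self.active.remove(node)
            self.offline.add(node)

    def return_to_service(self, node):
        # Assumed rule: a returning node rejoins at the end of the chain.
        if node in self.offline:
            self.offline.remove(node)
            self.active.append(node)


chain = ReplicationChain(["node-a", "node-b", "node-c"])
chain.take_out_of_service("node-a")
after_failure = (chain.master, list(chain.slaves))
chain.return_to_service("node-a")
after_return = (chain.master, list(chain.slaves))
print(after_failure, after_return)
```

Placing a returning node at the tail is only one possible policy; it reflects the intuition that a node that was offline may hold stale data and should not take precedence until it has caught up.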
Specification