Map-Reduce Ready Distributed File System
First Claim
1. A map-reduce compatible distributed file system, comprising:
a container location database (CLDB) configured to maintain information about where each of a plurality of containers is located;
a plurality of cluster nodes, each cluster node containing one or more storage pools, each storage pool containing zero or more containers; and
a plurality of inodes for structuring data within said containers;
wherein said containers are replicated to other cluster nodes with one container designated as master for each replication chain;
wherein data in the CLDB is itself stored as inodes in well known containers;
wherein said CLDB nodes are configured to maintain a database that contains at least the following information about all of said containers:
nodes that have replicas of a container;
an ordering of a replication chain for each container;
wherein updates to a container are sent to a master for said updated container; and
wherein changes to content of a container are propagated to the replicas of the container by said master.
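The relationships recited in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the names Container, ClusterNode, CLDB, Cluster, create_container and update are hypothetical, each node is given a single storage pool, a plain dict stands in for inodes, and the CLDB is held in memory rather than stored as inodes in well known containers.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    container_id: int
    # Hypothetical stand-in for the inodes that structure data in a container.
    data: dict = field(default_factory=dict)


@dataclass
class ClusterNode:
    name: str
    # Assumption: one storage pool per node, mapping container id -> replica.
    storage_pool: dict = field(default_factory=dict)


class CLDB:
    """Tracks which nodes hold each container and the replication chain order."""

    def __init__(self):
        # container id -> ordered list of node names; the first entry is the master.
        self.chains = {}

    def register(self, container_id, node_names):
        self.chains[container_id] = list(node_names)

    def master(self, container_id):
        return self.chains[container_id][0]

    def replicas(self, container_id):
        return self.chains[container_id][1:]


class Cluster:
    def __init__(self, node_names):
        self.cldb = CLDB()
        self.nodes = {name: ClusterNode(name) for name in node_names}

    def create_container(self, container_id, chain):
        # Place an empty replica on every node in the chain, then record the chain.
        for name in chain:
            self.nodes[name].storage_pool[container_id] = Container(container_id)
        self.cldb.register(container_id, chain)

    def update(self, container_id, key, value):
        # Updates are sent to the master for the container ...
        master = self.cldb.master(container_id)
        self.nodes[master].storage_pool[container_id].data[key] = value
        # ... and the change is then propagated to the replicas in chain order.
        for name in self.cldb.replicas(container_id):
            self.nodes[name].storage_pool[container_id].data[key] = value
        return master


cluster = Cluster(["node-a", "node-b", "node-c"])
cluster.create_container(2049, ["node-b", "node-a", "node-c"])
cluster.update(2049, "inode-17", b"hello")
```

In this sketch a writer never chooses a replica directly: it asks the CLDB for the master and lets the write flow down the recorded chain, which is what gives the replicas a single, agreed order of updates.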
Abstract
A map-reduce compatible distributed file system, consisting of successive component layers that each provide the basis on which the next layer is built, provides transactional read-write-update semantics with file chunk replication and huge file-create rates. Containers provide the fundamental basis for data replication, relocation, and transactional updates. A container location database allows containers to be found among all file servers and defines precedence among replicas of containers to organize transactional updates of container contents. Volumes facilitate control of data placement, creation of snapshots and mirrors, and retention of a variety of control and policy information. Also addressed are the use of distributed transactions in a map-reduce system; the use of local and distributed snapshots; replication, including techniques for reconciling the divergence of replicated data after a crash; and mirroring.
35 Claims
Claims 2-35 depend on claim 1, set out under First Claim above.
Specification