Map-reduce ready distributed file system
First Claim
1. A map-reduce compatible distributed file system, comprising:
- a plurality of containers in which each container stores file and directory meta-data as well as file content data;
- wherein references to file content data are stored on a subset of nodes on which container meta-data and data are stored; and
- wherein container data and meta-data are arranged to allow a topological sort to imply update order;
- a container location database (CLDB) configured to maintain information about where each of said plurality of containers is located;
- a plurality of cluster nodes, each cluster node containing one or more storage pools, each storage pool containing zero or more containers; and
- a plurality of inodes for structuring data within said containers;
- wherein said CLDB is configured to assign nodes as replicas of data in a container to meet policy constraints in accordance with any of the following:
- said CLDB assigns each container a master node that controls all transactions for that container;
- said CLDB designates a chain of nodes to hold replicas;
- when one of the replicas goes down or is separated from the master CLDB node, it is removed from the replication chain;
- when the master goes down or is separated, a new master is designated;
- any node that comes back after having been removed from the replication chain is reinserted at the end of the replication chain when the chain still needs another replica when the node returns;
- when the node returns within a first predetermined interval, no new node to replicate the container in question has been designated, and the chain still needs another replica; and
- when the node has been gone for a second, longer predetermined interval, the CLDB may designate some other node to take a place in the chain.
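The failover rules above can be sketched as a small chain-management class. This is a minimal illustration only, not the patented implementation: the class name, method names, and the two timeout values are assumptions introduced for the sketch; the claim specifies only that a first and a second, longer predetermined interval exist.

```python
class ReplicationChain:
    """Sketch of the CLDB replication-chain rules in claim 1.
    Names and timeout values are illustrative assumptions."""

    def __init__(self, nodes, short_timeout=5.0, long_timeout=60.0):
        self.chain = list(nodes)   # chain[0] is the container master
        self.short_timeout = short_timeout
        self.long_timeout = long_timeout
        self.removed_at = {}       # node -> time it was removed

    def master(self):
        # The head of the chain controls all transactions for the container.
        return self.chain[0]

    def node_failed(self, node, now):
        # A failed or partitioned replica is removed from the chain; if it
        # was the master, the next node in the chain becomes the new master.
        if node in self.chain:
            self.chain.remove(node)
            self.removed_at[node] = now

    def node_returned(self, node, now, desired_replicas):
        # A returning node is reinserted at the end of the chain only if it
        # came back within the first interval and the chain still needs
        # another replica (no replacement has been designated meanwhile).
        gone_for = now - self.removed_at.pop(node, now)
        if gone_for <= self.short_timeout and len(self.chain) < desired_replicas:
            self.chain.append(node)
            return True
        return False

    def maybe_designate_replacement(self, node, spare, now, desired_replicas):
        # After the second, longer interval the CLDB may give the failed
        # node's place in the chain to some other node.
        gone_for = now - self.removed_at.get(node, now)
        if gone_for > self.long_timeout and len(self.chain) < desired_replicas:
            self.chain.append(spare)
            return True
        return False
```

For example, when the master fails, the next replica in the chain takes over immediately, while the failed node is either reinserted at the tail on a quick return or replaced after the longer interval elapses.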
7 Assignments
0 Petitions
Abstract
A map-reduce compatible distributed file system, built from successive component layers in which each layer provides the basis for the next, provides transactional read-write-update semantics with file-chunk replication and very high file-create rates. A primitive storage layer (storage pools) knits together raw block stores and provides a storage mechanism for containers and transaction logs. Storage pools are manipulated by individual file servers. Containers provide the fundamental basis for data replication, relocation, and transactional updates. A container location database allows containers to be found among all file servers and defines precedence among replicas of containers to organize transactional updates of container contents. Volumes facilitate control of data placement, creation of snapshots and mirrors, and retention of a variety of control and policy information. Key-value stores relate keys to data for purposes such as directories, container location maps, and offset maps in compressed files.
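The container location database's role described above can be illustrated with a minimal sketch. All class and method names here are hypothetical and introduced only for illustration; the sketch assumes the head of each replica list is the replica with update precedence (the master), as the abstract's notion of precedence among replicas suggests.

```python
class ContainerLocationDB:
    """Illustrative sketch of a container location map: container id ->
    ordered list of file servers holding replicas. The list head is
    treated as the replica with precedence for transactional updates."""

    def __init__(self):
        self._locations = {}

    def register(self, container_id, replicas):
        # Record which file servers hold replicas of this container.
        self._locations[container_id] = list(replicas)

    def lookup(self, container_id):
        # All known replica locations, in precedence order.
        return self._locations.get(container_id, [])

    def master(self, container_id):
        # The replica that controls transactional updates, if any.
        replicas = self.lookup(container_id)
        return replicas[0] if replicas else None

cldb = ContainerLocationDB()
cldb.register(2049, ["fs-a", "fs-b", "fs-c"])
```

A client resolving a file chunk would first look up the chunk's container here, then contact the master replica for updates or any replica for reads.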
19 Claims
Specification