Fault tolerance in a distributed file system
Abstract
A method for providing fault tolerance in a distributed file system of a service provider may include launching at least one data storage node on at least a first virtual machine instance (VMI) running on one or more servers of the service provider and storing file data. At least one data management node may be launched on at least a second VMI running on the one or more servers of the service provider. The at least second VMI may be associated with a dedicated IP address and the at least one data management node may store metadata information associated with the file data in a network storage attached to the at least second VMI. Upon detecting a failure of the at least second VMI, the at least one data management node may be re-launched on at least a third VMI running on the one or more servers.
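The recovery flow in the abstract can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the class names (NetworkVolume, Instance, ManagementNodeSupervisor), the instance ID format, and the health flag are all hypothetical stand-ins, and no real cloud API is called. The point it shows is that the metadata lives on network storage that outlives the failed instance, so a replacement instance only needs the storage attached and the dedicated IP address reassigned.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class NetworkVolume:
    """Hypothetical network storage that outlives any single instance."""
    metadata: dict = field(default_factory=dict)
    attached_to: Optional[str] = None


@dataclass
class Instance:
    """Hypothetical stand-in for a virtual machine instance (VMI)."""
    instance_id: str
    healthy: bool = True


class ManagementNodeSupervisor:
    """Keeps one data management node running (illustrative only)."""

    def __init__(self, dedicated_ip: str):
        self.dedicated_ip = dedicated_ip
        self.volume = NetworkVolume()
        self.ip_owner: Optional[str] = None
        self._launched = 0
        self.manager = self._launch()

    def _launch(self) -> Instance:
        # Start a fresh instance, then attach the existing volume and
        # point the dedicated IP address at it.
        self._launched += 1
        instance = Instance(f"vmi-{self._launched}")
        self.volume.attached_to = instance.instance_id
        self.ip_owner = instance.instance_id
        return instance

    def check_and_recover(self) -> bool:
        """Relaunch the management node if it failed; return True if relaunched."""
        if self.manager.healthy:
            return False
        self.manager = self._launch()
        return True
```

In this sketch, marking `manager.healthy = False` and calling `check_and_recover()` yields a new instance ID while `volume.metadata` is untouched, which mirrors the abstract's re-launch of the data management node on a further VMI.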
16 Claims
1. A computer-readable storage having instructions thereon, which if performed by a computer, cause the computer to at least:
replace one or more database nodes, in a distributed file system, with a new database node using metadata stored in a network storage in response to detecting one or more failures in the one or more database nodes, wherein the new database node is a data management node within the distributed file system including a plurality of data storage nodes, and wherein the metadata is associated with file data on the plurality of data storage nodes; and
attach the network storage to the data management node so that the data management node can access the metadata.
Dependent claims: 2, 3
4. A method, comprising:
detecting a failure on a database node amongst a plurality of database nodes;
replacing the failed database node with a new database node using metadata stored in a network storage;
wherein the metadata comprises:
information identifying one or more file blocks of a file; and
information identifying how many times the one or more file blocks are replicated.
Dependent claims: 5, 6, 7, 8, 9, 10, 11, 12
13. A system, comprising:
a plurality of host server computers coupled together through a network to form a distributed file system, wherein files are divided into blocks with multiple copies of blocks stored across multiple nodes; and
a fault tolerance server operable to:
replace one or more database nodes with a new database node using metadata stored in a network storage within the distributed file system in response to detecting one or more failures in the one or more database nodes;
wherein the metadata comprises:
information identifying one or more file blocks of the files; and
information identifying how many times the one or more file blocks are replicated.
Dependent claims: 14, 15, 16
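The metadata recited in claims 4 and 13 (which blocks make up a file, and how many times each block is replicated) could be represented roughly as follows. This is a sketch under stated assumptions: the record names (BlockRecord, FileRecord), the field names, and the choice of JSON as the on-storage format are all hypothetical and are not taken from the claims.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class BlockRecord:
    """Identifies one file block and how many copies of it exist."""
    block_id: str
    replication: int


@dataclass(frozen=True)
class FileRecord:
    """Maps a file path to the blocks it is divided into."""
    path: str
    blocks: Tuple[BlockRecord, ...]


def dump_metadata(records: List[FileRecord]) -> str:
    """Serialize metadata as it might be written to network storage."""
    return json.dumps([asdict(record) for record in records])


def load_metadata(text: str) -> List[FileRecord]:
    """Rebuild metadata, as a replacement node might do after attaching storage."""
    return [
        FileRecord(
            path=entry["path"],
            blocks=tuple(BlockRecord(**block) for block in entry["blocks"]),
        )
        for entry in json.loads(text)
    ]


# Example: one file split into two blocks, each replicated three times.
records = [
    FileRecord(
        path="/data/events.log",
        blocks=(BlockRecord("blk-001", 3), BlockRecord("blk-002", 3)),
    )
]
restored = load_metadata(dump_metadata(records))
```

Because the records round-trip through a plain serialized form, a new database node that reads the stored text recovers the same block list and replication counts the failed node held.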
Specification