DATA STORAGE ARCHITECTURE AND SYSTEM FOR HIGH PERFORMANCE COMPUTING
Abstract
Data storage systems and methods for storing data are described herein. The storage system may be integrated with or coupled with a compute cluster or super computer having multiple computing nodes. A plurality of nonvolatile memory units may be included with computing nodes, coupled with computing nodes or coupled with input/output nodes. The input/output nodes may be included with the compute cluster or super computer, or coupled thereto. The nonvolatile memory units store data items provided by the computing nodes, and the input/output nodes maintain where the data items are stored in the nonvolatile memory units via a hash table distributed among the input/output nodes. The use of a distributed hash table allows for quick access to data items stored in the nonvolatile memory units even as the computing nodes are writing large amounts of data to the storage system quickly in bursts.
18 Claims
1. A data storage method comprising:
an I/O node receiving a storage request from a computing node of a super computer;
the I/O node computing a hash on metadata for a data item referenced in the storage request to obtain a nonvolatile memory (NVM) location for the data item;
the I/O node creating an entry for the data item with the NVM location in its portion of a distributed hash table;
the I/O node determining if the data item was stored in an expected storage location based on the hash of the metadata of the data item;
if the data item was not stored in the expected storage location,
  the I/O node communicating the NVM storage location of the data item to the expected I/O node,
  the expected I/O node creating an entry for the data item in its portion of the distributed hash table,
  the expected I/O node initiating sending the data item to primary storage,
  the I/O node removing the entry for the data item from its portion of the distributed hash table, and
  the expected I/O node updating its entry for the data item in its portion of the distributed hash table signifying that the data item has been moved to primary storage;
if the data item was stored in the expected storage location,
  the expected I/O node initiating sending the data item to primary storage,
  the I/O node updating its portion of the distributed hash table signifying that the data item is available in primary storage.
Dependent claims: 2, 3.
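
The control flow of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the names `IONode`, `Entry`, `store` and `metadata_hash`, the use of SHA-256, the modulo rule for picking the expected I/O node, and the in-memory set standing in for primary storage are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Set


def metadata_hash(metadata: str) -> int:
    """Hash the item's metadata to a 64-bit key (SHA-256 is an assumption)."""
    digest = hashlib.sha256(metadata.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


@dataclass
class Entry:
    nvm_location: str         # where the item sits in NVM
    in_primary: bool = False  # set once the item has reached primary storage


@dataclass
class IONode:
    node_id: int
    # This node's portion of the distributed hash table: key -> Entry.
    table: Dict[int, Entry] = field(default_factory=dict)


def store(receiver: IONode, nodes: List[IONode], metadata: str,
          primary: Set[int]) -> IONode:
    """Handle one storage request at `receiver`; return the node owning the entry."""
    key = metadata_hash(metadata)
    # Hypothetical location format; the claim only says the hash yields a location.
    nvm_location = f"nvm-{receiver.node_id}:{key % 1024}"
    receiver.table[key] = Entry(nvm_location)

    # Assumed rule: the expected I/O node is selected by hash modulo node count.
    expected = nodes[key % len(nodes)]
    if expected is not receiver:
        # Not stored where expected: hand the location to the expected node.
        expected.table[key] = Entry(nvm_location)
        primary.add(key)                       # expected node initiates the transfer
        del receiver.table[key]                # receiver drops its entry
        expected.table[key].in_primary = True  # expected node records the move
    else:
        # Stored where expected: transfer and mark availability in place.
        primary.add(key)
        receiver.table[key].in_primary = True
    return expected
```

In this sketch, whichever I/O node receives the request, exactly one entry for the item remains afterwards, held by the expected node and flagged as present in primary storage.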
4. A compute cluster comprising:
a plurality of computing nodes coupled with a high speed interconnect and coupled with a local interconnect;
a plurality of nonvolatile memory units coupled with the local interconnect;
a plurality of input/output nodes coupled with the high speed interconnect, each input/output node including a portion of a distributed hash table to maintain a data item location of data items provided by the computing nodes indexed according to a hash on metadata for the data items.
Dependent claims: 5, 6, 7, 8, 9, 10.
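
To make the idea of "a portion of a distributed hash table" per input/output node concrete, the following sketch partitions location records across I/O nodes by a hash of the metadata. The class `InputOutputNode`, the functions `owner_of` and `record_location`, the SHA-256 hash, the modulo partitioning, and the `unit@offset` location string are hypothetical choices; the claim does not prescribe any of them.

```python
import hashlib
from typing import Dict, List


def metadata_hash(metadata: str) -> int:
    """Hash metadata to a 64-bit integer key (SHA-256 is an assumption)."""
    digest = hashlib.sha256(metadata.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class InputOutputNode:
    """An I/O node holding one portion of the distributed hash table."""

    def __init__(self, node_id: int) -> None:
        self.node_id = node_id
        # key (metadata hash) -> data item location in a nonvolatile memory unit
        self.dht_portion: Dict[int, str] = {}


def owner_of(key: int, io_nodes: List[InputOutputNode]) -> InputOutputNode:
    """Assumed partitioning rule: hash modulo the number of I/O nodes."""
    return io_nodes[key % len(io_nodes)]


def record_location(io_nodes: List[InputOutputNode], metadata: str,
                    nvm_unit: str, offset: int) -> InputOutputNode:
    """Record where a data item lives, in the table portion that owns its key."""
    key = metadata_hash(metadata)
    owner = owner_of(key, io_nodes)
    owner.dht_portion[key] = f"{nvm_unit}@{offset}"
    return owner
```

Because every node applies the same rule, any node can work out which peer holds the entry for a given item without a central directory.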
11. A data retrieval method comprising:
a computing node of a plurality of computing nodes performing a hash on metadata for a data item;
the computing node sending a read request for the data item to an input/output node (I/O node), the read request including the hash result;
the I/O node receiving the read request for the data item from the computing node;
the I/O node looking for an entry for the data item in its portion of a distributed hash table based on the hash result;
the I/O node obtaining the data item from a non-volatile memory location or from a primary storage;
the I/O node providing the data item to the computing node.
Dependent claims: 12, 13, 14, 15, 16, 17, 18.
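
The read path of claim 11 can be sketched in the same spirit. Everything named here is an assumption: `IONode`, `read`, `compute_node_read`, the SHA-256 hash, and the plain dictionaries standing in for NVM and primary storage. In particular, falling back to primary storage when no table entry or NVM copy is found is one plausible reading of "from a non-volatile memory location or from a primary storage", not a rule stated in the claim.

```python
import hashlib
from typing import Dict, Optional


def metadata_hash(metadata: str) -> int:
    """Hash computed on the computing node (SHA-256 is an assumption)."""
    digest = hashlib.sha256(metadata.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class IONode:
    def __init__(self, nvm: Dict[str, bytes], primary: Dict[int, bytes]) -> None:
        self.table: Dict[int, str] = {}  # hash result -> NVM slot name
        self.nvm = nvm                   # stand-in for nonvolatile memory
        self.primary = primary           # stand-in for primary storage

    def read(self, hash_result: int) -> Optional[bytes]:
        """Look up the entry by hash, then fetch from NVM or primary storage."""
        slot = self.table.get(hash_result)
        if slot is not None and slot in self.nvm:
            return self.nvm[slot]
        # Assumed fallback: the item has already been moved to primary storage.
        return self.primary.get(hash_result)


def compute_node_read(metadata: str, io_node: IONode) -> Optional[bytes]:
    """The computing node hashes the metadata and sends the result with the request."""
    return io_node.read(metadata_hash(metadata))
```

Note that the computing node, not the I/O node, performs the hash in this claim, so the I/O node only needs the hash result to consult its table portion.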
Specification