Distributed data storage
Abstract
The present invention relates to a distributed data storage system comprising a plurality of storage nodes. Using unicast and multicast transmission, a server application may read and write data in the storage system. Each storage node may monitor reading and writing operations on the system as well as the status of other storage nodes. In this way, the storage nodes may detect a need for replication of files on the system, and may carry out a replication process that serves to maintain a storage of a sufficient number of copies of files with correct versions at geographically diverse locations.
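As an illustration of how a storage node might decide that replication is needed, the following minimal sketch checks whether enough up-to-date copies exist at enough distinct locations. The `Copy` record, the `site` label, and the thresholds `min_copies` and `min_sites` are assumptions made for this example; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    node: str     # hypothetical node identifier
    site: str     # hypothetical label for the node's geographic location
    version: int  # version of the file held by this node


def needs_replication(copies, min_copies=3, min_sites=2):
    """Return True if too few current copies exist or they span too few sites.

    The thresholds are illustrative assumptions, not values from the patent.
    """
    latest = max((c.version for c in copies), default=0)
    # Only copies holding the latest version count as correct.
    current = [c for c in copies if c.version == latest]
    sites = {c.site for c in current}
    return len(current) < min_copies or len(sites) < min_sites


copies = [
    Copy("n1", "stockholm", 2),
    Copy("n2", "london", 2),
    Copy("n3", "tokyo", 1),  # stale copy
]
print(needs_replication(copies))
```

Here the stale copy on `n3` is ignored, so only two current copies remain and the check reports that replication is needed.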
16 Claims
1. A method for writing data to a data storage system, the method being employed in a server that accesses data in the data storage system via a communication network, the method comprising:
the server sending a multicast storage query to a plurality of data storage nodes of the data storage system, the multicast storage query indicating the server is requesting identification of data storage nodes that can store a data file;
the server receiving a plurality of responses from a subset of the plurality of data storage nodes, each response comprising geographic data relating to a geographic position of the data storage node that sent the response;
the server selecting at least two data storage nodes in the subset based on the geographic data included in the responses, wherein the selection ensures that there is at least a requisite level of geographical diversity between the at least two data storage nodes;
the server sending a data file and a data identifier that corresponds to the data file to the at least two selected data storage nodes, wherein the server sends a host list for the data file that indicates which data storage nodes are being configured to store the data file;
the server receiving acknowledgement information from one or more of the at least two selected data storage nodes, wherein the server determines from the acknowledgement information that a first data storage node of the at least two selected data storage nodes successfully stored the data file and that at least a second data storage node of the at least two selected data storage nodes did not successfully store the data file; and
the server sending an indication that a data replication procedure should be initiated based on determining that at least the second data storage node of the at least two selected data storage nodes did not successfully store the data file, wherein the indication is sent to at least the first data storage node in order to trigger at least the first data storage node that did successfully store the data file to perform the data replication procedure for the at least one file within the data storage system such that the data file is replicated to at least a third data storage node in the data storage system.
Dependent claims: 2, 3, 4, 5, 6, 7
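The selection and acknowledgement steps of claim 1 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `Response` record, the use of latitude and longitude as the geographic data, the haversine distance, and the 500 km threshold standing in for the "requisite level of geographical diversity" are all hypothetical. Network transport (the multicast query and the unicast file transfer) is left out.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Response:
    node: str   # hypothetical node identifier
    lat: float  # assumed form of the geographic data, in degrees
    lon: float


def distance_km(a, b):
    """Great-circle distance between two responders (haversine formula)."""
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dphi = p2 - p1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def select_diverse(responses, count=2, min_km=500.0):
    """Return the first group of `count` responders that are pairwise >= min_km apart."""
    for group in combinations(responses, count):
        if all(distance_km(a, b) >= min_km for a, b in combinations(group, 2)):
            return list(group)
    return []  # no combination meets the assumed diversity requirement


def replication_triggers(selected, acked):
    """Nodes to notify: those that stored the file, but only if some node failed."""
    stored = [r.node for r in selected if r.node in acked]
    failed = [r.node for r in selected if r.node not in acked]
    return stored if failed else []


responses = [
    Response("n1", 59.33, 18.07),  # Stockholm
    Response("n2", 59.86, 17.64),  # Uppsala, too close to n1
    Response("n3", 51.51, -0.13),  # London
]
chosen = select_diverse(responses)
host_list = [r.node for r in chosen]  # sent along with the file and its identifier
notify = replication_triggers(chosen, acked={"n1"})
```

In this run only `n1` acknowledges the write, so `n1` is the node that would receive the indication to start replication toward a third node.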
8. A server adapted for writing data to a data storage system comprising a plurality of data storage nodes, the server comprising a computer readable data storage medium comprising instructions that when executed result in at least:
the server sending a multicast storage query to a plurality of data storage nodes of the data storage system, the multicast storage query indicating the server is requesting identification of data storage nodes that can store a data file;
the server receiving a plurality of responses from a subset of the plurality of data storage nodes, each response comprising geographic data relating to a geographic position of the data storage node that sent the response;
the server selecting at least two data storage nodes in the subset based on the geographic data included in the responses, wherein the selection ensures that there is at least a requisite level of geographical diversity between the at least two data storage nodes;
the server sending a data file and a data identifier that corresponds to the data file to the at least two selected data storage nodes, wherein the server sends a host list for the data file that indicates which data storage nodes are being configured to store the data file;
the server receiving acknowledgement information from one or more of the at least two selected data storage nodes, wherein the server determines from the acknowledgement information that a first data storage node of the at least two selected data storage nodes successfully stored the data file and that at least a second data storage node of the at least two selected data storage nodes did not successfully store the data file;
the server sending an indication that a data replication procedure should be initiated based on determining that at least the second data storage node of the at least two selected data storage nodes did not successfully store the data file, wherein the indication is sent to at least the first data storage node in order to trigger at least the first data storage node that did successfully store the data file to perform the data replication procedure for the at least one file within the data storage system such that the data file is replicated to at least a third data storage node in the data storage system.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A method for a device to update data in a data storage system, the device corresponding to a server accessing the data storage system or a data storage node in the data storage system, the method comprising:
the device sending a multicast storage query to a plurality of data storage nodes in the data storage system, wherein the multicast storage query includes a data identifier associated with a data file, and the multicast storage query indicates that the device is requesting a response for data storage nodes that stored a copy of the data file associated with the data identifier;
the device receiving responses to the multicast storage query from a subset of the plurality of data storage nodes, wherein each of the responses comprise a respective host list for the data file, the received host lists indicating data storage nodes in the data storage system that store copies of the data file;
the device sending a message comprising an updated version of the data file to the subset of data storage nodes that responded to the multicast storage query;
the device receiving acknowledgement information from one or more storage nodes in the subset, wherein the device determines from the acknowledgement information that at least a first data storage node in the subset successfully updated the data file and that at least a second data storage node in the subset did not successfully update the data file; and
the device sending an indication that a data replication procedure should be initiated based on determining that at least the second data storage node did not successfully update the data file, wherein the indication is sent to at least the first data storage node in order to trigger at least the first data storage node that did successfully update the file to perform the data replication procedure for the at least one file within the data storage system such that the data file is replicated to at least a third data storage node in the data storage system.
Dependent claims: 16
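The bookkeeping behind the update method of claim 15 can be sketched in the same spirit. The function name `plan_update`, the dictionary layout of the responses, and the returned keys are hypothetical choices for this example. The sketch covers only the decision logic after responses and acknowledgements arrive, not the messaging itself.

```python
def plan_update(host_lists, acked):
    """Decide which nodes to notify after an update attempt.

    host_lists: dict mapping each responding node to the host list it returned.
    acked: set of nodes that confirmed the update succeeded.
    Both shapes are assumptions made for this sketch.
    """
    responders = set(host_lists)
    # Merge the host lists reported by the responders into one view.
    known_hosts = sorted(set().union(*host_lists.values()))
    updated = sorted(responders & set(acked))
    failed = sorted(responders - set(acked))
    # Only nodes that hold the new version are asked to start replication,
    # and only when at least one responder failed to update.
    notify = updated if failed else []
    return {
        "known_hosts": known_hosts,
        "updated": updated,
        "failed": failed,
        "notify": notify,
    }


host_lists = {
    "n1": ["n1", "n2", "n3"],
    "n2": ["n1", "n2", "n3"],
}
plan = plan_update(host_lists, acked={"n1"})
```

With `n2` failing to acknowledge, `plan["notify"]` contains `n1`, which holds the updated file and would be asked to replicate it to another node.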
Specification