Array-based distributed storage system with parity
First Claim
1. An array-based distributed storage system, comprising:
a plurality of clients that each include a communication interface,
a plurality of storage servers that each include a communication interface,
a computer network interconnecting the plurality of clients and the plurality of storage servers through their respective communication interfaces;
storage for storing a map defining, for each write request for a data block, which of the plurality of storage servers is a data storage server and which of the plurality of storage servers is a parity server;
wherein a client determines, for a write request for a particular data block, the data storage server for the particular data block in accordance with the map and transmits the write request for the particular data block to the determined data storage server; and
wherein each storage server comprises:
selection logic operative to enable data storage logic and relaying logic if the selection logic determines, in accordance with the map, that a particular received data block is to be stored on the storage server and operative to enable parity logic if the selection logic determines, in accordance with the map, that the particular received data block is to be used to generate a parity block to be stored on the storage server;
wherein the data storage logic is operative to store the particular received data block at the storage server in response to a determination by the selection logic that the particular received data block is to be stored on the storage server, and
wherein the parity logic is operative to generate and store on the storage server a parity block using the particular received data block in response to a determination by the selection logic that the particular received data block is to be used to generate a parity block to be stored on the storage server; and
wherein the relaying logic is operative to relay a copy of the particular received data block to the parity server for the particular received data block in accordance with the map in response to a determination by the selection logic that the particular received data block is to be stored on the storage server.
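The selection logic recited in claim 1 can be pictured as a lookup that decides which of a server's logic units to enable for a received block. The following is a minimal sketch under stated assumptions, not the patented implementation: the map layout (block id to a pair of data server id and parity server id), the names `Logic`, `BLOCK_MAP`, and `select_logic`, and the server ids are all hypothetical and chosen only for illustration.

```python
from enum import Flag, auto


class Logic(Flag):
    """Logic units a storage server can enable for a received block."""
    NONE = 0
    STORE = auto()
    RELAY = auto()
    PARITY = auto()


# Hypothetical map layout (an assumption, not taken from the claims):
# block id -> (data storage server id, parity server id)
BLOCK_MAP = {
    "b0": ("s0", "s2"),
    "b1": ("s1", "s2"),
}


def select_logic(server_id, block_id, block_map=BLOCK_MAP):
    """Return the logic units this server enables for the given block."""
    data_server, parity_server = block_map[block_id]
    if server_id == data_server:
        # The block is to be stored here, so storing and relaying are enabled.
        return Logic.STORE | Logic.RELAY
    if server_id == parity_server:
        # The block is only used here to generate a parity block.
        return Logic.PARITY
    return Logic.NONE
```

In this sketch a server that is neither the data storage server nor the parity server for a block enables nothing; the claim itself does not address that case.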
Abstract
In one general aspect, a data access method is disclosed that includes directing data block write requests from different clients to different data storage servers based on a map. Data blocks referenced in the data block write requests are stored in the data storage servers. Data from the data write requests are also relayed to a parity server, and parity information is derived and stored for the blocks. This method can reduce the need for inter-server communication, and can be scaled across an arbitrary number of servers. It can also employ parity load distribution to improve the performance of file transfers.
15 Claims
1. An array-based distributed storage system, comprising:
a plurality of clients that each include a communication interface,
a plurality of storage servers that each include a communication interface,
a computer network interconnecting the plurality of clients and the plurality of storage servers through their respective communication interfaces;
storage for storing a map defining, for each write request for a data block, which of the plurality of storage servers is a data storage server and which of the plurality of storage servers is a parity server;
wherein a client determines, for a write request for a particular data block, the data storage server for the particular data block in accordance with the map and transmits the write request for the particular data block to the determined data storage server; and
wherein each storage server comprises:
selection logic operative to enable data storage logic and relaying logic if the selection logic determines, in accordance with the map, that a particular received data block is to be stored on the storage server and operative to enable parity logic if the selection logic determines, in accordance with the map, that the particular received data block is to be used to generate a parity block to be stored on the storage server;
wherein the data storage logic is operative to store the particular received data block at the storage server in response to a determination by the selection logic that the particular received data block is to be stored on the storage server, and
wherein the parity logic is operative to generate and store on the storage server a parity block using the particular received data block in response to a determination by the selection logic that the particular received data block is to be used to generate a parity block to be stored on the storage server; and
wherein the relaying logic is operative to relay a copy of the particular received data block to the parity server for the particular received data block in accordance with the map in response to a determination by the selection logic that the particular received data block is to be stored on the storage server.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. An array-based distributed storage system, comprising:
a plurality of clients that each include a communication interface,
a plurality of storage servers that each include a communication interface,
a computer network interconnecting the plurality of clients and the plurality of storage servers through their respective communication interfaces;
storage for storing a map defining, for each data block in a group of data blocks, which of the plurality of storage servers is a data storage server for the data block and, for the group of data blocks, which of the plurality of storage servers is a parity server for parity data for the group of data blocks, wherein each of the plurality of storage servers acts as a data storage server and as a parity server for different groups of data blocks;
wherein a client, when storing a particular data block on the plurality of storage servers, determines the data storage server for the particular data block in accordance with the map and transmits the particular data block to the determined data storage server; and
wherein each storage server receives data blocks from the clients through the computer network and comprises:
selection logic that enables data storage logic on the storage server if the selection logic determines, according to the map, that the storage server is the data storage server for storing the particular received data block, and wherein the selection logic enables parity logic on the storage server if the selection logic determines, according to the map, that the storage server is the parity server for storing parity data for the group of data blocks including the particular received data block;
wherein the data storage logic, when enabled by the selection logic, stores the particular received data block on the storage server and relays a copy of the particular received data block to the parity server for the group of data blocks that includes the particular received data block in accordance with the map; and
wherein the parity logic, when enabled by the selection logic, generates parity data using the particular received data block and stores the parity data on the storage server.
Dependent claims: 13
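Claim 12 requires a map in which each storage server acts as a data storage server for some groups and as a parity server for others. One way such a map might be built is to rotate the parity role across servers from group to group. The sketch below assumes, purely for illustration, that each group has one data block per non-parity server and that servers are numbered from 0; the function name `build_map` and the dictionary layout are hypothetical and do not come from the claims.

```python
def build_map(num_servers, num_groups):
    """Build a hypothetical group map with a rotating parity server.

    Returns {group: {"parity": server, "data": [server, ...]}}, where
    the i-th entry of "data" is the data storage server for the i-th
    data block of that group.
    """
    if num_servers < 2:
        raise ValueError("need at least one data server and one parity server")
    layout = {}
    for group in range(num_groups):
        # Rotate the parity role so no single server holds all parity.
        parity = group % num_servers
        data = [s for s in range(num_servers) if s != parity]
        layout[group] = {"parity": parity, "data": data}
    return layout


layout = build_map(num_servers=4, num_groups=4)
parity_servers = {entry["parity"] for entry in layout.values()}
data_servers = {s for entry in layout.values() for s in entry["data"]}
```

With four servers and four groups, every server appears once as a parity server and three times as a data storage server, which spreads parity work evenly in this simplified layout.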
14. In an array-based distributed storage system, comprising a plurality of clients that each include a communication interface, a plurality of storage servers that each include a communication interface, and a computer network interconnecting the plurality of clients and the plurality of storage servers through their respective communication interfaces, and storage for storing a map defining, for each data block in a group of data blocks, which of the plurality of storage servers is a data storage server for the data block and, for the group of data blocks, which of the plurality of storage servers is a parity server for parity data for the group of data blocks, wherein each of the plurality of storage servers acts as a data storage server and as a parity server for different groups of data blocks, wherein a storage server comprises data storage logic and parity logic, a process for storing a group of data blocks comprising a parity group, the process comprising:
a client, when storing the parity group on the storage servers, determining the storage server for each data block in accordance with the map;
the client transmitting each data block to the determined storage server for the data block;
the storage servers receiving data blocks through the computer network;
each storage server determining, for a received data block and according to the map, an action to be performed by the storage server for the particular received data block;
each storage server, when determining that the received data block is to be stored, enabling the data storage logic of the storage server to store the data block on the storage server and relaying a copy of the data block to the parity server assigned to the group of data blocks by the map; and
the storage server, when determining that a received data block is one of the received data blocks from the other storage servers to be used in parity calculation, enabling the parity logic of the storage server to compute and store a parity block for the group of data blocks.
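The steps of the process in claim 14 can be traced in a small in-memory simulation. This is a sketch under several assumptions that the claims do not state: parity is computed as a bytewise XOR, all blocks in a group have equal length, each block is written only once, and the "network" is a direct method call between Python objects. The names `StorageServer`, `client_store_group`, and `group_map` are hypothetical.

```python
def xor_blocks(a, b):
    """Bytewise XOR of two equal-length blocks (assumed parity function)."""
    return bytes(x ^ y for x, y in zip(a, b))


class StorageServer:
    def __init__(self, server_id, group_map, servers):
        self.server_id = server_id
        self.group_map = group_map  # shared copy of the map
        self.servers = servers      # stand-in for the network: id -> server
        self.data = {}              # (group, index) -> stored data block
        self.parity = {}            # group -> accumulated parity block

    def receive(self, group, index, block):
        """Decide, according to the map, what to do with a received block."""
        entry = self.group_map[group]
        if entry["data"][index] == self.server_id:
            # Data storage server: store the block, then relay a copy
            # to the parity server assigned to the group.
            self.data[(group, index)] = block
            self.servers[entry["parity"]].receive(group, index, block)
        elif entry["parity"] == self.server_id:
            # Parity server: fold the relayed block into the parity block.
            current = self.parity.get(group, bytes(len(block)))
            self.parity[group] = xor_blocks(current, block)


def client_store_group(group, blocks, group_map, servers):
    """Client side: route each block to its data storage server."""
    for index, block in enumerate(blocks):
        target = group_map[group]["data"][index]
        servers[target].receive(group, index, block)


# Hypothetical map: three servers, two groups, parity role differs per group.
group_map = {
    0: {"data": [0, 1], "parity": 2},
    1: {"data": [1, 2], "parity": 0},
}
servers = {}
for sid in range(3):
    servers[sid] = StorageServer(sid, group_map, servers)

client_store_group(0, [b"\x0f\x0f", b"\xf0\x01"], group_map, servers)
```

After the call, servers 0 and 1 each hold one data block of group 0 and server 2 holds the XOR of the two. The client never contacts the parity server directly; the only inter-server traffic is the single relay per block.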
15. An array-based distributed storage system, comprising:
a plurality of clients that each include a communication interface,
a plurality of storage servers that each include a communication interface,
a computer network interconnecting the plurality of clients and the plurality of storage servers through their respective communication interfaces;
storage for storing a map defining, for each data block of a data file, a first storage server, from among the plurality of storage servers, which stores the data block and a second storage server, from among the plurality of storage servers, which stores parity data derived using the data block;
wherein one of the plurality of clients, when storing a data file, determines, for each write request for each data block in the data file, the first storage server for the data block in accordance with the map and transmits the write request for the data block to the determined first storage server for the data block; and
wherein each storage server comprises:
an input that receives data blocks from clients and other storage servers;
selection logic;
wherein the selection logic enables data storage logic and relaying logic if the selection logic determines that the storage server is the first storage server for the received data block,
wherein the data storage logic, when enabled by the selection logic, stores the received data block on the storage server; and
wherein the relaying logic, when enabled by the selection logic, relays a copy of the received data block to the second storage server for the received data block;
wherein the selection logic enables parity logic if the selection logic determines that the storage server is the second storage server for the received data block;
wherein the parity logic, when enabled by the selection logic, generates and stores on the storage server a parity block using the received data block.
Specification