Computer system and process for transferring multiple high bandwidth streams of data between multiple storage units and multiple applications in a scalable and reliable manner
First Claim
1. A data storage system, comprising:
a plurality of client computers, each client computer having a file system through which applications executed on the client computer access data;
a plurality of storage servers coupled to the plurality of client computers via a computer network, each storage server comprising storage configured to store segments of the data accessible through the file systems of the plurality of client computers; and
catalog storage configured to store information indicating storage servers from among the plurality of storage servers on which the segments of the data are stored;
at least one of the client computers being configured to:
receive information indicating storage servers, among the plurality of storage servers, available for storing segments of the data;
the file system of the at least one of the client computers being configured to, in response to a request from an application to write data:
divide the data to be written into a plurality of segments, each segment having an identifier;
for each segment of the plurality of segments, select, using the received information, two of the available storage servers for storing the segment, such that each segment is stored on at least two different storage servers, and the segments are distributed nonsequentially among the plurality of storage servers; and
communicate over the computer network directly with the selected storage servers to transmit the segments of the data to the selected storage servers for storage; and
each storage server being configured to:
receive segments over the computer network from client computers that request segments to be stored;
after receiving a request over the computer network from one of the client computers to store a segment, wherein the received request includes the identifier of the segment, determine a location for the segment in the storage of the storage server; and
store data in the storage of the storage server representing, for each segment stored on the storage server, an association between the identifier for the segment with the location of the segment in the storage of the storage server; and
the catalog manager being further configured to update the information indicating storage servers from among the plurality of storage servers on which the segments of the data are stored based on successful writing of the segments of the data to the plurality of storage servers.
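The write path recited in claim 1 can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the patented implementation: the names `StorageServer`, `write_file`, and `SEGMENT_SIZE`, the use of a (file name, index) tuple as the segment identifier, the in-memory dictionary standing in for the catalog, and the use of uniform random sampling to obtain a nonsequential distribution are all hypothetical choices made for the example.

```python
import random
from dataclasses import dataclass, field

SEGMENT_SIZE = 4  # bytes; hypothetical and tiny so the example is easy to follow


@dataclass
class StorageServer:
    """Hypothetical in-memory stand-in for one storage server."""
    name: str
    blocks: list = field(default_factory=list)   # the server's own storage
    index: dict = field(default_factory=dict)    # segment identifier -> location

    def store(self, segment_id, payload):
        # The server, not the client, determines the location of the segment
        # and records the identifier-to-location association.
        location = len(self.blocks)
        self.blocks.append(payload)
        self.index[segment_id] = location
        return True

    def read(self, segment_id):
        return self.blocks[self.index[segment_id]]


def write_file(file_name, data, available, catalog, rng=random):
    """Divide data into segments and store each on two different servers."""
    for n, offset in enumerate(range(0, len(data), SEGMENT_SIZE)):
        segment_id = (file_name, n)               # assumed identifier format
        payload = data[offset:offset + SEGMENT_SIZE]
        # Sampling two distinct servers at random is one way (an assumption
        # here) to distribute segments nonsequentially.
        chosen = rng.sample(available, 2)
        if all(server.store(segment_id, payload) for server in chosen):
            # The catalog is updated only after both writes have succeeded.
            catalog[segment_id] = [server.name for server in chosen]
    return catalog
```

In this sketch the client talks to the chosen servers directly, and the catalog records only which servers hold each segment; where a segment sits inside a server is known only to that server's own index.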
Abstract
Multiple applications request data from multiple storage units over a computer network. The data is divided into segments and each segment is distributed randomly on one of several storage units, independent of the storage units on which other segments of the media data are stored. At least one additional copy of each segment also is distributed randomly over the storage units, such that each segment is stored on at least two storage units. This random distribution of multiple copies of segments of data improves both scalability and reliability. When an application requests a selected segment of data, the request is processed by the storage unit with the shortest queue of requests. Random fluctuations in the load applied by multiple applications on multiple storage units are balanced nearly equally over all of the storage units. This combination of techniques results in a system which can transfer multiple, independent high-bandwidth streams of data in a scalable manner in both directions between multiple applications and multiple storage units.
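The read-side rule in the abstract, where a request for a segment goes to the storage unit with the shortest queue, can be sketched as follows. This is a minimal illustration, assuming each storage unit exposes its pending requests as a queue and that the catalog maps a segment to the names of the units holding its copies; `pick_replica`, `request_segment`, and the data layout are hypothetical names introduced for the example.

```python
from collections import deque


def pick_replica(replicas, queues):
    """Return the name of the replica whose pending-request queue is shortest."""
    # On a tie, min() keeps the first candidate in the replica list.
    return min(replicas, key=lambda name: len(queues[name]))


def request_segment(segment_id, catalog, queues):
    """Enqueue a read for segment_id on the least-loaded unit holding a copy."""
    target = pick_replica(catalog[segment_id], queues)
    queues[target].append(segment_id)
    return target
```

Because every segment has at least two copies placed independently, each request has a choice of units, which is what lets short-term load fluctuations even out across the storage units.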
4 Claims
1. A data storage system (full text given under First Claim above). Dependent claims: 2, 3, 4.
Specification