Computer system and process for transferring multiple high bandwidth streams of data between multiple storage units and multiple applications in a scalable and reliable manner
First Claim
1. A data storage system, comprising:
a plurality of client systems, each client system having a file system through which applications executed on the client system access data stored in files of the file system;
a plurality of independent storage servers, each server storing data from files of the file system;
a computer network interconnecting the plurality of independent storage servers and the plurality of client systems;
wherein each file stored on the plurality of independent storage servers is divided into segments, with two or more copies of each segment being distributed among the plurality of independent storage servers, such that each segment is stored on at least two of the storage servers;
storage maintaining information indicating the storage servers on which the segments of files are stored;
at least one of the client systems being configured to:
access, before reading data from a file, the storage to obtain the information indicating the storage servers on which the segments of the file are stored, and
communicate directly with the storage servers to request the segments of the file using the accessed information;
wherein, after writing data to a file, the information indicating the storage servers on which segments of the file are stored is updated in the storage; and
wherein, if a storage server is unavailable, segments of files stored on the unavailable storage server are copied from other storage servers that store the segments of files to other available storage servers.
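
The read path and the recovery step recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the `Cluster` class, its method names, and the in-memory dictionaries standing in for storage servers and for the "storage maintaining information" are all hypothetical, chosen only to make the claimed steps concrete.

```python
class Cluster:
    """Toy model: in-memory 'servers' plus a segment-location map (assumed names)."""

    def __init__(self, server_names):
        self.servers = {name: {} for name in server_names}  # server -> {segment_id: data}
        self.available = set(server_names)
        self.segment_map = {}  # segment_id -> list of servers holding a copy

    def write(self, seg_id, data, holders):
        for name in holders:
            self.servers[name][seg_id] = data
        # The location information is updated after the write.
        self.segment_map[seg_id] = list(holders)

    def read(self, seg_id):
        # The client consults the map first, then asks a holder directly.
        for name in self.segment_map[seg_id]:
            if name in self.available:
                return self.servers[name][seg_id]
        raise IOError(f"no available copy of {seg_id}")

    def handle_failure(self, failed):
        """Copy segments of the failed server from surviving holders to other servers."""
        self.available.discard(failed)
        for seg_id, holders in list(self.segment_map.items()):
            if failed not in holders:
                continue
            survivors = [h for h in holders if h in self.available]
            candidates = sorted(self.available - set(survivors))
            if not survivors or not candidates:
                continue  # nothing to copy from, or nowhere to copy to
            target = candidates[0]
            self.servers[target][seg_id] = self.servers[survivors[0]][seg_id]
            self.segment_map[seg_id] = survivors + [target]


cluster = Cluster(["s1", "s2", "s3"])
cluster.write("file.dat:0", b"segment-0", ["s1", "s2"])
cluster.handle_failure("s1")
data = cluster.read("file.dat:0")
```

After `handle_failure("s1")`, the segment is again held by two available servers, and a read still succeeds without contacting the failed server.
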
Abstract
Multiple applications request data from multiple storage units over a computer network. The data is divided into segments and each segment is distributed randomly on one of several storage units, independent of the storage units on which other segments of the media data are stored. At least one additional copy of each segment also is distributed randomly over the storage units, such that each segment is stored on at least two storage units. When an application requests a selected segment of data, the request is processed by the storage unit with the shortest queue of requests. Random fluctuations in the load applied by multiple applications on multiple storage units are balanced nearly equally over all storage units. These techniques result in a system which can transfer multiple, independent high-bandwidth streams of data in a scalable and reliable manner in both directions between multiple applications and multiple storage units.
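
The two ideas in the abstract, random placement of each segment on at least two storage units and serving each request from the unit with the shortest queue, can be sketched as follows. This is an illustrative model only: the function names, the fixed seed, and the use of a simple counter as the "queue" are assumptions, and queues are never drained, so the counters only approximate load.

```python
import random


def place_segments(num_segments, units, copies=2, rng=None):
    """Pick `copies` distinct units at random for each segment, independently."""
    rng = rng or random.Random()
    return {seg: rng.sample(units, copies) for seg in range(num_segments)}


def pick_unit(holders, queue_len):
    """Choose the holder with the shortest request queue."""
    return min(holders, key=lambda u: queue_len[u])


def read_all(placement, units):
    """Issue one request per segment, each to the least-loaded holder."""
    queue_len = {u: 0 for u in units}
    for holders in placement.values():
        queue_len[pick_unit(holders, queue_len)] += 1
    return queue_len


units = ["A", "B", "C", "D"]
placement = place_segments(1000, units, rng=random.Random(0))
load = read_all(placement, units)
```

Because every request can choose between two randomly selected units, the resulting per-unit counts in `load` stay close to one another, which is the balancing effect the abstract describes.
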
23 Claims
1. A data storage system, comprising:
a plurality of client systems, each client system having a file system through which applications executed on the client system access data stored in files of the file system;
a plurality of independent storage servers, each server storing data from files of the file system;
a computer network interconnecting the plurality of independent storage servers and the plurality of client systems;
wherein each file stored on the plurality of independent storage servers is divided into segments, with two or more copies of each segment being distributed among the plurality of independent storage servers, such that each segment is stored on at least two of the storage servers;
storage maintaining information indicating the storage servers on which the segments of files are stored;
at least one of the client systems being configured to:
access, before reading data from a file, the storage to obtain the information indicating the storage servers on which the segments of the file are stored, and
communicate directly with the storage servers to request the segments of the file using the accessed information;
wherein, after writing data to a file, the information indicating the storage servers on which segments of the file are stored is updated in the storage; and
wherein, if a storage server is unavailable, segments of files stored on the unavailable storage server are copied from other storage servers that store the segments of files to other available storage servers.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A data storage system, comprising:
a plurality of client systems, each client system having a file system through which applications executed on the client system access data stored in files of the file system;
a plurality of independent storage servers, each server storing data from files of the file system;
a computer network interconnecting the plurality of independent storage servers and the plurality of client systems;
wherein each file stored on the plurality of independent storage servers is divided into segments, with two or more copies of each segment being distributed among the plurality of independent storage servers, such that each segment is stored on at least two of the storage servers;
storage maintaining information indicating the storage servers on which the segments of files are stored;
at least one of the client systems being configured to:
access, before reading data from a file, the storage to obtain the information indicating the storage servers on which the segments of the file are stored, and
communicate directly with the storage servers to request the segments of the file using the accessed information;
wherein, after writing data to a file, the information indicating the storage servers on which segments of the file are stored is updated in the storage; and
wherein, after an additional storage server becomes available in the plurality of independent storage servers, each file that is newly stored on the plurality of independent storage servers is divided into segments, with two or more copies of each segment being distributed among the plurality of independent storage servers including the additional storage server, such that each segment is stored on at least two of the storage servers.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20
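
The final limitation of claim 12 concerns files stored after a server is added. A minimal sketch, assuming random placement as in the abstract and using a hypothetical `store_file` helper and file names, shows newly stored files being spread over the enlarged pool while earlier placements are left as they were:

```python
import random


def store_file(name, num_segments, servers, segment_map, rng, copies=2):
    """Record `copies` distinct, randomly chosen servers for each segment of a file."""
    for i in range(num_segments):
        segment_map[(name, i)] = rng.sample(servers, copies)


rng = random.Random(1)       # fixed seed only so the example is repeatable
servers = ["s1", "s2", "s3"]
segment_map = {}

store_file("old.dat", 50, servers, segment_map, rng)

servers.append("s4")         # an additional storage server becomes available
store_file("new.dat", 50, servers, segment_map, rng)
```

In this sketch only `new.dat` can have segments on `s4`; whether and how existing files are redistributed is outside what the example models.
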
21. A data storage system, comprising:
a plurality of client systems, each client system having a file system through which applications executed on the client system access data stored in files of the file system;
a plurality of independent storage servers, each server storing data from files of the file system;
a computer network interconnecting the plurality of independent storage servers and the plurality of client systems;
wherein each file stored on the plurality of independent storage servers is divided into segments, with two or more copies of each segment being distributed among the plurality of independent storage servers, such that each segment is stored on at least two of the storage servers;
storage maintaining information indicating the storage servers on which the segments of files are stored;
at least one of the client systems being configured to:
access, before reading data from a file, the storage to obtain the information indicating the storage servers on which the segments of the file are stored, and
communicate directly with the storage servers to request the segments of the file using the accessed information;
wherein, after writing data to a file, the information indicating the storage servers on which segments of the file are stored is updated in the storage; and
a storage management system configured to cause an additional copy of segments of a file to be distributed among the plurality of independent storage servers and to update the information indicating the storage servers on which segments of the file are stored.
Dependent claims: 22
23. A data storage system, comprising:
a plurality of client systems, each client system having a file system through which applications executed on the client system access data stored in files of the file system;
a plurality of independent storage servers, each server storing data from files of the file system;
a computer network interconnecting the plurality of independent storage servers and the plurality of client systems;
wherein each file stored on the plurality of independent storage servers is divided into segments, with two or more copies of each segment being distributed among the plurality of independent storage servers, such that each segment is stored on at least two of the storage servers, wherein the copy of each segment is processed separately and asynchronously from the copies of the other segments;
storage maintaining information indicating the storage servers on which the segments of files are stored;
at least one of the client systems being configured to:
access, before reading data from a file, the storage to obtain the information indicating the storage servers on which the segments of the file are stored, and
communicate directly with the storage servers to request the segments of the file using the accessed information;
wherein, after writing data to a file, the information indicating the storage servers on which segments of the file are stored is updated in the storage.
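
Claim 23 adds that each segment copy is processed separately and asynchronously. One way to picture this, offered only as an assumption-laden sketch, is to submit every (segment, server) copy as its own task to a thread pool and update the location map as each task finishes. The `write_copy` and `write_file` names, and the dictionaries standing in for servers, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def write_copy(store, server, seg_id, data):
    """Write one copy of one segment; independent of every other copy."""
    store[server][seg_id] = data
    return server, seg_id


def write_file(segments, placement, store):
    """Submit each copy as a separate task and record locations as copies complete."""
    segment_map = {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [
            pool.submit(write_copy, store, server, seg_id, data)
            for seg_id, data in segments.items()
            for server in placement[seg_id]
        ]
        for fut in as_completed(futures):
            server, seg_id = fut.result()
            segment_map.setdefault(seg_id, []).append(server)
    return segment_map


store = {"s1": {}, "s2": {}, "s3": {}}
segments = {0: b"aaa", 1: b"bbb"}
placement = {0: ["s1", "s2"], 1: ["s2", "s3"]}
segment_map = write_file(segments, placement, store)
```

The order in which copies complete is not fixed, so the per-segment server lists in `segment_map` may appear in any order.
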
Specification