Computer system and process for transferring multiple high bandwidth streams of data between multiple storage units and multiple applications in a scalable and reliable manner
First Claim
1. A computer implemented process for managing data storage in a system comprising a plurality of clients, each client having a file system through which applications executed on the client systems access data stored in files of the file system, the system further comprising a plurality of independent storage servers, each operating independently of the clients and without central control, and a computer network interconnecting the plurality of independent storage servers and the plurality of clients wherein data of each file stored on the plurality of independent storage servers is divided into segments, with two or more copies of each segment being distributed among the plurality of independent storage servers, such that each segment is stored on at least two of the storage servers, comprising:
- maintaining information for each file indicating the storage servers on which the segments of data of the file are stored;
- before reading data from a file, accessing, using the client, the information for the file indicating the storage servers on which the segments of the file are stored, wherein the client uses said information to communicate directly with the storage servers to request the segments of the file; and
- after writing data to a file, accessing the storage to update the information for the file indicating the storage servers on which the segments of the file are stored.
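The three steps of the claim can be illustrated with a minimal in-memory sketch. Everything named here is hypothetical and not taken from the patent: the `Catalog` class stands in for the per-file information about segment locations, `StorageServer` for an independent storage server, and the `placement` callback for whatever rule chooses the servers that hold each segment. It is an illustration under those assumptions, not the claimed implementation.

```python
class StorageServer:
    """Hypothetical stand-in for one independent storage server."""

    def __init__(self, name):
        self.name = name
        self.segments = {}  # (file_name, segment_index) -> bytes

    def put(self, key, data):
        self.segments[key] = data

    def get(self, key):
        return self.segments[key]


class Catalog:
    """Hypothetical per-file record of which servers hold each segment."""

    def __init__(self):
        self.locations = {}  # file_name -> list of server-name lists, one per segment

    def lookup(self, file_name):
        return self.locations[file_name]

    def update(self, file_name, segment_locations):
        self.locations[file_name] = segment_locations


def write_file(catalog, servers, file_name, data, segment_size, placement):
    """Split data into segments, store each on every server chosen by
    `placement`, then update the catalog after the write."""
    locations = []
    for index, start in enumerate(range(0, len(data), segment_size)):
        segment = data[start:start + segment_size]
        holders = list(placement(index))
        for name in holders:
            servers[name].put((file_name, index), segment)
        locations.append(holders)
    catalog.update(file_name, locations)


def read_file(catalog, servers, file_name):
    """Consult the catalog before reading, then request each segment
    directly from one of the servers that holds it."""
    locations = catalog.lookup(file_name)
    return b"".join(
        servers[holders[0]].get((file_name, index))
        for index, holders in enumerate(locations)
    )
```

In this sketch the reader always asks the first listed holder; any server listed for a segment would serve equally well, since each segment is stored on at least two servers.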
Abstract
Multiple applications request data from multiple storage units over a computer network. The data is divided into segments and each segment is distributed randomly on one of several storage units, independent of the storage units on which other segments of the media data are stored. At least one additional copy of each segment also is distributed randomly over the storage units, such that each segment is stored on at least two storage units. This random distribution of multiple copies of segments of data improves both scalability and reliability. When an application requests a selected segment of data, the request is processed by the storage unit with the shortest queue of requests. Random fluctuations in the load applied by multiple applications on multiple storage units are balanced nearly equally over all of the storage units. This combination of techniques results in a system which can transfer multiple, independent high-bandwidth streams of data in a scalable manner in both directions between multiple applications and multiple storage units.
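The two techniques named in the abstract, random placement of two copies of each segment and serving each request from the holder with the shortest queue, can be sketched as follows. The function names, the `queue_len` dictionary, and the simplification that queues only grow (no request is ever completed) are assumptions made for illustration; the abstract does not specify any of them.

```python
import random


def place_segments(num_segments, units, copies=2, rng=None):
    """Assign each segment to `copies` distinct storage units chosen at
    random, independently of where the other segments are placed."""
    rng = rng or random.Random()
    return [rng.sample(units, copies) for _ in range(num_segments)]


def pick_unit(holders, queue_len):
    """Among the units holding a segment, choose the one whose request
    queue is currently shortest."""
    return min(holders, key=lambda unit: queue_len[unit])


def request_all(placement, units):
    """Request every segment once and return the resulting queue length
    per unit (simplified: queued requests are never drained)."""
    queue_len = {unit: 0 for unit in units}
    for holders in placement:
        queue_len[pick_unit(holders, queue_len)] += 1
    return queue_len
```

Calling `request_all(place_segments(1000, ["A", "B", "C", "D"]), ["A", "B", "C", "D"])` returns the number of requests each unit received, which gives a rough way to observe how evenly the load spreads across the units.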
39 Claims
1. A computer implemented process for managing data storage in a system comprising a plurality of clients, each client having a file system through which applications executed on the client systems access data stored in files of the file system, the system further comprising a plurality of independent storage servers, each operating independently of the clients and without central control, and a computer network interconnecting the plurality of independent storage servers and the plurality of clients wherein data of each file stored on the plurality of independent storage servers is divided into segments, with two or more copies of each segment being distributed among the plurality of independent storage servers, such that each segment is stored on at least two of the storage servers, comprising:
- maintaining information for each file indicating the storage servers on which the segments of data of the file are stored;
- before reading data from a file, accessing, using the client, the information for the file indicating the storage servers on which the segments of the file are stored, wherein the client uses said information to communicate directly with the storage servers to request the segments of the file; and
- after writing data to a file, accessing the storage to update the information for the file indicating the storage servers on which the segments of the file are stored.
Dependent claims: 2 to 19 and 32 to 39
20. A data storage system, comprising:
- a plurality of client systems, each having a file system through which applications executed on the client systems access data stored in files of the file system;
- a plurality of independent storage servers, each operating independently of the client systems and without central control;
- a computer network interconnecting the plurality of independent storage servers and the plurality of client systems;
- wherein data of each file stored on the plurality of independent storage servers is divided into segments, with two or more copies of each segment being distributed among the plurality of independent storage servers, such that each segment is stored on at least two of the storage servers;
- storage maintaining information for each file indicating the storage servers on which the segments of data of the file are stored;
- wherein, before reading data from a file, a client system accesses the storage to obtain the information for the file indicating the storage servers on which the segments of the file are stored, and the client system uses said information to communicate directly with the storage servers to request the segments of the file; and
- wherein, after writing data to a file, the storage is accessed to update the information for the file indicating the storage servers on which the segments of the file are stored.
Dependent claims: 21 to 31
Specification