Computer system and process for transferring multiple high bandwidth streams of data between multiple storage units and multiple applications in a scalable and reliable manner
Abstract
Multiple applications request data from multiple storage units over a computer network. The data is divided into segments and each segment is distributed randomly on one of several storage units, independent of the storage units on which other segments of the media data are stored. Redundancy information corresponding to each segment also is distributed randomly over the storage units. The redundancy information for a segment may be a copy of the segment, such that each segment is stored on at least two storage units. The redundancy information also may be based on two or more segments. This random distribution of segments of data and corresponding redundancy information improves both scalability and reliability. When a storage unit fails, its load is distributed evenly over the remaining storage units and its lost data may be recovered because of the redundancy information. When an application requests a selected segment of data, the request may be processed by the storage unit with the shortest queue of requests. Random fluctuations in the load applied by multiple applications on multiple storage units are balanced nearly equally over all of the storage units. This combination of techniques results in a system which can transfer multiple, independent high-bandwidth streams of data in a scalable manner in both directions between multiple applications and multiple storage units.
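The placement and selection ideas in the abstract can be illustrated with a minimal Python sketch. It assumes the redundancy information is a full copy of each segment and that each storage unit exposes a queue length. The names `place_segments`, `pick_unit`, and `queue_len` are hypothetical and do not come from the patent.

```python
import random

def place_segments(num_segments, units, rng):
    """Choose, independently for each segment, two distinct storage units
    at random: one for the segment and one for its copy."""
    placement = []
    for _ in range(num_segments):
        primary, copy = rng.sample(units, 2)  # two different units
        placement.append((primary, copy))
    return placement

def pick_unit(holders, queue_len):
    """Among the units holding a segment, pick the one with the shortest
    queue of pending requests (queue_len is an assumed bookkeeping dict)."""
    return min(holders, key=lambda unit: queue_len[unit])

def holders_after_failure(placement, failed):
    """Units that can still serve each segment once `failed` is lost."""
    return [[u for u in pair if u != failed] for pair in placement]

rng = random.Random(0)                      # seeded only for repeatability
units = ["A", "B", "C", "D"]
placement = place_segments(6, units, rng)
queue_len = {"A": 3, "B": 0, "C": 5, "D": 1}
chosen = [pick_unit(pair, queue_len) for pair in placement]
survivors = holders_after_failure(placement, "A")
```

Because every segment and its copy land on different units, each entry of `survivors` still names at least one unit, which is the property the abstract relies on for recovery after a failure.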
7 Claims
1. A file system for a computer, enabling the computer to access remote independent storage units over a computer network in response to a request, from an application executed on the computer, to read data stored in a file on the storage units, wherein a file includes segments of the data and corresponding redundancy information for each segment, and wherein, for each file, each segment of the data is stored on a randomly or pseudorandomly selected one of the storage units, and wherein, for each segment of the data, the corresponding redundancy information is stored on a randomly or pseudorandomly selected one of the storage units, the file system comprising:
means, responsive to the request to read data from a file, for selecting, for each segment of the requested data, one of the storage units on which data representing the segment is stored;
means for reading each segment of the requested data from the selected storage unit for the segment;
means for serializing the segments read from the selected storage units; and
means for providing the serialized data to the application;
wherein the means for selecting includes:
in the file system:
means for requesting data from one of the storage units, indicating an estimated time;
means for requesting data from another of the storage units, indicating an estimated time, if the first storage unit rejects the request; and
means for requesting the data from the first storage unit if the second storage unit rejects the request; and
in each storage unit:
means for rejecting a request for data if the request cannot be serviced by the storage unit within the estimated time; and
means for accepting a request for data if the request can be serviced by the storage unit within the estimated time.
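The request, reject, and fallback sequence recited in claim 1 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `StorageUnit`, its `backlog` field (seconds of queued work), and the helper names are hypothetical. The final request to the first unit is modeled as carrying no deadline, since the claim does not attach an estimated time to it.

```python
from dataclasses import dataclass, field

@dataclass
class StorageUnit:
    name: str
    backlog: float                                 # assumed: seconds of queued work
    segments: dict = field(default_factory=dict)   # segment id -> bytes

    def request(self, seg_id, estimated_time=None):
        """Return the segment if the request is accepted, None if rejected.
        A request with no estimated time is always accepted."""
        if estimated_time is not None and self.backlog > estimated_time:
            return None                            # cannot service in time
        return self.segments[seg_id]

def read_segment(seg_id, first, second, estimated_time):
    """Try the first unit, then the second, then fall back to the first."""
    data = first.request(seg_id, estimated_time)
    if data is not None:
        return first.name, data
    data = second.request(seg_id, estimated_time)
    if data is not None:
        return second.name, data
    return first.name, first.request(seg_id)       # both rejected

def read_file(seg_ids, holders, estimated_time):
    """Read the segments in order and serialize them into one byte string."""
    parts = [read_segment(s, *holders[s], estimated_time)[1] for s in seg_ids]
    return b"".join(parts)

a = StorageUnit("A", backlog=0.5, segments={0: b"he", 1: b"ll"})
b = StorageUnit("B", backlog=0.1, segments={1: b"ll", 2: b"o!"})
c = StorageUnit("C", backlog=0.9, segments={0: b"he", 2: b"o!"})
holders = {0: (a, c), 1: (a, b), 2: (c, b)}
result = read_file([0, 1, 2], holders, estimated_time=0.2)
```

With an estimated time of 0.2, units A and C reject, so segment 0 falls back to A while segments 1 and 2 are served by B. The serialized result is the same whichever unit supplies each segment.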
2. A file system for a computer, enabling the computer to access remote independent storage units over a computer network in response to a request, from an application executed on the computer, to read data stored in a file on the storage units, wherein a file includes segments of the data and corresponding redundancy information for each segment, and wherein, for each file, each segment of the data is stored on a randomly or pseudorandomly selected one of the storage units, and wherein, for each segment of the data, the corresponding redundancy information is stored on a randomly or pseudorandomly selected one of the storage units, the file system comprising:
means, responsive to the request to read data from a file, for selecting, for each segment of the requested data, one of the storage units on which data representing the segment is stored;
means for reading each segment of the requested data from the selected storage unit for the segment;
means for serializing the segments read from the selected storage units; and
means for providing the serialized data to the application;
wherein the means for reading each segment comprises means for scheduling the transfer of the data from the selected storage unit such that the storage unit efficiently transfers data, and includes:
in the file system:
means for requesting transfer of the data from the selected storage unit, indicating a waiting time;
means for requesting the data from another storage unit if the selected storage unit rejects the request to transfer the data; and
in the storage unit:
means for rejecting a request to transfer data if the data is not available to be transferred from the storage unit by the indicated waiting time; and
means for transferring the data if the selected storage unit is able to transfer the data within the waiting time.
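The waiting-time check in claim 2 can be sketched from the storage unit's side with a simulated clock. The sketch assumes each unit knows when a segment will be buffered and ready to send. `Unit`, `ready_at`, and `transfer` are hypothetical names, and returning `None` when both units reject is a choice made for this sketch only, since the claim does not cover that case.

```python
class Unit:
    def __init__(self, name, ready_at, data):
        self.name = name
        self.ready_at = ready_at   # assumed: segment id -> time data is buffered
        self.data = data           # segment id -> bytes

    def request_transfer(self, seg_id, now, waiting_time):
        """Reject (None) if the data is not available by now + waiting_time;
        otherwise transfer it."""
        if self.ready_at[seg_id] > now + waiting_time:
            return None
        return self.data[seg_id]

def transfer(seg_id, selected, other, now, waiting_time):
    """Ask the selected unit first, then another unit if it rejects."""
    for unit in (selected, other):
        payload = unit.request_transfer(seg_id, now, waiting_time)
        if payload is not None:
            return unit.name, payload
    return None                    # both rejected (outside the claim's scope)

u1 = Unit("U1", ready_at={7: 12.0}, data={7: b"x"})
u2 = Unit("U2", ready_at={7: 10.5}, data={7: b"x"})
served_by = transfer(7, u1, u2, now=10.0, waiting_time=1.0)
```

Here U1 would have the segment ready only at time 12.0, past the deadline of 11.0, so it rejects and U2 performs the transfer.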
3. A file system for a computer, enabling the computer to access remote independent storage units over a computer network in response to a request, from an application executed on the computer, to read data stored on the storage units, wherein segments of the data and corresponding redundancy information are randomly or pseudorandomly distributed among the plurality of storage units, the file system comprising:
means, responsive to the request to read data, for selecting, for each segment of the requested data, one of the storage units on which data representing the segment is stored;
means for transferring the data from the selected storage unit if the selected storage unit has the data available within an indicated waiting time;
means for transferring the data from another storage unit if the selected storage unit does not have the data available within the indicated waiting time; and
means for providing the transferred data to the application;
wherein the means for selecting comprises:
means for selecting a first one of the storage units if the first storage unit can transfer the segment within a first estimated time, and for selecting a second one of the storage units if the second one of the storage units can transfer the segment within a second estimated time and if the first storage unit cannot transfer the segment within the first estimated time, and for selecting the first storage unit if the second storage unit cannot transfer the segment within the second estimated time.
5. A file system for a computer, enabling the computer to access remote independent storage units over a computer network in response to a request, from an application executed on the computer, to read data stored in a file on the storage units, wherein a file includes segments of the data and corresponding redundancy information for each segment, and wherein, for each file, each segment of the data is stored on a randomly or pseudorandomly selected one of the storage units, and wherein, for each segment of the data, the corresponding redundancy information is stored on a randomly or pseudorandomly selected one of the storage units, the file system comprising:
means, responsive to the request to read data from a file, for selecting, for each segment of the requested data, one of the storage units on which data representing the segment is stored;
means for reading each segment of the requested data from the selected storage unit for the segment;
means for serializing the segments read from the selected storage units; and
means for providing the serialized data to the application;
wherein the means for selecting comprises:
means for selecting a first one of the storage units if the first storage unit can transfer the segment within a first estimated time, and for selecting a second one of the storage units if the second one of the storage units can transfer the segment within a second estimated time and if the first storage unit cannot transfer the segment within the first estimated time, and for selecting the first storage unit if the second storage unit cannot transfer the segment within the second estimated time.
6. A file system for a computer, enabling the computer to access remote independent storage units over a computer network in response to a request, from an application executed on the computer, to read data stored on the storage units, wherein segments of the data and corresponding redundancy information are randomly or pseudorandomly distributed among the plurality of storage units, wherein the redundancy information corresponding to a segment is a copy of the segment, the file system comprising:
means for selecting for each segment of the requested data one of the storage units on which the segment is stored, by selecting a first one of the storage units if the first storage unit can transfer the segment within a first estimated time, and by selecting a second one of the storage units if the second one of the storage units can transfer the segment within a second estimated time and if the first storage unit cannot transfer the segment within the first estimated time, and selecting the first storage unit if the second storage unit cannot transfer the segment within the second estimated time;
means for reading each segment of the requested data from the selected storage unit for the segment; and
means for providing the data to the application when the data is received from the identified storage units.
Specification