Computer system and process for transferring multiple high bandwidth streams of data between multiple storage units and multiple applications in a scalable and reliable manner
Abstract
Multiple applications request data from multiple storage units over a computer network. The data is divided into segments and each segment is distributed randomly on one of several storage units, independent of the storage units on which other segments of the media data are stored. At least one additional copy of each segment also is distributed randomly over the storage units, such that each segment is stored on at least two storage units. This random distribution of multiple copies of segments of data improves both scalability and reliability. When an application requests a selected segment of data, the request is processed by the storage unit with the shortest queue of requests. Random fluctuations in the load applied by multiple applications on multiple storage units are balanced nearly equally over all of the storage units. This combination of techniques results in a system which can transfer multiple, independent high-bandwidth streams of data in a scalable manner in both directions between multiple applications and multiple storage units.
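The placement and dispatch policy summarized in the abstract can be illustrated with a minimal Python sketch. The function names (`place_segments`, `choose_unit`, `dispatch`), the integer queue length kept per storage unit, and the default of two copies per segment are illustrative assumptions and are not taken from the patent text.

```python
import random


def place_segments(segment_ids, unit_names, copies=2, rng=random):
    """Assign each segment to `copies` distinct storage units chosen at random.

    Each segment is placed independently of every other segment.
    """
    return {seg: rng.sample(unit_names, copies) for seg in segment_ids}


def choose_unit(seg_id, placement, queue_lengths):
    """Pick, among the units holding the segment, the one with the shortest queue."""
    return min(placement[seg_id], key=lambda unit: queue_lengths[unit])


def dispatch(requests, placement, unit_names):
    """Route each requested segment to its least-loaded holder.

    Queues only grow here (no request is ever completed), which is a
    simplification made to keep the sketch short.
    """
    queues = {unit: 0 for unit in unit_names}
    for seg_id in requests:
        queues[choose_unit(seg_id, placement, queues)] += 1
    return queues
```

Because every segment has at least two randomly chosen holders, a request can usually avoid a momentarily busy unit, which is the load-balancing effect the abstract describes.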
7 Claims
1. A distributed data storage system for allowing one or more client systems to access data, comprising:
- a plurality of independent storage units for storing the data;
wherein the data is stored on the plurality of storage units in files, wherein each file includes segments of data and redundancy information for each segment, wherein each segment has an identifier, and wherein the redundancy information for each segment includes at least one copy of the segment, and wherein, for each file, the segments and the redundancy information for each segment are distributed among the plurality of storage units;
wherein each storage unit comprises means for maintaining information associating the identifier of each segment stored on the storage unit with the location of each segment on the storage unit;
wherein the distributed data storage system includes means for maintaining information associating the identifier of each segment with indications of the storage units from the plurality of storage units on which each segment and the redundancy information for the segment is stored;
wherein the distributed data storage system includes means for identifying one of the storage units to be removed; and
wherein the distributed data storage system includes means, operative in response to an identification of one of the storage units to be removed, for redistributing data on the identified storage unit to other storage units, including
means for determining, for each segment of data stored on the identified storage unit, another storage unit on which the segment is stored;
means for sending, for each segment of data stored on the identified storage unit, a request to the other storage unit on which the segment is stored to send a copy of the segment to a different storage unit, wherein each request includes the identifier of the segment.
Dependent claims: 2
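The redistribution recited in claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `StorageUnit` class, the `remove_unit` function, the in-memory `blocks` list standing in for disk storage, and the random choice of the receiving unit are all hypothetical.

```python
import random


class StorageUnit:
    """A storage unit with its own table of segment id -> local location."""

    def __init__(self, name):
        self.name = name
        self.locations = {}  # segment id -> index into self.blocks
        self.blocks = []     # stand-in for on-disk storage

    def store(self, seg_id, data):
        self.locations[seg_id] = len(self.blocks)
        self.blocks.append(data)

    def read(self, seg_id):
        return self.blocks[self.locations[seg_id]]

    def handle_copy_request(self, seg_id, target):
        # The request names the segment only by identifier; this unit
        # resolves the local location itself and sends the copy onward.
        target.store(seg_id, self.read(seg_id))


def remove_unit(catalog, units, removed, rng=random):
    """Redistribute every segment held by `removed`, then drop the unit.

    catalog: segment id -> set of names of units holding the segment or a copy
    units:   unit name -> StorageUnit
    """
    for seg_id in list(units[removed].locations):
        holders = catalog[seg_id]
        sources = sorted(holders - {removed})   # other units holding the segment
        targets = sorted(set(units) - holders)  # units that do not hold it yet
        if not sources or not targets:
            raise RuntimeError(f"cannot re-replicate segment {seg_id!r}")
        source, target = sources[0], rng.choice(targets)
        units[source].handle_copy_request(seg_id, units[target])
        catalog[seg_id] = (holders - {removed}) | {target}
    del units[removed]
```

In this sketch the unit being removed never has to serve the data itself: each copy is pulled from a surviving holder, so the same routine would also apply if the identified unit had already become unavailable.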
3. A distributed data storage system for allowing one or more client systems to access data, comprising:
- a plurality of independent storage units for storing the data;
wherein the data is stored on the plurality of storage units in files, wherein each file includes segments of data and redundancy information for each segment, wherein each segment has an identifier, and wherein the redundancy information for each segment includes at least one copy of the segment, and wherein, for each file, the segments and the redundancy information for each segment are distributed among the plurality of storage units;
wherein each storage unit maintains information associating the identifier of each segment stored on the storage unit with the location of each segment on the storage unit;
wherein the distributed data storage system maintains information associating the identifier of each segment with indications of the storage units from the plurality of storage units on which each segment and the redundancy information for the segment is stored;
wherein the distributed data storage system includes means for redistributing data from an identified storage unit to other storage units, including
means for sending, for each segment of data stored on the identified storage unit, a request to store a copy of the segment to a selected one of the other storage units, wherein each request includes the identifier of the segment.
Dependent claims: 4, 5, 6, 7
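Claim 3 recites a store request, carrying the segment identifier, sent to a selected one of the other storage units. A minimal sketch of that variant is given below. The `StoreRequest` type, the `redistribute` function, the "fewest segments held" selection policy, and the removal of the local copy from the identified unit after the request is serviced are all assumptions made for illustration; the claim does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreRequest:
    segment_id: str  # identifier carried by every request
    data: bytes      # copy of the segment to be stored


def redistribute(identified, catalog, stores):
    """Move every segment on `identified` to a selected other unit.

    catalog: segment id -> set of unit names holding the segment or a copy
    stores:  unit name -> {segment id -> bytes}, a stand-in for each unit's
             own identifier-to-location table plus its disk
    Returns the list of (target unit, StoreRequest) pairs that were issued.
    """
    issued = []
    for seg_id, data in list(stores[identified].items()):
        # Never place two copies of one segment on the same unit.
        candidates = [u for u in stores if u not in catalog[seg_id]]
        if not candidates:
            raise RuntimeError(f"no unit available for segment {seg_id!r}")
        # Assumed policy: select the candidate currently holding the fewest segments.
        target = min(candidates, key=lambda u: len(stores[u]))
        request = StoreRequest(seg_id, data)
        stores[target][request.segment_id] = request.data  # target services the request
        catalog[seg_id] = (catalog[seg_id] - {identified}) | {target}
        del stores[identified][seg_id]
        issued.append((target, request))
    return issued
```

Keeping the identifier in the request lets the receiving unit record the segment in its own table without the sender knowing anything about the receiver's internal layout.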
Specification