Systems and methods for reliably storing data using liquid distributed storage
Abstract
Embodiments provide methodologies for reliably storing data within a storage system using liquid distributed storage control. Such liquid distributed storage control operates to compress repair bandwidth utilized within a storage system for data repair processing to the point of operating in a liquid regime. Liquid distributed storage control logic of embodiments may employ a lazy repair policy, repair bandwidth control, a large erasure code, and/or a repair queue. Embodiments of liquid distributed storage control logic may additionally or alternatively implement a data organization adapted to allow the repair policy to avoid handling large objects, instead streaming data into the storage nodes at a very fine granularity.
36 Citations
59 Claims
1. A method for repair of source data comprising one or more source objects stored as multiple fragments distributed across multiple storage nodes of a storage system, wherein one or more fragments of the multiple fragments includes redundant data for the one or more source objects, the method comprising:
determining that at least one fragment of the multiple fragments is missing from the storage system for a source object of the one or more source objects for which there is no corresponding object instance in a repair queue;
adding a corresponding object instance to the repair queue for the source object, wherein the repair queue includes object instances for a plurality of source objects having at least one fragment missing from the storage system; and
performing repair processing according to a lazy repair policy, wherein repair operation according to the lazy repair policy allows object instances to accumulate in the repair queue for performing repairs at an average repair rate, R, wherein the average repair rate, R, is selected such that the performing repairs at the average repair rate, R, results in processing source objects associated with the queued object instances to complete before a loss rate of fragments results in fewer than k fragments being available in the storage system for any source object of the one or more source objects, wherein k is a number of source fragments per source object.
Dependent claims: 2-28.
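The queueing behavior recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `SourceObject`, `LazyRepairQueue`, `report_loss`, `repair_next`, and `run_tick` are hypothetical, the FIFO ordering is an assumption, and the per-tick repair budget is only a stand-in for the average repair rate R, which the claim requires to be selected so that no object falls below k available fragments.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SourceObject:
    """Hypothetical record: n fragments stored in total, any k suffice to recover."""
    object_id: str
    k: int
    n: int
    available: set = field(default_factory=set)  # indices of fragments still stored

    def missing(self):
        return self.n - len(self.available)


class LazyRepairQueue:
    """FIFO repair queue holding at most one instance per source object."""

    def __init__(self):
        self._queue = deque()
        self._queued_ids = set()

    def report_loss(self, obj, fragment_index):
        """Record a lost fragment; enqueue the object only if it is not already queued."""
        obj.available.discard(fragment_index)
        if obj.missing() > 0 and obj.object_id not in self._queued_ids:
            self._queue.append(obj)
            self._queued_ids.add(obj.object_id)

    def repair_next(self):
        """Repair the object at the head of the queue, regenerating all missing fragments."""
        if not self._queue:
            return None
        obj = self._queue.popleft()
        self._queued_ids.remove(obj.object_id)
        if len(obj.available) < obj.k:
            # Fewer than k fragments remain, so the object can no longer be recovered.
            raise RuntimeError(f"{obj.object_id}: fewer than k fragments remain")
        obj.available = set(range(obj.n))
        return obj

    def run_tick(self, repairs_per_tick):
        """Repair up to `repairs_per_tick` queued objects (a stand-in for the rate R)."""
        repaired = []
        for _ in range(min(repairs_per_tick, len(self._queue))):
            repaired.append(self.repair_next())
        return repaired

    def __len__(self):
        return len(self._queue)
```

In this sketch an object that loses several fragments while waiting is still queued only once, so a single later repair regenerates all of its missing fragments together, which is the accumulation that a lazy policy permits.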
29. An apparatus for repair of source data comprising one or more source objects stored as multiple fragments distributed across multiple storage nodes of a storage system, wherein one or more fragments of the multiple fragments includes redundant data for the one or more source objects, the apparatus comprising:
one or more data processors; and
one or more non-transitory computer-readable storage media containing program code configured to cause the one or more data processors to perform operations including:
determining that at least one fragment of the multiple fragments is missing from the storage system for a source object of the one or more source objects for which there is no corresponding object instance in a repair queue;
adding a corresponding object instance to the repair queue for the source object, wherein the repair queue includes object instances for a plurality of source objects having at least one fragment missing from the storage system; and
performing repair processing according to a lazy repair policy, wherein repair operation according to the lazy repair policy allows object instances to accumulate in the repair queue for performing repairs at an average repair rate, R, wherein the average repair rate, R, is selected such that the performing repairs at the average repair rate, R, results in processing source objects associated with the queued object instances to complete before a loss rate of fragments results in fewer than k fragments being available in the storage system for any source object of the one or more source objects, wherein k is a number of source fragments per source object.
Dependent claims: 30-45.
46. An apparatus for repair of source data comprising one or more source objects stored as multiple fragments distributed across multiple storage nodes of a storage system, wherein one or more fragments of the multiple fragments includes redundant data for the one or more source objects, the apparatus comprising:
means for determining that at least one fragment of the multiple fragments is missing from the storage system for a source object of the one or more source objects for which there is no corresponding object instance in a repair queue;
means for adding a corresponding object instance to the repair queue for the source object, wherein the repair queue includes object instances for a plurality of source objects having at least one fragment missing from the storage system; and
means for performing repair processing according to a lazy repair policy, wherein repair operation according to the lazy repair policy allows object instances to accumulate in the repair queue for performing repairs at an average repair rate, R, wherein the average repair rate, R, is selected such that the performing repairs at the average repair rate, R, results in processing source objects associated with the queued object instances to complete before a loss rate of fragments results in fewer than k fragments being available in the storage system for any source object of the one or more source objects, wherein k is a number of source fragments per source object.
Dependent claims: 47-52.
53. A non-transitory computer-readable medium comprising codes for repair of source data comprising one or more source objects stored as multiple fragments distributed across multiple storage nodes of a storage system, wherein one or more fragments of the multiple fragments includes redundant data for the one or more source objects, the codes causing a computer to:
determine that at least one fragment of the multiple fragments is missing from the storage system for a source object of the one or more source objects for which there is no corresponding object instance in a repair queue;
add a corresponding object instance to the repair queue for the source object, wherein the repair queue includes object instances for a plurality of source objects having at least one fragment missing from the storage system; and
perform repair processing according to a lazy repair policy, wherein repair operation according to the lazy repair policy allows object instances to accumulate in the repair queue for performing repairs at an average repair rate, R, wherein the average repair rate, R, is selected such that the performing repairs at the average repair rate, R, results in processing source objects associated with the queued object instances to complete before a loss rate of fragments results in fewer than k fragments being available in the storage system for any source object of the one or more source objects, wherein k is a number of source fragments per source object.
Dependent claims: 54-59.
Specification