Adaptive Erasure Codes
First Claim
1. A non-transitory computer-readable storage medium storing computer executable instructions that when executed by a computer control the computer to perform a method for storing data in a data storage system, the method comprising:
accessing a message, where the message has a message size;
selecting an encoding strategy that includes an erasure code approach, where the encoding strategy is selected as a function of the message size, of failure statistics associated with one or more data storage devices in the data storage system, of wear periods associated with one or more data storage devices in the data storage system, of space constraints associated with one or more data storage devices in the data storage system, or of overhead constraints associated with one or more data storage devices in the data storage system;
generating an encoded message from the message using the encoding strategy;
generating an encoded block that includes the encoded message and metadata associated with the message; and
storing the encoded block in the data storage system.
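The steps of this claim can be sketched end to end in Python. This is a minimal illustration, not the patented implementation: a single XOR parity fragment stands in for a real erasure code, and the thresholds (4096 bytes, a 0.05 failure rate), the `Strategy` fields, the metadata keys, and the in-memory `devices` list are all hypothetical choices made for the example.

```python
import hashlib
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Strategy:
    name: str  # label for the erasure code approach (hypothetical)
    k: int     # number of data fragments; one XOR parity fragment is added


def select_strategy(message_size: int, annual_failure_rate: float) -> Strategy:
    """Pick k from message size and device failure statistics (toy thresholds)."""
    k = 2 if message_size < 4096 else 8
    if annual_failure_rate > 0.05:
        k = max(2, k // 2)  # fewer data fragments per parity means more redundancy
    return Strategy(name="xor-single-parity", k=k)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(message: bytes, strategy: Strategy) -> list[bytes]:
    """Split into k equal, zero-padded fragments and append one XOR parity fragment."""
    frag_len = max(1, -(-len(message) // strategy.k))  # ceiling division
    padded = message.ljust(frag_len * strategy.k, b"\0")
    frags = [padded[i * frag_len:(i + 1) * frag_len] for i in range(strategy.k)]
    return frags + [reduce(xor_bytes, frags)]


def build_block(message: bytes, strategy: Strategy) -> dict:
    """Bundle the encoded message with metadata associated with the message."""
    return {
        "metadata": {
            "size": len(message),
            "sha256": hashlib.sha256(message).hexdigest(),
            "code": strategy.name,
            "k": strategy.k,
        },
        "fragments": encode(message, strategy),
    }


def store(block: dict, devices: list[list]) -> None:
    """Place fragment i on device i modulo the device count (in-memory stand-in)."""
    for i, frag in enumerate(block["fragments"]):
        devices[i % len(devices)].append((block["metadata"]["sha256"], i, frag))


message = b"adaptive erasure codes"
strategy = select_strategy(len(message), annual_failure_rate=0.02)
block = build_block(message, strategy)
devices = [[] for _ in range(3)]
store(block, devices)
```

Because the parity fragment is the XOR of the data fragments, any one lost fragment can be rebuilt by XOR-ing the survivors. A Reed Solomon or Fountain encoder would replace `encode` without changing the surrounding flow.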
Abstract
Methods, apparatus, and other embodiments associated with adaptive use of erasure codes for distributed data storage systems are described. One example method includes accessing a message, where the message has a message size, selecting an encoding strategy as a function of the message size, data storage device failure statistics, data storage device wear periods, data storage space constraints, or overhead constraints, and where the encoding strategy includes an erasure code approach, generating an encoded message using the encoding strategy, generating an encoded block, where the encoded block includes the encoded message and metadata associated with the message, and storing the encoded block in the data storage system. Example methods and apparatus may employ Reed Solomon erasure codes or Fountain erasure codes. Example methods and apparatus may display to a user the storage capacity and durability of the data storage system.
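The storage capacity and durability figures mentioned above can be estimated with standard formulas. The sketch below assumes a maximum distance separable code that stores n fragments and can rebuild the message from any k of them, devices that fail independently with probability p over the period of interest, and no repair during that period. None of these assumptions, nor the function names, come from the abstract itself.

```python
from math import comb


def usable_capacity(raw_capacity: float, k: int, n: int) -> float:
    """Fraction k/n of raw capacity holds user data; the rest holds parity."""
    return raw_capacity * k / n


def loss_probability(k: int, n: int, p: float) -> float:
    """Probability that more than n - k of the n fragments are lost."""
    return sum(
        comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(n - k + 1, n + 1)
    )


# Example: 14 devices of 10 TB each, 10 data fragments plus 4 parity fragments.
capacity_tb = usable_capacity(140.0, k=10, n=14)
risk = loss_probability(k=10, n=14, p=0.02)
```

Durability over the period is then `1 - risk`. Lowering k for a fixed n improves durability at the cost of usable capacity, which is the trade-off an adaptive strategy has to balance.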
32 Claims
15. A non-transitory computer-readable storage medium storing computer executable instructions that when executed by a computer control the computer to perform a method for encoding a message to be stored in a distributed data storage system, the method comprising:
accessing a message, where the message has a message size;
upon determining that the message size is greater than or equal to a threshold size, selecting a Fountain coding approach as a selected coding approach;
upon determining that the message size is less than the threshold size, computing an interleaving overhead and computing a coding overhead;
upon determining that the interleaving overhead is greater than one and the coding overhead is less than zero, selecting a Fountain coding approach as the selected coding approach;
upon detecting that the interleaving overhead is not greater than one and the coding overhead is not less than zero, selecting a Reed Solomon coding approach as the selected coding approach;
upon detecting that a number of parities is less than a threshold number of parities, where the threshold number of parities is four when a STAR coding approach is employed, and where the threshold number of parities is three when a non-STAR coding approach is employed, using an exclusive or (XOR) based block-maximum distance separable (MDS) coding approach as the selected coding approach;
upon detecting that the number of parities is greater than or equal to the threshold number of parities, using a classical construction coding approach as the selected coding approach;
upon determining that the data storage system is within a threshold reliability level, automatically and dynamically generating an adapted coding approach by adjusting a set of coding parameters, where the set of coding parameters is based, at least in part, on the selected coding approach; and
upon determining that the data storage system is not within the threshold reliability level, encoding the message using the adapted coding approach.
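The selection branches in claim 15 can be read as a small decision tree. The sketch below is one possible reading, not an authoritative construction. It takes the two overheads as inputs because this section does not say how they are computed, it treats the parity test as a refinement of the Reed Solomon branch, and it falls back to Reed Solomon for the mixed cases the claim does not spell out (for example, interleaving overhead above one with a non-negative coding overhead). The function name and return labels are hypothetical.

```python
def select_coding_approach(
    message_size: int,
    threshold_size: int,
    interleaving_overhead: float,
    coding_overhead: float,
    num_parities: int,
    star: bool = False,
) -> str:
    """Return a label for the coding approach chosen by the claim 15 branches."""
    # Large messages go straight to a Fountain code.
    if message_size >= threshold_size:
        return "fountain"

    # Small messages: Fountain only if both overhead conditions hold.
    if interleaving_overhead > 1 and coding_overhead < 0:
        return "fountain"

    # Otherwise Reed Solomon. ASSUMPTION: mixed cases also land here.
    # The parity threshold is four with STAR coding, three without.
    parity_threshold = 4 if star else 3
    if num_parities < parity_threshold:
        return "reed_solomon/xor_block_mds"
    return "reed_solomon/classical"


choice = select_coding_approach(
    message_size=64_000,
    threshold_size=1_000_000,
    interleaving_overhead=0.8,
    coding_overhead=0.1,
    num_parities=2,
)
```

The later steps of the claim, which adjust coding parameters against a reliability threshold, would operate on the approach returned here and are not modeled in this sketch.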
26. An apparatus for storing a message in a set of data storage devices, the apparatus comprising:
a processor;
a memory;
a set of logics; and
an interface that connects the processor, the memory, and the set of logics;
the set of logics comprising:
a monitoring logic that monitors a set of operating parameters associated with the set of data storage devices;
a strategy selection logic that selects an encoding strategy based on a property of the message or the set of operating parameters, where the encoding strategy includes an encoding technique;
an adaptation logic that adapts the encoding strategy as a function of the set of operating parameters;
an encoding logic that generates an encoded message by encoding the message using the encoding strategy; and
a storage logic that stores the encoded message in the distributed storage system.
Specification