Peer to peer code generator and decoder for digital systems
Abstract
Digital content from a source (e.g., a file or a stream) is striped and encoded in parallel over a cluster of Storage Systems. The encoding ensures that subsequent retrieval of the data succeeds even when some members of the cluster of Storage Systems are lost or when errors in communication result in the loss of some IP packets. Host Map File (HMF) data is produced that describes fully how to retrieve the content, including the encoding parameters, the cluster of Storage Systems and the striping of the encoded data. This HMF data is then inserted as the header of every encoded file on the cluster of Storage Systems. The HMF data is the only way the encoded files can be reassembled into a meaningful whole. The original content is retrieved by requesting its data from the cluster of Storage Systems. In each Storage System, a decoder parses the HMF data and transmits the striped data to the requestor. The decoders cooperate to dynamically detect erasures and to reconstruct the missing data. The system is self-healing as new Storage Systems are able to reconstruct data missing due to the loss of any Storage Systems from the cluster without impeding concurrent encode and decode transactions.
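The striping, HMF header, and erasure recovery described in the abstract can be illustrated with a minimal sketch. It is not the patented encoder: it assumes a single XOR parity segment per codeword (the simplest erasure code, tolerating the loss of one Storage System), and the JSON header layout and the names `stripe_and_encode`, `parse_file`, and `retrieve` are hypothetical choices made only for illustration.

```python
import json
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length segments."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe_and_encode(content: bytes, hosts: list[str], seg_len: int) -> dict[str, bytes]:
    """Stripe content over hosts[:-1] and keep XOR parity on hosts[-1].

    Every encoded file starts with the same HMF-style header (hypothetical
    JSON layout) naming the hosts, the code parameters, and the content length.
    """
    k = len(hosts) - 1                      # input segments per codeword
    cw_len = k * seg_len
    padded = content + b"\0" * (-len(content) % cw_len)
    hmf = json.dumps({"hosts": hosts, "k": k, "seg_len": seg_len,
                      "content_len": len(content)}).encode()
    header = len(hmf).to_bytes(4, "big") + hmf
    bodies = [bytearray() for _ in hosts]
    for off in range(0, len(padded), cw_len):
        segs = [padded[off + i * seg_len: off + (i + 1) * seg_len] for i in range(k)]
        for i, seg in enumerate(segs):
            bodies[i] += seg                # one input segment per storage host
        bodies[k] += reduce(xor_bytes, segs)  # checksum segment
    return {h: header + bytes(b) for h, b in zip(hosts, bodies)}


def parse_file(blob: bytes):
    """Split an encoded file into its HMF header (as a dict) and its body."""
    n = int.from_bytes(blob[:4], "big")
    return json.loads(blob[4:4 + n]), blob[4 + n:]


def retrieve(files: dict[str, bytes]) -> bytes:
    """Reassemble the content from surviving files; at most one host may be lost."""
    hmf, _ = parse_file(next(iter(files.values())))
    hosts, k, seg_len = hmf["hosts"], hmf["k"], hmf["seg_len"]
    bodies = [parse_file(files[h])[1] if h in files else None for h in hosts]
    missing = [i for i, b in enumerate(bodies) if b is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity cannot repair more than one erasure")
    n_cw = len(next(b for b in bodies if b is not None)) // seg_len
    out = bytearray()
    for c in range(n_cw):
        segs = [b[c * seg_len:(c + 1) * seg_len] if b is not None else None
                for b in bodies]
        if missing and missing[0] < k:
            # XOR of all surviving segments (inputs and parity) restores the lost one.
            segs[missing[0]] = reduce(xor_bytes, [s for s in segs if s is not None])
        out += b"".join(segs[:k])
    return bytes(out[:hmf["content_len"]])
```

Because the header travels with every encoded file, any single surviving file is enough to learn the host list and code parameters; a code that tolerates more than one lost host would need a stronger scheme than the XOR parity assumed here.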
5 Claims
1. A method of storing data in a clustered data processing system, comprising the steps of:
receiving, by a plurality of available peers in the system, a set of more than one input symbol segment for a first codeword, each of the plurality of available peers receiving all of the input symbol segments in the set;
in response to the step of receiving, a first group of at least two of the available peers each storing a respective one of the input symbol segments for the first codeword, none of the peers in the first group retaining for the first codeword all of the input symbol segments for the first codeword; and
in response to the step of receiving, a second group of the available peers each generating and storing a respective checksum symbol segment for the first codeword, based on the input symbol segments for the first codeword, each of the checksum symbol segments generated by the second group of peers for the first codeword having contents that, in conjunction with a first subset of fewer than all of the input symbol segments for the first codeword, is sufficient to recover one of the input symbol segments for the first codeword which is not in the first subset.
2. A method of retrieving data in a clustered data processing system, comprising:
receiving, by a plurality of available peers of the system, a content request for delivery of data to a retrieval destination, the content request covering data in a first codeword;
in response to the step of receiving, a first group of at least two of the available peers each transmitting toward the retrieval destination a respective input symbol segment stored by the peer, the transmitted input symbol segments also being received by a second group of the available peers different from the retrieval destination;
detecting erasure of a first input symbol segment which is covered by the content request;
in response to the step of detecting, a first peer in the second group of peers regenerating the first erased input symbol segment in dependence upon a first checksum symbol segment stored by at least one of the available peers and in further dependence upon ones of the input symbol segments transmitted in the step of transmitting; and
transmitting the regenerated first input symbol segment toward the retrieval destination.
3. A method of healing a clustered data processing system having a plurality of peers, comprising the steps of:
providing in the clustered data processing system a plurality of stored codewords each having at least one input symbol segment and at least one checksum symbol segment, the codewords being stored in the data processing system such that a respective erased subset of the symbol segments of each of the codewords in the plurality of stored codewords is missing;
for each i'th one of the codewords in the plurality of stored codewords, a respective i'th regeneration group of at least one of the peers regenerating the erased subset of symbol segments of the i'th codeword, in dependence upon available ones of the symbol segments of the i'th codeword; and
for each i'th one of the codewords in the plurality of stored codewords, a respective i'th healing group of at least one of the peers storing the symbol segments regenerated by the i'th regeneration group of the peers.
4. A method of operating a clustered data processing system having a plurality of peers, for use with a plurality of codewords each having at least one input symbol segment and at least one checksum symbol segment, comprising the steps of:
receiving a plurality of input symbol segments to store for a first codeword;
available ones of a storage group of at least one of the plurality of peers each storing a respective one of the input symbol segments;
available ones of a checksum group of at least one of the plurality of peers each generating and storing a respective checksum symbol segment for the first codeword, in dependence upon the received plurality of input symbol segments for the first codeword;
receiving a content request from a content requestor covering input symbol segments in the first codeword, an erased subset of at least one input symbol segment covered by the content request being missing from the first codeword as stored in the data processing system;
available ones of the storage group of peers each transmitting, at least toward a retrieval destination, input symbol segments stored for the first codeword;
a regenerating group of the plurality of peers regenerating the erased subset of input symbol segments in dependence upon the transmitted input symbol segments, and transmitting the regenerated erased subset of input symbol segments at least toward a retrieval destination.
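The store, retrieve, and heal steps recited in the claims above can be sketched as a small in-memory simulation. This is an illustrative reading only, not the claimed implementation: it assumes k storage peers plus one checksum peer, a single XOR checksum segment per codeword, and a replacement peer that takes over the lost peer's position. The `Peer` class and the functions `store_codeword`, `retrieve_codeword`, and `heal` are hypothetical names.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length segments."""
    return bytes(x ^ y for x, y in zip(a, b))


class Peer:
    """Hypothetical peer: indices 0..k-1 store input segments, index k stores the checksum."""

    def __init__(self, index: int, k: int):
        self.index = index
        self.k = k
        self.store = {}          # codeword id -> the one segment this peer keeps
        self.available = True

    def receive(self, cw_id, segments):
        # Every peer receives all k input segments but retains only its share.
        if self.index < self.k:
            self.store[cw_id] = segments[self.index]
        else:
            self.store[cw_id] = reduce(xor_bytes, segments)


def store_codeword(peers, cw_id, segments):
    """Deliver the full set of input segments to every available peer."""
    for peer in peers:
        if peer.available:
            peer.receive(cw_id, segments)


def retrieve_codeword(peers, cw_id):
    """Storage peers transmit; the checksum peer regenerates a single erased segment."""
    k = peers[0].k
    sent = [p.store.get(cw_id) if p.available else None for p in peers[:k]]
    erased = [i for i, seg in enumerate(sent) if seg is None]
    if erased:
        checksum_peer = peers[k]
        if len(erased) > 1 or not checksum_peer.available or cw_id not in checksum_peer.store:
            raise ValueError("too many erasures for a single XOR checksum")
        # The checksum peer combines its checksum with the segments it overheard.
        overheard = [seg for seg in sent if seg is not None]
        sent[erased[0]] = reduce(xor_bytes, overheard, checksum_peer.store[cw_id])
    return sent


def heal(peers, lost_index):
    """A replacement peer regenerates and stores the lost segment of every codeword."""
    k = peers[0].k
    survivors = [p for p in peers if p.index != lost_index and p.available]
    if len(survivors) < k:
        raise ValueError("not enough surviving peers to heal")
    new_peer = Peer(lost_index, k)
    for cw_id in list(survivors[0].store):
        # With XOR parity, any one missing segment is the XOR of all the others.
        new_peer.store[cw_id] = reduce(xor_bytes, [p.store[cw_id] for p in survivors])
    peers[lost_index] = new_peer
    return new_peer
```

A short usage example: build `peers = [Peer(i, 3) for i in range(4)]`, call `store_codeword(peers, "cw0", [b"aaaa", b"bbbb", b"cccc"])`, mark `peers[1].available = False`, and `retrieve_codeword(peers, "cw0")` still yields all three input segments; `heal(peers, 1)` then restores the lost segment onto a fresh peer.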
Specification