Scheme for information dispersal and reconstruction
First Claim
1. A process for protecting against loss and for enhancing accessibility in storage or memory, or protecting against loss and enhancing speed of transmission on communication paths, of information which is represented in storage or memory, or represented as data signals on communication paths, said information comprising data elements, each said data element representable as an element of a finite field or computational structure, said process comprisingtransforming and breaking apart said information among a set of smaller assemblages of elements of said finite field or computational structure, anddispersing said information by transmitting said smaller assemblages in the form of said data signals carried on multiple communication paths, or by transmitting said smaller assemblages at different times in the form of said data signals carried on a single communication path, or by storing said assemblages in multiple storage or memory locations, in a manner that is expected to yield no fewer than a required number of said assemblages,said step of transforming and breaking apart comprisingrepresenting said information by N elements of said field or computational structure,grouping said N elements into N/m groups, each containing m elements, where N is greater than m, and hence there are more than one of said groups,selecting a set of vectors {ai :
- i=1, . . . , n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,generating a set of elements {Ci,r ;
i=1, . . . , n, r=1, . . . , N/m} for assembly into n said assemblages, each element ci,r being generated as the product of a vector ai times the elements of the r-th said group of elements, andassembling the n assemblages such that the i-th assemblage comprises the elements {Ci,r ;
r=1, . . . , N/m}.
0 Assignments
0 Petitions
Accused Products
Abstract
Method and apparatus applicable to a variety of data storage, data communication, and parallel computing applications, efficiently improving information availability and load balancing. Information to be transmitted in a data signal or stored is represented as N elements of a field or computational structure, and dispersed among a set of n pieces that are to be transmitted or stored in a manner yielding no fewer than m pieces used in subsequent reconstruction.
For dispersal, n vectors ai each having m elements are used and the n pieces are assembled from elements obtained as products of these vectors with m element groups taken from the N elements representing the information. For reconstruction from m available pieces, m m-element vectors αi are derived from the vectors ai, and the N elements representing the information are obtained as products of these vectors with m-element groups taken from the pieces.
The vector products may be implemented using an appropriate processor, including a vector processor, systolic array, or parallel processor.
For fault-tolerant storage in a partitioned or distributed system, information is dispersed into n pieces so that any m suffice for reconstruction, and the pieces are stored in different parts of the medium.
For fault-tolerant and congestion-free transmission of packets in a network or a parallel computer, each packet is dispersed into n pieces so that any m pieces suffice for reconstruction and the pieces are routed to the packet'"'"'s destination along independent paths or at different times.
1656 Citations
30 Claims
-
1. A process for protecting against loss and for enhancing accessibility in storage or memory, or protecting against loss and enhancing speed of transmission on communication paths, of information which is represented in storage or memory, or represented as data signals on communication paths, said information comprising data elements, each said data element representable as an element of a finite field or computational structure, said process comprising
transforming and breaking apart said information among a set of smaller assemblages of elements of said finite field or computational structure, and dispersing said information by transmitting said smaller assemblages in the form of said data signals carried on multiple communication paths, or by transmitting said smaller assemblages at different times in the form of said data signals carried on a single communication path, or by storing said assemblages in multiple storage or memory locations, in a manner that is expected to yield no fewer than a required number of said assemblages, said step of transforming and breaking apart comprising representing said information by N elements of said field or computational structure, grouping said N elements into N/m groups, each containing m elements, where N is greater than m, and hence there are more than one of said groups, selecting a set of vectors {ai : - i=1, . . . , n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,
generating a set of elements {Ci,r ;
i=1, . . . , n, r=1, . . . , N/m} for assembly into n said assemblages, each element ci,r being generated as the product of a vector ai times the elements of the r-th said group of elements, andassembling the n assemblages such that the i-th assemblage comprises the elements {Ci,r ;
r=1, . . . , N/m}. - View Dependent Claims (2, 7, 8, 9, 10, 11, 12, 17)
- i=1, . . . , n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,
-
3. A process for protecting against loss and enhancing accessibility in a partitioned or distributed storage or memory system of the kind having portions or components prone to faults or inaccessibilities, of information which is represented in said storage or memory system, said information comprising data elements, each said data element representable as an element of a finite field or computational structure, said process comprising
transforming and breaking apart said information into smaller assemblages of elements of said finite field or computational structure, and dispersing said information by storing said assemblages in multiple portions or components of said storage or memory system, in a manner that is expected to yield no fewer than a required number, m, of said assemblages, at least some said assemblages being stored in different said portions or components, retrieving at least said required number of said assemblages from said storage or memory system, and reconstructing said information from said retrieved assemblages, said step of transforming and breaking apart comprising representing said information by N elements of said field or computational structure, grouping said N elements into N/m groups, each containing m elements, where N is greater than m, and hence there are more than one of said groups, selecting a set of vectors {ai : - i=1, . . . ,n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,
generating a set of elements {ci,r ;
i=1, . . . , n, r=1, . . . ,N/m} for assembly into n said assemblages, each element Ci,r being generated as the product of a vector ai times the elements of the rth said group of elements, andassembling the n assemblages such that the ith assemblage comprises the elements {ci,r, r=1, . . . , N/m); said step of reconstructing said information from said retrieved assemblages comprising; forming m vectors {α
i ;
i=1, . . . , m} derived from the m vectors ai that were used to form the elements c.sub.i,r of the retrieved assemblages, andgenerating the N elements which represent the information by multiplying said vectors α
i times vectors of length m formed by taking corresponding single elements, one from each of said retrieved assemblages. - View Dependent Claims (13)
- i=1, . . . ,n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,
-
4. A process for protecting against loss and enhancing speed of transmission on communication paths, or information which is represented as data signals on communication paths, said information comprising data elements, each said data element representable as an element of a finite field or computational structure, said communication links being prone to faults or congestions, said process comprising
transforming and breaking apart said information into n smaller assemblages of elements of said finite field or computational structure, and dispersing said information by transmitting said smaller assemblages in the form of said data signals carried on multiple communication paths, or by transmitting said smaller assemblages at different times in the form of said data signals carried on a single communication path, in a manner that is expected to yield no fewer than a required number, m, of said assemblages, receiving at least said required number of said assemblages from said paths, and reconstructing said information from said retrieved assemblages, said step of transforming and breaking apart comprising: -
representing said information by N elements of said field or computational structure, grouping said N elements into N/m groups, each containing m elements, where m N is greater than m, and hence there are more than one of said groups, selecting a set of vectors {ai ;
i=1, . . . , n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,generating a set of elements {ci,r ;
i=1, . . . , n, r=1, . . . ,N/m} for assembly into n said assemblages, each element ci,r being generated as the product of a vector ai times the elements of the rth said group of elements, andassembling the n assemblages such that the ith assemblage comprises the elements {Ci,r ;
r=1, . . . , N/m};said step of reconstructing said information from said received assemblages comprising; forming m vectors {α
i ;
i=1, . . . , m} derived from the m vectors ai that were used to form the elements ci,r of the received assemblages, and,generating the N elements which represent the information by multiplying said vectors α
i times vectors of length m formed by taking corresponding single elements, one from each of said received assemblages. - View Dependent Claims (14)
-
-
5. A process for protecting against loss and for enhancing accessibility in storage or memory, and protecting against loss and enhancing speed of transmission in a network, of information which is represented in storage or memory, and represented as data signals on said network, said network and storage or memory being part of a distributed processing system of the kind having nodes associated with storage or memory devices, said nodes being interconnected in said network, and in which said storage or memory devices are prone to faults or inaccessibilities, said process comprising
transforming and breaking apart said information into smaller assemblages of elements of said finite field or computational structure, and dispersing said information by transmitting said smaller assemblages in the form of said data signals carried on multiple communication paths, or by transmitting said smaller assemblages at different times in the form of said data signals carried on a single communication path, and by storing said assemblages in multiple said storage or memory devices in said nodes, in a manner that is expected to yield no fewer than a required number, m, of assemblages, retrieving at least said required number of assemblages from said storage or memory devices via said network, and reconstructing said information from said retrieved assemblages, and reassembling the information from the thus generated elements, said step of transforming and breaking apart comprising: -
representing said information by N elements of said field or computational structure, grouping said N elements into N/m groups, each containing m elements, where N is greater than m, and hence there are more than one of said groups, selecting a set of vectors {ai ;
i=1, . . . , n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,generating a set of elements {ci,r ;
i=1, . . . , n, r=1, . . . , N/m} for assembly into n said assemblages, each element ci,r being generated as the product of a vector α
i times the elements of the rth said group of elements, andassembling the n assemblages such that the ith assemblage comprises the elements {ci,r ;
r=1, . . . , N/m};said step of reconstructing said information from said retrieved assemblages comprising; forming m vectors {ai ;
i=1, . . . , m} derived from the m vectors ai that were used to calculate the elements ci,r of the m retrieved assemblages, andgenerating the N elements which represent the information by multiplying said vectors α
i times vectors of length m formed by taking corresponding single elements, one from each of said retrieved assemblages. - View Dependent Claims (15)
-
-
6. A method for communicating information packets among the nodes of a parallel processing computer, each node having an associated processor, memory, information dispersal/reconstruction unit, and a buffer, said nodes being linked by an interconnection network, whereby each node can communicate with any other node via paths at least some of which pass through intermediate nodes, said paths being subject to failure, and each said node'"'"'s buffer having a finite capacity, each information packet being sent from a source node to a destination node, said information packet comprising data elements, each said data element representable as an element of a finite field or computational structure, said method comprising
transforming and breaking apart said information packet into n assemblages by representing said information packet by N elements of said field or computational structure, grouping said N elements into N/m groups, each containing m elements, where N is greater than m, and hence there are more than one of said groups, selecting a set of vectors {ai : - i=1, . . . , n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,
generating a set of elements {ci,r ;
i=1, . . . , n, r=1, . . . , N/m} for assembly into n said assemblages, each element ci,r being generated as the product of a vector ai times the elements of the r-th said group of elements, andassembling the n pieces such that the i-th assemblage comprises the elements {ci,r ;
r=1, . . . , N/m},sending said assemblages via different said paths of said network toward said destination node, receiving at said destination node at least said m assemblages from network, and reconstructing said information packet from said m assemblages by forming m vectors {α
i ;
i=1, . . . , m} derived from the m vectors ai that Were used to form the elements ci,r of the available m assemblages,generating the N elements which represent the information by multiplying said vectors α
i times vectors of length m formed by taking corresponding single elements, one from each of the m available assemblages, andreassembling the information from the thus generated N elements. - View Dependent Claims (16)
- i=1, . . . , n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,
-
18. Apparatus for for protecting against loss and for enhancing accessibility in storage or memory, or protecting against loss and enhancing speed of transmission on communication paths, of information which is represented in storage or memory, or represented as data signals on communication paths, said information comprising data elements, each said data element representable as an element of a finite field or computational structure, said apparatus comprising
means for transforming and breaking apart said information among a set of smaller assemblages of elements of said finite field or computational structure and a disperser for dispersing said information by transmitting said smaller assemblages in the form of said data signals carried on multiple communication paths, or by transmitting said smaller assemblages at different times in the form of said data signals carried on a single communication path, or by storing said assemblages in multiple storage or memory locations, in a manner that is expected to yield no fewer than a required number of said assemblages, said means for transforming and breaking apart comprising a converter for representing said information by N elements of said field or computational structure, a segmenter for grouping said N elements into N/m groups, each containing m elements, where N is greater than m, and hence there are more than one of said groups, a vector generator for selecting a set of vectors {ai : - i=1, . . . , n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,
a calculator for calculating a set of elements {ci,r ;
i=1, . . . , n, r=1, . . . , N/m} for assembly into n said assemblages, each element ci,r being calculated as the product of a vector ai times the elements of the rth said group of elements, andan assembler for assembling the n assemblages such that the ith assemblage comprises the elements {ci,r ;
r=1, . . . , N/m}. - View Dependent Claims (20, 21, 22)
- i=1, . . . , n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,
-
19. Apparatus for reconstructing information from an available set of m assemblages that were generated by dispersing the information, by,
representing said information packet by N elements of a field or computational structure, grouping said N elements into N/m groups, each containing m elements, where N is greater than m, and hence there are more than one of said groups, selecting a set of vectors {ai : - i=1, . . . , n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,
generating a set of elements {ci,r ;
i=1, . . . , n, r=1, . . . , N/m} for assembly into n said assemblages, each element ci,r being generated as the product of a vector ai times the elements of the r-th said group of elements, andassembling the n pieces such that the i-th assemblage comprises the elements {ci,r ;
r=1, . . . , N/m}, the apparatus comprisinga matrix processor for forming an m×
m matrix whose rows are the m vectors ai that were used to form the characters ci,r of the available m assemblages, and for inverting the m×
m matrix,a calculator for generating the N characters of the data signal by multiplying the inverted matrix times groups of the elements ci,r, and an assembler for reassembling the N characters to form the information.
- i=1, . . . , n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,
-
23. Apparatus for protecting against loss and enhancing accessibility in a partitioned or distributed storage or memory system of the kind having portions or components prone to faults or inaccessibilities, of information which is represented in said storage or memory system., said information comprising data elements, each said data element representable as an element of a finite field or computational structure, said apparatus comprising:
-
means for transforming and breaking apart said information into smaller assemblages of elements of said finite field or computational structure, storage or memory for storing said assemblages in said portions or components of said storage or memory system, at least some pieces being stored in different said portions or components, said means for transforming and breaking apart comprising a converter for representing said information by N elements of a field or computational structure, a segmenter for grouping said N elements into N/m groups of m elements, where m is less than N, a vector generator for selecting a set of vectors {ai ;
i=1, . . . , n} each having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,a calculator for calculating a set of elements {ci,r ;
i=1, . . . , n, r=1, . . . , N/m} for assembly into said n assemblages, each element ci,r being calculated as the product of a vector ai times the elements of the rth said group of elements, andan assembler for generating the n assemblages such that the ith assemblage comprises the elements {ci,r ;
r=1, . . . , N/m}; anda reconstructor for retrieving at least m of said assemblages from portions or components of said storage or memory system, and for reconstructing said information from said m retrieved assemblages, said reconstructor comprising a processor for forming m vectors {α
i ;
i=1, . . . , m} derived from the m vectors ai that were used to calculate the elements ci,r of the m retrieved assemblages,a calculator for generating the N elements which represent the information by multiplying said vectors α
i times vectors of length m formed by taking corresponding single elements, one from each of said m retrieved assemblages, andan assembler for reassembling the information from the thus generated N elements. - View Dependent Claims (24)
-
-
25. Apparatus for a process for protecting against loss and enhancing speed of transmission on communication paths, of information which is represented as data signals on communication paths, said information comprising data elements, each said data element representable as an element of a finite field or computational element, said communication links being prone to faults or congestions, said apparatus comprising
a disperser for dispersing said information into n smaller assemblages, and for transmitting said n assemblages in the form of said data signals carried on multiple communication paths, or by transmitting said smaller assemblages at different times in the form of said data signals carried on a single communication path, in a manner that is expected to yield no fewer than a required number, m, of said assemblages, said disperser comprising a converter for representing said information by N elements of a field or computational structure, a segmenter for grouping said N elements into N/m groups of m elements, where m is less than N, a vector generator for selecting a set of vectors {ai : - i=1, . . . , n} each having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent, a calculator for calculating a set of elements {ci,r ;
i=1, . . . , n, r=1, . . . , N/m} for assembly into said n assemblages, each element ci,r being calculated as the product of a vector ai times the elements of the rth said group of elements, and an assembler for generating the n assemblages such that the ith assemblage comprises the elements {ci,r ;
r=1, . . . , N/m}; anda reconstructor for receiving at least m of said assemblages from said, paths and for reconstructing said information from said m received paths, said reconstructor comprising a processor for forming m vectors {α
i ;
i=1, . . . , m} derived from the m vectors ai that were used to calculate the elements ci,r of the m received assemblages, a calculator for generating the N elements which represent of the information by multiplying said vectors α
i times vectors of length m formed by taking corresponding single elements, one from each of said m received assemblages, and an assembler for reassembling the information from the thus generated N elements. - View Dependent Claims (26)
- i=1, . . . , n} each having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent, a calculator for calculating a set of elements {ci,r ;
-
27. Apparatus for protecting against loss and for enhancing accessibility in storage or memory, and protecting against loss and enhancing speed of transmission in a network, or information which is represented in storage or memory and represented as data signals on said network, said network and storage or memory being part of a distributed processing system of the kind having nodes with associated storage or memory devices, said nodes being interconnected in said network, and in which said storage or memory devices are prone to faults or inaccessibilities, said apparatus comprising
a disperser for dispersing said information into n smaller assemblages, and for distributing said n assemblages to said storage or memory devices at different said nodes via said network, said disperser comprising a converter for representing said information by N elements of a field or computational structure, a segmenter for grouping said N elements into N/m groups of m elements, where m is less than N, a vector generator for selecting a set of vectors {ai : - i=1, . . . , n} each having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,
a calculator for calculating a set of elements {ci,r ;
i=1, . . . , n, r=1, . . . , N/m} for assembly into said n assemblages, each element ci,r being calculated as the product of a vector ai times the elements of the rth said group of elements, and an assembler for generating the n assemblages such that the ith piece is an assemblage of the elements {ci,r ;
r=1, . . . , N/m}; anda reconstructor for retrieving at least m of said assemblages from said storage or memory devices via said network, and for reconstructing said information from said m retrieved assemblages, said reconstructor comprising a processor for forming m vectors {α
i ;
i=1, . . . , m} derived from the m vectors ai that were used to calculate the elements ci,r of the m retrieved assemblages,a calculator for generating the N elements which represent the information by multiplying said vectors α
i times vectors of length m formed by taking corresponding single elements, one from each of said m retrieved assemblages, andan assembler for reassembling the information from the thus generated N elements. - View Dependent Claims (28)
- i=1, . . . , n} each having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,
-
29. Apparatus for communicating information packets among the nodes of a parallel processing computer, each node having an associated processor and memory, said nodes being linked by an interconnection network, whereby each node can communicate with any other node via paths at least some of which pass through intermediate nodes, said paths being subject to failure, and each said node'"'"'s memory having a finite capacity, each information packet being sent from a source node to a destination node, said information packet comprising data elements, each said data element representable as an element of a finite field or computational structure, said apparatus comprising
a disperser for transforming and breaking apart said information packet into n assemblages, and for sending said assemblages via different said paths of said network toward said destination node the transforming and breaking apart being done in accordance with the following method, by representing said information packet by N elements of said field or computational structure, grouping said N elements into N/m groups, each containing m elements, where N is greater than m, and hence there are more than one of said groups, selecting a set of vectors {ai : - i=1, . . . , n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,
generating a set of elements {ci,r ;
i=1, . . . , n, r=1, . . . , N/m} for assembly into n said assemblages, each element ci,r being generated as the product of a vector ai times the elements of the r-th said group of elements, andassembling the n pieces such that the i-th assemblage comprises the elements {ci,r ;
r=1, . . . , N/m}, anda reconstructor for receiving at said destination node at least said m assemblages from said network, and for reconstructing said information packet from said m assemblages by forming m vectors {α
i ;
i=1, . . . , m} derived from the m vectors ai that were used to form the elements ci,r of the available m assemblages,generating the N elements which represent the information by multiplying said vectors α
i times vectors of length m formed by taking corresponding single elements, one from each of the m available assemblages, andreassembling the information from the thus generated N elements. - View Dependent Claims (30)
- i=1, . . . , n}, each of said vectors having m elements of said field or computational structure, there being an arbitrarily high probability that any subset of m of the vectors are linearly independent,
Specification