×

Joint de-duplication-erasure coded distributed storage

  • US 10,853,187 B2
  • Filed: 02/26/2019
  • Issued: 12/01/2020
  • Est. Priority Date: 07/15/2016
  • Status: Active Grant
First Claim
Patent Images

1. A non-transitory computer-readable storage device storing computer executable instructions that when executed by a computer control the computer to perform a method for deduplicating and erasure coding a message, the method comprising:

  • accessing the message,generating a set of message chunks by chunking the message using a first chunking approach;

    generating a set of outer-precoded parity symbols and a set of outer-precoded data symbols from the set of message chunks using an outer precode, where the outer precode is a low density parity check (LDPC) precode that uses an LDPC parity check matrix, where the LDPC parity check matrix has a column weight and a row weight;

    selectively adapting the LDPC parity check matrix by;

    computing a set of chunk statistics associated with a first set of erasure codes;

    generating a worst-case (WC) symbol characterization for a member of the first set of erasure codes based, at least in part, on a WC average number of symbol erasures or a WC average number of symbol errors;

    computing an average number of symbol errors for the first set of erasure codes based, at least in part, on a set of deduplication parameters, the set of chunk statistics, and the worst-case symbol characterization;

    choosing a column weight or a row weight such that a failure probability Pf is less than a decoding threshold;

    upon determining that there has been a change in a chunk pool;

    generating a new LDPC parity check matrix based, at least in part, on the column weight and the row weight; and

    replacing the LDPC parity check matrix with the new LDPC parity check matrix;

    storing the set of outer-precoded parity symbols in a data storage system;

    generating a set of unique data symbols by deduplicating the set of outer-precoded data symbols based, at least in part, on a chunk identification (ID) table;

    storing a copy of the set of unique data symbols in the data storage system;

    generating a set of inner-precoded data symbols from the set of unique data symbols using an inner-precode;

    generating the first set of erasure codes from the set of inner-precoded data symbols; and

    storing the first set of erasure codes in the data storage system.

View all claims
  • 5 Assignments
Timeline View
Assignment View
    ×
    ×