Data deduplication in a dispersed storage system
First Claim
1. A data encoding and compression method for execution by a computing device, wherein the method comprises:
receiving a storage request regarding storing a portion of a data object in dispersed storage network (DSN) memory;
generating a reference for the portion;
determining that a substantially identical portion is not stored in the DSN memory based on the reference;
when the substantially identical portion is not stored in the DSN memory:
encrypting, using an encryption algorithm, the portion using an encryption key to produce encrypted data, wherein the encryption key is one of:
the portion, an inversion of the portion, or a known inversion and non-inversion bit pattern of the portion, and wherein, based on encryption algorithm, the encrypted data includes one of:
substantially all zeros, substantially all ones, or a known pattern of ones and zeros;
compressing the encrypted data based on a compression function to produce compressed data, wherein the compression data is a representation of one of:
the substantially all zeros of the encrypted data, the substantially all ones of the encrypted data, and the known pattern of ones and zeros of the encrypted data;
storing the compressed data;
dispersed storage error encoding the encryption key to produce a plurality of sets of encoded key slices; and
facilitating storage of the plurality of sets of encoded key slices in the DSN memory.
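The encrypting and compressing steps of claim 1 can be illustrated with a minimal sketch. It assumes a bytewise XOR as the encryption algorithm and a simple run-length encoding as the compression function; the claim names neither, so both are illustrative choices, and the helper names (xor_bytes, invert, pattern_key, rle_compress) and the 0xAA mask are hypothetical.

```python
from itertools import groupby


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy encryption algorithm (assumption): bytewise XOR with an equal-length key."""
    if len(data) != len(key):
        raise ValueError("key must be the same length as the data")
    return bytes(d ^ k for d, k in zip(data, key))


def invert(data: bytes) -> bytes:
    """Key option 2: the bitwise inversion of the portion."""
    return bytes(b ^ 0xFF for b in data)


def pattern_key(data: bytes, mask: bytes) -> bytes:
    """Key option 3: invert only the bits selected by a repeating, known mask."""
    return bytes(b ^ mask[i % len(mask)] for i, b in enumerate(data))


def rle_compress(data: bytes) -> list[tuple[int, int]]:
    """Toy compression function (assumption): (byte value, run length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(data)]


portion = b"example portion of a data object"

# Key = the portion itself: every bit cancels, giving all zeros.
all_zeros = xor_bytes(portion, portion)
# Key = inversion of the portion: every bit differs, giving all ones.
all_ones = xor_bytes(portion, invert(portion))
# Key = known inversion/non-inversion pattern: output equals the mask pattern.
patterned = xor_bytes(portion, pattern_key(portion, b"\xAA"))

for encrypted in (all_zeros, all_ones, patterned):
    print(rle_compress(encrypted))
```

Under these assumptions each encrypted output collapses to a single run, so the compressed data is only a representation of the pattern and its length; the information needed to recover the portion resides in the encryption key, which the claim stores as encoded key slices.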
Abstract
An efficient data deduplication method for use in a dispersed storage network (DSN). After a data object is received for storage in the DSN, it is determined whether a substantially identical data object has previously been encrypted and stored. The determination may be made, for example, by comparing an encryption key reference value relating to the data object to key reference information stored in DSN memory. If no substantially identical data object is detected, the data object is encrypted using an encryption key based on the data object. The encrypted data object is then compressed and stored. The encryption key and a key reference value are also stored as encoded key slices in DSN memory. If the data object was previously stored, it is encrypted using a retrieved encryption key that is substantially identical to the data object. The data object may then be compressed for storage using a pattern-based data compression function.
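The flow described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: a SHA-256 digest stands in for the key reference value, a plain dict stands in for DSN memory, XOR stands in for the encryption algorithm, and a split-plus-XOR-parity scheme stands in for dispersed storage error encoding. All function names, the WIDTH constant, and the entry layout are hypothetical.

```python
import hashlib
from itertools import groupby

WIDTH = 3  # hypothetical number of data slices per set


def encode_key_slices(key: bytes, width: int = WIDTH) -> list[bytes]:
    """Toy stand-in for dispersed storage error encoding:
    split the key into `width` equal slices and append one XOR parity slice."""
    size = -(-len(key) // width)  # ceiling division
    padded = key.ljust(size * width, b"\x00")
    slices = [padded[i * size:(i + 1) * size] for i in range(width)]
    parity = bytes(size)
    for s in slices:
        parity = bytes(a ^ b for a, b in zip(parity, s))
    return slices + [parity]


def decode_key(slices: list[bytes], key_len: int, width: int = WIDTH) -> bytes:
    """Rebuild the key from the data slices and strip the padding."""
    return b"".join(slices[:width])[:key_len]


def store_portion(dsn: dict, portion: bytes):
    """Return (is_new, compressed) for a portion offered for storage."""
    reference = hashlib.sha256(portion).hexdigest()  # assumed reference value
    entry = dsn.get(reference)
    is_new = entry is None
    # New data: the key is the portion itself. Duplicate: retrieve the stored key.
    key = portion if is_new else decode_key(entry["key_slices"], entry["key_len"])
    encrypted = bytes(p ^ k for p, k in zip(portion, key))
    # Pattern-based compression (assumed here to be run-length encoding).
    compressed = [(value, len(list(run))) for value, run in groupby(encrypted)]
    if is_new:
        dsn[reference] = {
            "compressed": compressed,
            "key_len": len(key),
            "key_slices": encode_key_slices(key),
        }
    return is_new, compressed


dsn = {}
first = store_portion(dsn, b"hello dispersed storage")
second = store_portion(dsn, b"hello dispersed storage")
print(first, second, len(dsn))
```

In this sketch the second request finds the reference already present, retrieves the key instead of creating a new entry, and still yields the same all-zeros compressed representation, so only one key set is held for both requests.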
88 Citations
12 Claims
1. A data encoding and compression method for execution by a computing device, wherein the method comprises:
receiving a storage request regarding storing a portion of a data object in dispersed storage network (DSN) memory;
generating a reference for the portion;
determining that a substantially identical portion is not stored in the DSN memory based on the reference;
when the substantially identical portion is not stored in the DSN memory:
encrypting, using an encryption algorithm, the portion using an encryption key to produce encrypted data, wherein the encryption key is one of:
the portion, an inversion of the portion, or a known inversion and non-inversion bit pattern of the portion, and wherein, based on encryption algorithm, the encrypted data includes one of:
substantially all zeros, substantially all ones, or a known pattern of ones and zeros;
compressing the encrypted data based on a compression function to produce compressed data, wherein the compression data is a representation of one of:
the substantially all zeros of the encrypted data, the substantially all ones of the encrypted data, and the known pattern of ones and zeros of the encrypted data;
storing the compressed data;
dispersed storage error encoding the encryption key to produce a plurality of sets of encoded key slices; and
facilitating storage of the plurality of sets of encoded key slices in the DSN memory.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A non-transitory computer readable storage medium having accessible therefrom a set of instructions interpretable by a processing module, the set of instructions being configured to cause the processing module to carry out operations for:
receiving a storage request regarding storing a portion of a data object in dispersed storage network (DSN) memory;
generating a reference for the portion;
determining that a substantially identical portion is not stored in the DSN memory based on the reference;
when the substantially identical portion is not stored in the DSN memory:
encrypting, using an encryption algorithm, the portion using an encryption key to produce encrypted data, wherein the encryption key is one of:
the portion, an inversion of the portion, or a known inversion and non-inversion bit pattern of the portion, and wherein, based on encryption algorithm, the encrypted data includes one of:
substantially all zeros, substantially all ones, or a known pattern of ones and zeros;
compressing the encrypted data based on a compression function to produce compressed data, wherein the compression data is a representation of one of:
the substantially all zeros of the encrypted data, the substantially all ones of the encrypted data, and the known pattern of ones and zeros of the encrypted data;
storing the compressed data;
dispersed storage error encoding the encryption key to produce a plurality of sets of encoded key slices; and
facilitating storage of the plurality of sets of encoded key slices in the DSN memory.
Dependent claims: 9, 10, 11, 12
Specification