STORAGE-NETWORK DE-DUPLICATION
First Claim
1. A system comprising:
one or more processors;
a de-duplicated repository coupled to the one or more processors; and
de-duplication logic coupled to the one or more processors and to the de-duplicated repository, wherein the de-duplicated logic is operable to store files using a single storage encoding and to:
receive, from a client device over a network, a first request to store a file in the de-duplicated repository, wherein the first request includes an identifier of the file and a set of signatures that respectively identify a set of chunks from the file;
look up the set of signatures in the de-duplicated repository to determine whether any chunks in the set of chunks are not stored in the de-duplicated repository;
request, from the client device, those chunks from the set of chunks that are not stored in the de-duplicated repository;
for each chunk from the set of chunks that is not stored in the de-duplicated repository, store in the de-duplicated repository using the single storage encoding at least the chunk and a signature, from the set of signatures, that represents the chunk; and
store, in the de-duplicated repository, a file entry that represents the file and that associates the set of signatures with the identifier of the file.
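The claim does not specify how a file is divided into chunks or how signatures are computed. As a minimal sketch of how a client might assemble the first request, assuming fixed-size chunks and SHA-256 digests as signatures (both are illustrative choices, and `CHUNK_SIZE`, `split_chunks`, `signature`, and `build_store_request` are hypothetical names not taken from the patent):

```python
import hashlib

CHUNK_SIZE = 4  # assumption: tiny fixed-size chunks, chosen only for illustration


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split file contents into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def signature(chunk: bytes) -> str:
    """Assumed signature function: the SHA-256 hex digest of the chunk."""
    return hashlib.sha256(chunk).hexdigest()


def build_store_request(file_id: str, data: bytes) -> dict:
    """Build the first request: a file identifier plus one signature per chunk."""
    return {
        "file_id": file_id,
        "signatures": [signature(c) for c in split_chunks(data)],
    }


# Two identical chunks ("abcd") yield identical signatures, so the
# repository would need to hold that chunk only once.
request = build_store_request("report.txt", b"abcdabcdxyz")
```

Note that the request carries only the identifier and the signatures; chunk data travels later, and only for chunks the repository reports as not stored.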
Abstract
Techniques are provided for de-duplication of data. In one embodiment, a system comprises de-duplication logic that is coupled to a de-duplicated repository. The de-duplication logic is operable to receive, from a client device over a network, a request to store a file in the de-duplicated repository using a single storage encoding. The request includes a file identifier and a set of signatures that identify a set of chunks from the file. The de-duplication logic determines whether any chunks in the set are missing from the de-duplicated repository and requests the missing chunks from the client device. Then, for each missing chunk, the de-duplication logic stores in the de-duplicated repository that chunk and a signature representing that chunk. The de-duplication logic also stores, in the de-duplicated repository, a file entry that represents the file and that associates the set of signatures with the file identifier.
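The store flow summarized in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the repository is an in-memory dictionary, signatures are assumed to be SHA-256 hex digests, the network round trip to the client is replaced by a callback, and the "single storage encoding" is not modeled. `DedupRepository`, `missing`, `store_file`, and `fetch_chunks` are hypothetical names.

```python
import hashlib


def sig(chunk: bytes) -> str:
    """Assumed signature function: SHA-256 hex digest of the chunk."""
    return hashlib.sha256(chunk).hexdigest()


class DedupRepository:
    """Hypothetical in-memory stand-in for the de-duplicated repository."""

    def __init__(self):
        self.chunks = {}        # signature -> chunk bytes
        self.file_entries = {}  # file identifier -> ordered list of signatures

    def missing(self, signatures):
        """Return the signatures that have no stored chunk, without repeats."""
        seen, result = set(), []
        for s in signatures:
            if s not in self.chunks and s not in seen:
                seen.add(s)
                result.append(s)
        return result

    def store_file(self, file_id, signatures, fetch_chunks):
        """Handle a store request; fetch_chunks(sigs) stands in for asking the client."""
        needed = self.missing(signatures)
        if needed:
            for s, chunk in fetch_chunks(needed).items():
                # Assumption: the server re-checks each chunk against its signature.
                if sig(chunk) != s:
                    raise ValueError("chunk does not match its signature")
                self.chunks[s] = chunk
            if any(s not in self.chunks for s in needed):
                raise ValueError("client did not supply every requested chunk")
        # The file entry associates the file identifier with its signatures.
        self.file_entries[file_id] = list(signatures)
        return needed


# Usage: two files share the chunk b"bbbb", so it is transferred and stored once.
repo = DedupRepository()
file_a, file_b = [b"aaaa", b"bbbb"], [b"bbbb", b"cccc"]
client_chunks = {sig(c): c for c in file_a + file_b}


def fetch(sigs):
    return {s: client_chunks[s] for s in sigs}


sent_a = repo.store_file("a.txt", [sig(c) for c in file_a], fetch)
sent_b = repo.store_file("b.txt", [sig(c) for c in file_b], fetch)
```

In this sketch the second request causes only one chunk to be requested from the client, because the shared chunk is already present when its signature is looked up.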
30 Claims
1. A system comprising:
one or more processors;
a de-duplicated repository coupled to the one or more processors; and
de-duplication logic coupled to the one or more processors and to the de-duplicated repository, wherein the de-duplicated logic is operable to store files using a single storage encoding and to:
receive, from a client device over a network, a first request to store a file in the de-duplicated repository, wherein the first request includes an identifier of the file and a set of signatures that respectively identify a set of chunks from the file;
look up the set of signatures in the de-duplicated repository to determine whether any chunks in the set of chunks are not stored in the de-duplicated repository;
request, from the client device, those chunks from the set of chunks that are not stored in the de-duplicated repository;
for each chunk from the set of chunks that is not stored in the de-duplicated repository, store in the de-duplicated repository using the single storage encoding at least the chunk and a signature, from the set of signatures, that represents the chunk; and
store, in the de-duplicated repository, a file entry that represents the file and that associates the set of signatures with the identifier of the file.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. One or more storage media storing instructions which, when executed by one or more computing devices, cause performance of steps comprising:
receiving, from a client device over a network, a first request to store a file in the de-duplicated repository using a single storage encoding, wherein the first request includes an identifier of the file and a set of signatures that respectively identify a set of chunks from the file;
looking up the set of signatures in the de-duplicated repository to determine whether any chunks in the set of chunks are not stored in the de-duplicated repository;
requesting, from the client device, those chunks from the set of chunks that are not stored in the de-duplicated repository;
for each chunk from the set of chunks that is not stored in the de-duplicated repository, storing in the de-duplicated repository using the single storage encoding at least the chunk and a signature, from the set of signatures, that represents the chunk; and
storing, in the de-duplicated repository, a file entry that represents the file and that associates the set of signatures with the identifier of the file.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30
Specification