Storage-network de-duplication
First Claim
1. A method comprising:
receiving a request to store a file in a de-duplicated repository of a management system, wherein the de-duplicated repository is stored on physical disk blocks that have a fixed size, and wherein the request includes an identifier of the file and a set of signatures that respectively identify a set of chunks from the file;
identifying a first chunk, from the set of chunks, that is not stored in the de-duplicated repository;
storing the first chunk and a first signature, from the set of signatures, that represents the first chunk in the de-duplicated repository, wherein the first chunk is generated as a variable-sized chunk based on the fixed size of the physical disk blocks such that each variable-sized chunk is no larger than the fixed size of the physical disk blocks; and
storing a second chunk and a second signature, from the set of signatures, that represents the second chunk in the de-duplicated repository, wherein the second chunk is generated such that a combined size of the first chunk and the second chunk is no larger than the fixed size of the physical disk blocks.
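The size constraint in the last two steps can be illustrated with a short sketch. The claim does not say how chunk boundaries are chosen, so the newline-based `is_boundary` rule below is purely a hypothetical stand-in, as are the function names. The sketch cuts variable-sized chunks and never lets the chunks assigned to one block exceed the fixed block size, which covers the two-chunk case recited in the claim.

```python
from typing import Callable, List


def chunk_into_blocks(
    data: bytes,
    block_size: int,
    is_boundary: Callable[[bytes, int], bool],
) -> List[List[bytes]]:
    """Cut data into variable-sized chunks grouped per fixed-size block.

    Every chunk is at most block_size bytes, and the chunks grouped into
    one block have a combined size of at most block_size bytes.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks: List[List[bytes]] = []
    current: List[bytes] = []
    room = block_size          # bytes still free in the current block
    start = 0                  # offset where the current chunk begins
    for i in range(len(data)):
        length = i + 1 - start
        # Cut when the block is full or the (assumed) boundary rule fires.
        if length == room or is_boundary(data, i):
            current.append(data[start:i + 1])
            room -= length
            start = i + 1
            if room == 0:
                blocks.append(current)
                current, room = [], block_size
    if start < len(data):      # trailing bytes are shorter than the room left
        current.append(data[start:])
    if current:
        blocks.append(current)
    return blocks


def newline_boundary(data: bytes, i: int) -> bool:
    """Hypothetical content-defined rule: end a chunk after each newline."""
    return data[i] == 0x0A


blocks = chunk_into_blocks(b"ab\ncdefgh\nij", 8, newline_boundary)
```

With an 8-byte block, the first block here holds a 3-byte chunk followed by a second chunk that is capped at the 5 bytes remaining, so the pair fits in one physical block.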
Abstract
Techniques are provided for de-duplication of data. In one embodiment, a system comprises de-duplication logic that is coupled to a de-duplication repository. The de-duplication logic is operable to receive, from a client device over a network, a request to store a file in the de-duplicated repository using a single storage encoding. The request includes a file identifier and a set of signatures that identify a set of chunks from the file. The de-duplication logic determines whether any chunks in the set are missing from the de-duplicated repository and requests the missing chunks from the client device. Then, for each missing chunk, the de-duplication logic stores in the de-duplicated repository that chunk and a signature representing that chunk. The de-duplication logic also stores, in the de-duplicated repository, a file entry that represents the file and that associates the set of signatures with the file identifier.
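The exchange described in the abstract can be sketched as follows. This is a minimal in-memory illustration, not the patented implementation: the class and function names are hypothetical, SHA-256 is an assumed signature function (the abstract does not name one), fixed 4-byte chunks are used only to keep the demo small, and the `fetch_chunk` callback stands in for the network request to the client device.

```python
import hashlib
from typing import Callable, Dict, List


def signature(chunk: bytes) -> str:
    """Assumed signature function: SHA-256 hex digest of the chunk."""
    return hashlib.sha256(chunk).hexdigest()


def split_chunks(data: bytes, size: int = 4) -> List[bytes]:
    """Toy client-side chunker (fixed-size, for illustration only)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class DedupRepository:
    """Hypothetical stand-in for the de-duplicated repository."""

    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}     # signature -> chunk
        self.files: Dict[str, List[str]] = {}  # file identifier -> signatures

    def missing(self, signatures: List[str]) -> List[str]:
        """Signatures in the request whose chunks are not stored yet."""
        seen = set()
        result = []
        for sig in signatures:
            if sig not in self.chunks and sig not in seen:
                seen.add(sig)
                result.append(sig)
        return result

    def store_file(
        self,
        file_id: str,
        signatures: List[str],
        fetch_chunk: Callable[[str], bytes],
    ) -> None:
        """Handle a store request: fetch only missing chunks, then record
        a file entry associating the signatures with the file identifier."""
        for sig in self.missing(signatures):
            chunk = fetch_chunk(sig)           # request chunk from the client
            if signature(chunk) != sig:
                raise ValueError("chunk does not match its signature")
            self.chunks[sig] = chunk
        self.files[file_id] = list(signatures)

    def read_file(self, file_id: str) -> bytes:
        """Reassemble a file from its stored chunks."""
        return b"".join(self.chunks[sig] for sig in self.files[file_id])
```

Storing a second file that shares chunks with an earlier one transfers only the chunks the repository has not seen, while each file entry still lists the full set of signatures.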
44 Citations
20 Claims
1. A method comprising:
receiving a request to store a file in a de-duplicated repository of a management system, wherein the de-duplicated repository is stored on physical disk blocks that have a fixed size, and wherein the request includes an identifier of the file and a set of signatures that respectively identify a set of chunks from the file;
identifying a first chunk, from the set of chunks, that is not stored in the de-duplicated repository;
storing the first chunk and a first signature, from the set of signatures, that represents the first chunk in the de-duplicated repository, wherein the first chunk is generated as a variable-sized chunk based on the fixed size of the physical disk blocks such that each variable-sized chunk is no larger than the fixed size of the physical disk blocks; and
storing a second chunk and a second signature, from the set of signatures, that represents the second chunk in the de-duplicated repository, wherein the second chunk is generated such that a combined size of the first chunk and the second chunk is no larger than the fixed size of the physical disk blocks.
Dependent claims: 2, 3, 4, 5, 6, 18
7. A system comprising:
a de-duplicated repository stored on physical disk blocks that have a fixed size; and
a processor programmed to:
receive a request to store a file in the de-duplicated repository, the request including an identifier of the file and a set of signatures that respectively identify a set of chunks from the file;
identify a first chunk, from the set of chunks, that is not stored in the de-duplicated repository;
store the first chunk and a first signature, from the set of signatures, that represents the first chunk in the de-duplicated repository, wherein the first chunk is generated as a variable-sized chunk based on the fixed size of the physical disk blocks such that each variable-sized chunk is no larger than the fixed size of the physical disk blocks; and
store a second chunk and a second signature, from the set of signatures, that represents the second chunk in the de-duplicated repository, wherein the second chunk is generated such that a combined size of the first chunk and the second chunk is no larger than the fixed size of the physical disk blocks.
Dependent claims: 8, 9, 10, 11, 19
12. One or more non-transitory storage media storing instructions that when executed by one or more processors, cause the one or more processors to:
receive a request to store a file in a de-duplicated repository of a management system, wherein the de-duplicated repository is stored on physical disk blocks that have a fixed size, and wherein the request includes an identifier of the file and a set of signatures that respectively identify a set of chunks from the file;
identify a first chunk, from the set of chunks, that is not stored in the de-duplicated repository;
store the first chunk and a first signature, from the set of signatures, that represents the first chunk in the de-duplicated repository, wherein the first chunk is generated as a variable-sized chunk based on the fixed size of the physical disk blocks such that each variable-sized chunk is no larger than the fixed size of the physical disk blocks; and
store a second chunk and a second signature, from the set of signatures, that represents the second chunk in the de-duplicated repository, wherein the second chunk is generated such that a combined size of the first chunk and the second chunk is no larger than the fixed size of the physical disk blocks.
Dependent claims: 13, 14, 15, 16, 17, 20
Specification