Distributed deduplicated storage system
First Claim
1. A method of performing a storage operation in a distributed deduplicated storage system, comprising:
creating with a first deduplication node of a plurality of deduplication nodes, a first hash signature of a first data block of a plurality of data blocks associated with a file, and a first header that at least identifies a first media agent that stored a copy of the first data block;
creating with a second deduplication node of the plurality of deduplication nodes, a second hash signature of at least a second data block associated with the file, and a second header that at least identifies a second media agent that stored a copy of the second data block;
storing at the first deduplication node a copy of the second hash signature, and a copy of the second header created by the second deduplication node;
receiving a first request from a client computing device to restore the file comprising the plurality of data blocks;
determining with the first deduplication node that the copy of the second data block was stored by the second media agent based at least in part on accessing the copy of the second hash signature, and the copy of the second header stored in association with the first deduplication node; and
sending a second request for the second data block to the second media agent that requests the second data block, wherein the second request comprises at least the second header.
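The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the class names (`Header`, `MediaAgent`, `DedupNode`), the use of SHA-256 as the hash signature, the header fields, and the in-memory dictionaries are all assumptions made for illustration. The claim only requires that the header identify the media agent that stored the block, and that the first node hold a copy of the second node's signature and header.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    """Hypothetical header: names the media agent that stored the block."""
    media_agent_id: str
    block_index: int


class MediaAgent:
    """Hypothetical media agent holding block copies keyed by hash signature."""

    def __init__(self, agent_id):
        self.agent_id = agent_id
        self.blocks = {}

    def store(self, signature, data):
        self.blocks[signature] = data

    def fetch(self, signature, header):
        # The request carries the header; reject one addressed to another agent.
        if header.media_agent_id != self.agent_id:
            raise ValueError("header names a different media agent")
        return self.blocks[signature]


class DedupNode:
    """Hypothetical deduplication node keeping (signature, header) records."""

    def __init__(self, name):
        self.name = name
        self.records = {}  # block_index -> (signature, header)

    def create_record(self, block_index, data, agent):
        # Create the hash signature and the header for one data block.
        signature = hashlib.sha256(data).hexdigest()
        header = Header(agent.agent_id, block_index)
        agent.store(signature, data)
        self.records[block_index] = (signature, header)
        return signature, header

    def store_copy(self, signature, header):
        # Keep a copy of a signature and header created by another node.
        self.records[header.block_index] = (signature, header)

    def restore(self, block_count, agents):
        # Use the locally held headers to find the agent for each block.
        parts = []
        for index in range(block_count):
            signature, header = self.records[index]
            agent = agents[header.media_agent_id]
            parts.append(agent.fetch(signature, header))
        return b"".join(parts)


agent_a, agent_b = MediaAgent("ma-1"), MediaAgent("ma-2")
node_1, node_2 = DedupNode("node-1"), DedupNode("node-2")

node_1.create_record(0, b"hello ", agent_a)            # first block, first node
sig, hdr = node_2.create_record(1, b"world", agent_b)  # second block, second node
node_1.store_copy(sig, hdr)                            # first node keeps a copy

restored = node_1.restore(2, {"ma-1": agent_a, "ma-2": agent_b})
```

In this sketch the first node serves the whole restore from its own records: the copied header tells it that the second block sits with `ma-2`, so it does not need to consult the second node first.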
Abstract
A distributed, deduplicated storage system according to certain embodiments is arranged in a parallel configuration including multiple deduplication nodes. Deduplicated data is distributed across the deduplication nodes. The deduplication nodes can be networked together and communicate with one another using a lightweight, customized communication scheme (e.g., a scheme based on FTP or HTTP). In some cases, deduplication management information, including deduplication signatures and/or other metadata, is stored separately from the deduplicated data in deduplication management nodes, improving performance and scalability.
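The separation of management information from deduplicated data described in the abstract can be sketched as follows. This is an illustrative assumption, not the embodiment itself: the `ManagementNode` and `DataNode` classes, the SHA-256 signature, the reference count, and the rule that places a block by its signature modulo the node count are all hypothetical.

```python
import hashlib


class DataNode:
    """Hypothetical node that holds only deduplicated block data."""

    def __init__(self):
        self.blocks = {}  # signature -> block bytes


class ManagementNode:
    """Hypothetical node that holds only signatures and placement metadata."""

    def __init__(self, data_nodes):
        self.data_nodes = data_nodes
        self.index = {}  # signature -> [data node position, reference count]

    def write(self, data):
        signature = hashlib.sha256(data).hexdigest()
        entry = self.index.get(signature)
        if entry is not None:
            # Duplicate block: bump the reference count, store nothing new.
            entry[1] += 1
            return signature
        # New block: choose a data node from the signature (assumed policy).
        position = int(signature, 16) % len(self.data_nodes)
        self.data_nodes[position].blocks[signature] = data
        self.index[signature] = [position, 1]
        return signature

    def read(self, signature):
        position, _ = self.index[signature]
        return self.data_nodes[position].blocks[signature]


nodes = [DataNode(), DataNode(), DataNode()]
manager = ManagementNode(nodes)

first = manager.write(b"block A")
second = manager.write(b"block A")   # duplicate of the first write
third = manager.write(b"block B")

stored_total = sum(len(node.blocks) for node in nodes)
```

Because the duplicate check touches only the signature index, the data nodes are contacted only when a block is new or is being read, which is one plausible reading of the performance benefit the abstract mentions.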
20 Claims
1. A method of performing a storage operation in a distributed deduplicated storage system (set out in full under First Claim above). Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
11. A distributed deduplicated storage system, comprising:
a plurality of deduplication nodes each comprising one or more processors and storage, the deduplication nodes in communication with one another, a first deduplication node of the plurality of deduplication nodes creates a first hash signature of a first data block of a plurality of data blocks associated with a file and a first header that at least identifies a first media agent that stored a copy of the first data block; and
a second deduplication node of the plurality of deduplication nodes creates a second hash signature of at least a second data block associated with the file and a second header that at least identifies a second media agent that stored a copy of the second data block, wherein the first deduplication node stores a copy of the second hash signature and the second header created by the second deduplication node;
computer hardware configured to:
receive a request for the file comprised of a plurality of data blocks;
determine with the first deduplication node that the copy of the second data block was stored by the second media agent based at least in part on accessing the copy of the second hash signature and the copy of the second header stored in association with the first deduplication node; and
send a second request for the second data block to the second media agent, wherein the second request comprises at least the second header.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20.
Specification