Distributed deduplicated storage system
First Claim
1. A distributed deduplicated storage system comprising:
- a client device; and
- one or more hardware processors configured with software instructions to perform operations of:
  - a data media agent, wherein the data media agent is configured to:
    - receive data from the client device,
    - split the data into at least first and second data blocks, and
    - transmit in parallel the first data block to a first deduplication database media agent and the second data block to a second deduplication database media agent;
  - the first deduplication database media agent having first deduplication management information associated therewith, wherein the first deduplication management information is stored separately from deduplicated data stored in secondary storage devices, the first deduplication database media agent configured to:
    - determine whether the first data block is stored in a secondary storage device by querying the first deduplication management information; and
  - the second deduplication database media agent having second deduplication management information associated therewith, wherein the second deduplication management information is stored separately from the deduplicated data stored in the secondary storage devices, the second deduplication database media agent configured to:
    - determine whether the second data block is stored in a secondary storage device by querying the second deduplication management information.
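The data flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `DedupDatabaseAgent`, `split_into_blocks`, and `ingest` are hypothetical, SHA-256 is an assumed signature function, and routing block *i* to agent *i* mod *n* is an assumption chosen only to keep the example small. Each agent holds only signatures (its deduplication management information); block payloads would live elsewhere, in secondary storage.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


class DedupDatabaseAgent:
    """Hypothetical stand-in for a deduplication database media agent."""

    def __init__(self):
        # Deduplication management information: signatures only.
        # Block payloads are assumed to be kept in separate secondary storage.
        self.signatures = set()

    def query(self, block):
        """Return True if a block with this signature is already stored."""
        return hashlib.sha256(block).hexdigest() in self.signatures

    def record(self, block):
        """Remember the signature of a newly stored block."""
        self.signatures.add(hashlib.sha256(block).hexdigest())


def split_into_blocks(data, block_size):
    """Split the client data into fixed-size blocks (assumed chunking scheme)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def ingest(data, agents, block_size):
    """Split data and query the agents in parallel; one worker per agent."""
    blocks = split_into_blocks(data, block_size)

    def check_share(agent_index):
        # Assumption: block i is routed to agent i % len(agents).
        agent = agents[agent_index]
        found = {}
        for i in range(agent_index, len(blocks), len(agents)):
            found[i] = agent.query(blocks[i])
            if not found[i]:
                agent.record(blocks[i])
        return found

    results = {}
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        for partial in pool.map(check_share, range(len(agents))):
            results.update(partial)
    # One boolean per block, in block order: True means "already stored".
    return [results[i] for i in range(len(blocks))]
```

Ingesting the same data twice with two agents reports every block as new on the first pass and as already stored on the second. Each agent processes its own share sequentially, so no locking is needed around its signature set.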
Abstract
A distributed, deduplicated storage system according to certain embodiments is arranged in a parallel configuration including multiple deduplication nodes. Deduplicated data is distributed across the deduplication nodes. The deduplication nodes can be networked together and communicate with one another using a lightweight, customized communication scheme (e.g., a scheme based on FTP or HTTP). In some cases, deduplication management information, including deduplication signatures and/or other metadata, is stored separately from the deduplicated data in deduplication management nodes, improving performance and scalability.
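The separation described in the abstract, with signatures and metadata held apart from the deduplicated data, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the system's actual design: `ManagementNode`, `StorageNode`, and `store` are hypothetical names, SHA-256 is an assumed signature function, and choosing a storage node by signature modulo node count is an assumed placement rule.

```python
import hashlib


class StorageNode:
    """Hypothetical node holding deduplicated block payloads only."""

    def __init__(self):
        self.blocks = {}  # signature -> block bytes


class ManagementNode:
    """Hypothetical node holding deduplication management information only."""

    def __init__(self):
        self.index = {}  # signature -> [storage node index, reference count]


def store(block, mgmt, storage_nodes):
    """Store a block once; later copies only bump the reference count."""
    signature = hashlib.sha256(block).hexdigest()
    entry = mgmt.index.get(signature)
    if entry is not None:
        # Duplicate: answered from the management node alone,
        # without touching any storage node.
        entry[1] += 1
        return signature
    # Assumed placement rule: spread blocks across nodes by signature.
    node_index = int(signature, 16) % len(storage_nodes)
    storage_nodes[node_index].blocks[signature] = block
    mgmt.index[signature] = [node_index, 1]
    return signature
```

Because a duplicate is resolved entirely from the management index, the storage nodes are only contacted for blocks that are actually new, which is one plausible reading of how the separation helps performance.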
18 Claims
1. A distributed deduplicated storage system, as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A distributed deduplicated storage system comprising:
- a plurality of deduplication nodes each comprising one or more processors, one or more deduplication databases, and one or more storage devices, wherein the plurality of deduplication nodes are in communication with one another;
- a first deduplication node of the plurality of deduplication nodes creates a first hash signature of a first data block of a plurality of data blocks associated with a file, wherein a copy of the first hash signature is stored in a first deduplication database, wherein the first deduplication database is stored separately from data stored on the one or more storage devices;
- a first media agent that stores a copy of the first data block;
- a second media agent that stores a copy of a second data block;
- a second deduplication node of the plurality of deduplication nodes creates a second hash signature of the second data block associated with the file, wherein the first deduplication node stores a copy of the second hash signature in a second deduplication database, wherein the second deduplication database is stored separately from data stored on the one or more storage devices; and
- computer hardware configured to:
  - receive a request for the file comprised of the plurality of data blocks,
  - determine with the first deduplication node that the copy of the second data block was stored based at least in part on accessing the copy of the second hash signature, and
  - send a second request for the second data block.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18
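The restore path recited in claim 9 (receive a request for a file, consult a stored hash signature to confirm that a block was stored, then request that block) can be sketched in Python. This is a simplified illustration, not the claimed system: `MediaAgent`, `backup`, and `restore` are hypothetical names, SHA-256 is an assumed hash, and a single dictionary `dedup_db` stands in for the first and second deduplication databases.

```python
import hashlib


def signature(block):
    """Assumed hash signature function for a data block."""
    return hashlib.sha256(block).hexdigest()


class MediaAgent:
    """Hypothetical media agent that stores copies of data blocks."""

    def __init__(self):
        self.blocks = {}  # signature -> block bytes

    def put(self, block):
        self.blocks[signature(block)] = block

    def get(self, sig):
        return self.blocks[sig]


def backup(blocks, media_agents, dedup_db):
    """Store each distinct block once and return the file's signature list."""
    manifest = []
    for i, block in enumerate(blocks):
        sig = signature(block)
        if sig not in dedup_db:
            # Assumption: new blocks alternate between media agents.
            agent_index = i % len(media_agents)
            media_agents[agent_index].put(block)
            dedup_db[sig] = agent_index
        manifest.append(sig)
    return manifest


def restore(manifest, media_agents, dedup_db):
    """Rebuild the file by looking up each signature, then requesting the block."""
    parts = []
    for sig in manifest:
        agent_index = dedup_db.get(sig)
        if agent_index is None:
            raise KeyError("no stored block for signature " + sig)
        # The follow-up request for the block goes to the agent that holds it.
        parts.append(media_agents[agent_index].get(sig))
    return b"".join(parts)
```

In this sketch the signature lookup decides both whether a block exists and which media agent to ask, so the restore never scans block storage directly.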
Specification