Methods and systems for improved throughput performance in a distributed data de-duplication environment
First Claim
1. A method of data storage in a data de-duplication system comprising:
controlling a local client node to parse a stream of data received at the local client node into a set of variable length blocks at the local client node;
determining, at the local client node, a code that represents a block of data parsed from the stream, the code being a hash of the block;
controlling the local client node to send the code representing the block of data to a server, where the code is sent over a network;
receiving, at the local client node, from the server, a notification that the block is unique as identified by the server in response to examining the code;
in response to receiving the notification from the server at the local client node, controlling the local client node to write the block identified as a unique block by the notification to storage associated with the local client node;
in response to receiving the notification from the server at the local client node, controlling the local client node to write the code associated with the unique block to a file at the local client node, the file being located on a storage device at the local client node, the file being configured to facilitate performing uniqueness comparisons at the local client node;
updating metadata at the server, where the metadata is associated with the existence of the unique block, the code associated with the unique block, and the location of the unique block, and
updating an index at the server with information concerning the existence of the unique block, the code associated with the unique block, and the location of the unique block.
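The first two steps of the claim (parsing a stream into variable length blocks and deriving a hash code per block) can be illustrated with a minimal Python sketch. The claim does not say how block boundaries are chosen or which hash is used, so everything here is an assumption made for illustration: the function names `parse_variable_blocks` and `block_code`, the simple shift-based boundary test, the `mask`, `min_len`, and `max_len` parameters, and the choice of SHA-256.

```python
import hashlib


def parse_variable_blocks(data: bytes, mask: int = 0x3F,
                          min_len: int = 16, max_len: int = 256) -> list:
    """Split data into variable length blocks (hypothetical boundary rule).

    A boundary is declared when a simple shift-based running value has its
    low bits equal to zero, or when the block reaches max_len bytes.
    """
    blocks = []
    start = 0
    running = 0
    for i, byte in enumerate(data):
        # Running value over recent bytes, kept to 32 bits (assumed scheme).
        running = ((running << 1) + byte) & 0xFFFFFFFF
        length = i - start + 1
        at_boundary = length >= min_len and (running & mask) == 0
        if at_boundary or length >= max_len:
            blocks.append(data[start:i + 1])
            start = i + 1
            running = 0
    if start < len(data):
        # Whatever remains becomes the final (possibly short) block.
        blocks.append(data[start:])
    return blocks


def block_code(block: bytes) -> str:
    """Return the code for a block; SHA-256 is an assumed choice of hash."""
    return hashlib.sha256(block).hexdigest()


# Usage: parse a deterministic sample stream and compute one code per block.
stream = bytes((i * 37 + (i >> 3)) % 251 for i in range(2000))
blocks = parse_variable_blocks(stream)
codes = [block_code(b) for b in blocks]
```

Only the codes, not the blocks themselves, would then be sent over the network to the server for the uniqueness check.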
Abstract
In accordance with some embodiments of the systems and methods described herein, a data storage system that may include data de-duplication may receive a stream of data and parse the stream of data into a block at a local client node. Additionally, in some embodiments, a code that represents the block of data might be determined at the local client node. This code, representing the block of data, may be sent to a server. In accordance with various embodiments, the server may determine if a block is unique, for example, based on the code received at the server. In various embodiments, the server might write a unique block to a file at the local client node and update metadata.
11 Claims
1. A method of data storage in a data de-duplication system comprising:
controlling a local client node to parse a stream of data received at the local client node into a set of variable length blocks at the local client node;
determining, at the local client node, a code that represents a block of data parsed from the stream, the code being a hash of the block;
controlling the local client node to send the code representing the block of data to a server, where the code is sent over a network;
receiving, at the local client node, from the server, a notification that the block is unique as identified by the server in response to examining the code;
in response to receiving the notification from the server at the local client node, controlling the local client node to write the block identified as a unique block by the notification to storage associated with the local client node;
in response to receiving the notification from the server at the local client node, controlling the local client node to write the code associated with the unique block to a file at the local client node, the file being located on a storage device at the local client node, the file being configured to facilitate performing uniqueness comparisons at the local client node;
updating metadata at the server, where the metadata is associated with the existence of the unique block, the code associated with the unique block, and the location of the unique block, and
updating an index at the server with information concerning the existence of the unique block, the code associated with the unique block, and the location of the unique block.
3. A data storage system, comprising:
a server configured to determine if a block is unique based on a code received at the server from a local client node, and to send a notification to the local client node that the block is unique, the server and the local client node being computer hardware;
the local client node being configured to:
parse the stream of data into a block at the local client node;
determine and store the code that represents the block of data;
send the code representing the block of data to the server;
write the block to a file at the local client node based on the determination that the block is unique, the file being located on a storage device at the local client node, to selectively write the code associated with the unique block to storage associated with the local client node in response to receiving the notification at the local client node, the storage and code being configured to facilitate performing uniqueness comparisons at the local client node;
the server being configured to update metadata at the server to record the existence of the unique block, the code associated with the unique block, and the location of the unique block, and
the server being configured to update an index at the server with information concerning the existence of the unique block, the code associated with the unique block, and the location of the unique block.
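The division of work between the server and the local client node in this claim can be sketched as a small in-memory simulation in Python. This is only an illustration under stated assumptions: the class names `DedupServer` and `LocalClientNode`, their methods, the location string format, and the use of SHA-256 are all hypothetical, the network is replaced by direct method calls, and dictionaries and lists stand in for the storage device, the local file of codes, the server index, and the server metadata.

```python
import hashlib


class DedupServer:
    """Hypothetical server: decides uniqueness from codes alone."""

    def __init__(self):
        self.index = {}     # code -> location of the unique block
        self.metadata = []  # one record per unique block

    def is_unique(self, code: str) -> bool:
        # The "notification": True when the code has not been seen before.
        return code not in self.index

    def record_unique_block(self, code: str, location: str) -> None:
        # Update metadata and index with existence, code, and location.
        self.metadata.append({"exists": True, "code": code, "location": location})
        self.index[code] = location


class LocalClientNode:
    """Hypothetical client: hashes blocks and stores unique ones locally."""

    def __init__(self, name: str, server: DedupServer):
        self.name = name
        self.server = server
        self.block_store = {}  # stands in for storage at the client
        self.code_file = []    # stands in for the local file of codes

    def store_block(self, block: bytes) -> bool:
        code = hashlib.sha256(block).hexdigest()  # assumed hash choice
        if not self.server.is_unique(code):       # only the code is sent
            return False
        location = f"{self.name}:{code[:12]}"     # assumed location format
        self.block_store[location] = block        # write the unique block
        self.code_file.append(code)               # write its code locally
        self.server.record_unique_block(code, location)
        return True


# Usage: the third block repeats the first, so it is not stored again.
server = DedupServer()
client = LocalClientNode("client-a", server)
results = [client.store_block(b) for b in (b"alpha", b"beta", b"alpha")]
```

Because the codes are also kept in `code_file` at the client, a fuller implementation could consult that local list before contacting the server, which is the kind of local uniqueness comparison the claim says the stored codes are meant to facilitate.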
Specification