Data de-duplication for iSCSI
Abstract
Redundant data is identified and eliminated in a network that implements the iSCSI protocol either in-band at the source, in-band at the target, or out-of-band at the target. For in-band de-duplication, a data block included with a write command is assigned a unique identifier that is compared to a database of unique identifiers corresponding to previously written data. If the unique identifier is identical to an existing unique identifier, this indicates that the data block is redundant and has previously been stored elsewhere, in which case it is not stored again. Instead, the storage address specified in the write command may be added to a routing table showing the equivalence of unique identifiers, actual storage addresses, and duplicate storage addresses. When a read request specifying a duplicate storage address is received, the duplicate storage address can be translated to a corresponding unique identifier which points to the actual storage address.
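The in-band write and read paths described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `InBandDeduplicator`, its method names, the in-memory dictionaries, and the choice of SHA-256 as the unique identifier are all assumptions made for the example. Overwrites of an address that other entries already point to, and hash collisions, are deliberately not handled.

```python
import hashlib


class InBandDeduplicator:
    """Hypothetical in-memory model of in-band de-duplication."""

    def __init__(self):
        self.storage = {}        # actual storage address -> data block
        self.id_to_address = {}  # unique identifier -> actual storage address
        self.routing_table = {}  # duplicate storage address -> unique identifier

    def write(self, address, block):
        """Handle a write command. Returns True if the block was stored."""
        # Assumption: SHA-256 stands in for the unique identifier.
        uid = hashlib.sha256(block).hexdigest()
        actual = self.id_to_address.get(uid)
        if actual is None:
            # New data: store it and remember its identifier.
            self.routing_table.pop(address, None)  # drop any stale redirect
            self.storage[address] = block
            self.id_to_address[uid] = address
            return True
        if actual != address:
            # Redundant data: record the equivalence instead of storing again.
            self.routing_table[address] = uid
        return False

    def read(self, address):
        """Handle a read request, translating duplicate addresses."""
        uid = self.routing_table.get(address)
        if uid is not None:
            # Duplicate address -> unique identifier -> actual address.
            address = self.id_to_address[uid]
        return self.storage[address]


dedup = InBandDeduplicator()
dedup.write(100, b"payload")      # stored at address 100
dedup.write(200, b"payload")      # duplicate: only a routing entry is added
print(dedup.read(200))            # served from address 100
```

Keying the routing table on the identifier rather than directly on the actual address mirrors the translation step in the abstract: if the stored copy is later moved, only the identifier-to-address mapping needs to change.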
20 Claims
1. In a computer network that implements the internet small computer systems interface protocol to enable storage access across the computer network, a method for de-duplicating data in-band in the computer network, the method comprising:
receiving a first data block for storage at a first storage address;
assigning a first probabilistically unique identifier to the first data block;
comparing the first probabilistically unique identifier to a plurality of second probabilistically unique identifiers assigned to a plurality of second data blocks previously stored in the network;
if the first probabilistically unique identifier is identical to a second probabilistically unique identifier assigned to a second data block, indicating that the first data block is identical to the second data block stored at a second storage address, adding an entry to a routing table to re-route future read requests specifying the first storage address to the second storage address without storing the first data block at the first storage address, wherein the entry establishes a relationship between the first storage address and the second probabilistically unique identifier and future requests for the first storage address are translated into the second probabilistically unique identifier to re-route the future requests for the first storage address to the second storage address.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. In a computer network that implements the internet small computer systems interface to enable storage access across the computer network, a method for de-duplicating data in-band in a computer network, the method comprising:
receiving a first data block for storage and a small computer systems interface write command specifying a first storage address in the network for storing the first data block;
determining that the first data block is a duplicate of a second data block stored at a second storage address in the network;
establishing a relationship between the first storage address and the second storage address in a routing table; and
storing an entry to the routing table to re-route requests to the first storage address to the second storage address based on the relationship without storing the first data block at the first storage address, wherein, upon receiving a small computer systems interface read command that specifies the first storage address, the routing table redirects the read command to the second data block at the second storage address, wherein future requests for the first storage block are translated into requests for the second storage block to re-route the future requests for the first storage block to the second storage block.
Dependent claims: 12, 13, 14, 15
16. In a computer network that implements the internet small computer systems interface protocol to enable storage access across the network, a method for de-duplicating data out-of-band in a computer network, the method comprising:
storing a first data block at a first storage address in the network and a second data block at a second storage address in the network;
determining that the second data block is identical to the first data block;
deleting the second data block from the second storage address; and
adding an entry to a routing table to re-route future small computer systems interface read commands specifying the second storage address to the first storage address, wherein the entry establishes a relationship between the second storage address and a unique identifier and future requests for the second storage address are translated into the unique identifier to re-route the future requests for the second storage address to the first storage address.
Dependent claims: 17, 18, 19, 20
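The out-of-band method of claim 16 differs from the in-band methods in that both blocks are stored first and the duplicate is removed afterwards. A minimal sketch of such a post-storage sweep follows. The function names `deduplicate_out_of_band` and `read_block`, the dictionary-based storage model, and the use of SHA-256 as the unique identifier are assumptions for illustration only. The byte-for-byte comparison before deletion is a design choice of this sketch, not something the claim requires.

```python
import hashlib


def deduplicate_out_of_band(storage):
    """Sweep already-stored blocks and remove duplicates in place.

    storage: dict mapping storage address -> data block (bytes).
    Returns (id_to_address, routing_table).
    """
    id_to_address = {}  # unique identifier -> address of the retained block
    routing_table = {}  # deleted (duplicate) address -> unique identifier
    for address in sorted(storage):  # sorted() copies the keys, so deletion is safe
        block = storage[address]
        uid = hashlib.sha256(block).hexdigest()  # assumed identifier function
        first = id_to_address.get(uid)
        if first is None:
            id_to_address[uid] = address
        elif storage[first] == block:
            # Identical to an earlier block: delete it and add a routing entry.
            del storage[address]
            routing_table[address] = uid
    return id_to_address, routing_table


def read_block(storage, id_to_address, routing_table, address):
    """Serve a read, translating a deleted address via its identifier."""
    uid = routing_table.get(address)
    if uid is not None:
        address = id_to_address[uid]
    return storage[address]


storage = {10: b"aaa", 20: b"bbb", 30: b"aaa"}
ids, routes = deduplicate_out_of_band(storage)
print(sorted(storage))                          # address 30 has been deleted
print(read_block(storage, ids, routes, 30))     # served from address 10
```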
Specification