Deduplication aware scalable content placement
First Claim
1. A system comprising:
a storage array comprising a plurality of solid state drives; and
a storage controller coupled to one of the plurality of solid state drives, the storage controller comprising a processing device, the processing device to:
receive data to be stored in the storage array;
calculate a plurality of hashes corresponding to the data to be stored by utilizing a rolling hash algorithm on the data to be stored;
determine a first subset of the plurality of hashes corresponding to the data to be stored;
determine a second subset of the plurality of hashes of the first subset;
generate, in view of the second subset, a candidate placement list, wherein the candidate placement list comprises less than all of the plurality of solid state drives;
send the first subset of the plurality of hashes to one or more solid state drives, of the plurality of solid state drives, represented on the candidate placement list;
receive, from the one or more solid state drives represented on the candidate placement list, in response to sending the first subset of the plurality of hashes to the one or more solid state drives, characteristics corresponding to the one or more solid state drives represented on the candidate placement list; and
in view of the characteristics, identify one of the one or more solid state drives represented on the candidate placement list and send the data to the identified solid state drive.
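The hashing steps recited above can be illustrated with a minimal sketch. The claim names only "a rolling hash algorithm" and two nested subsets; the polynomial hash, the constants BASE, MOD and WINDOW, the bit-mask selection rule, and every function name below are assumptions made for illustration, not details taken from the patent.

```python
from typing import List, Tuple

BASE = 257            # assumed polynomial base
MOD = (1 << 61) - 1   # assumed modulus (a Mersenne prime)
WINDOW = 8            # assumed window size in bytes


def rolling_hashes(data: bytes, window: int = WINDOW) -> List[int]:
    """Return one polynomial rolling hash per window position."""
    if len(data) < window:
        return []
    top = pow(BASE, window - 1, MOD)  # weight of the byte leaving the window
    h = 0
    for b in data[:window]:
        h = (h * BASE + b) % MOD
    out = [h]
    for i in range(window, len(data)):
        h = (h - data[i - window] * top) % MOD  # drop the oldest byte
        h = (h * BASE + data[i]) % MOD          # append the newest byte
        out.append(h)
    return out


def select_subset(hashes: List[int], mask: int) -> List[int]:
    """Keep hashes whose bits under `mask` are all zero (assumed rule)."""
    return [h for h in hashes if h & mask == 0]


def hash_subsets(
    data: bytes, first_mask: int = 0x3, second_mask: int = 0xF
) -> Tuple[List[int], List[int], List[int]]:
    """Return (all hashes, first subset, second subset of the first subset)."""
    all_hashes = rolling_hashes(data)
    first = select_subset(all_hashes, first_mask)
    second = select_subset(first, second_mask)  # drawn only from `first`
    return all_hashes, first, second
```

Because the second subset is filtered from the first, it is always contained in it, which matches the claim language "a second subset of the plurality of hashes of the first subset". Any other deterministic sampling rule would serve equally well in this sketch.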
Abstract
Systems and methods of deduplication aware scalable content placement are described. A method may include receiving data to be stored on one or more nodes of a storage array and calculating a plurality of hashes corresponding to the data. The method further includes determining a first subset of the plurality of hashes, determining a second subset of the plurality of hashes of the first subset, and generating a node candidate placement list. The method may further include sending the first subset to one or more nodes represented on the node candidate placement list and receiving, from the nodes represented on the node candidate placement list, characteristics corresponding to the nodes represented on the candidate placement list. The method may further include identifying one of the one or more nodes represented on the candidate placement list in view of the characteristics and sending the data to the identified node.
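The placement steps summarized in the abstract can be sketched as follows. The abstract does not say how hashes map to nodes or which characteristics a node reports, so the modulo mapping, the "matches" and "free_bytes" fields, the tie-breaking order, and the names Node, candidate_list and place are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    name: str
    free_bytes: int
    known_hashes: Set[int] = field(default_factory=set)

    def characteristics(self, first_subset: List[int]) -> Dict[str, int]:
        """Reply to a query; assumed to report hash matches and free space."""
        matches = sum(1 for h in first_subset if h in self.known_hashes)
        return {"matches": matches, "free_bytes": self.free_bytes}


def candidate_list(second_subset: List[int], nodes: List[Node], limit: int) -> List[Node]:
    """Map each hash to a node by modulo (assumed); keep fewer than all nodes."""
    limit = min(limit, len(nodes) - 1)
    picked: List[int] = []
    for h in second_subset:
        if len(picked) >= limit:
            break
        idx = h % len(nodes)
        if idx not in picked:
            picked.append(idx)
    return [nodes[i] for i in picked]


def place(first_subset: List[int], second_subset: List[int],
          nodes: List[Node], limit: int = 2) -> Optional[Node]:
    """Query only the candidates with the first subset, then pick one node."""
    candidates = candidate_list(second_subset, nodes, limit)
    if not candidates:
        return None
    replies = [(n, n.characteristics(first_subset)) for n in candidates]
    # Assumed policy: prefer the most hash matches, then the most free space.
    best, _ = max(replies, key=lambda r: (r[1]["matches"], r[1]["free_bytes"]))
    return best
```

Only the nodes on the candidate list receive the first subset, so the query traffic grows with the list size rather than with the number of nodes in the array. Preferring the node with the most matching hashes is one plausible way to favor deduplication in this sketch.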
17 Claims
1. A system comprising:
a storage array comprising a plurality of solid state drives; and
a storage controller coupled to one of the plurality of solid state drives, the storage controller comprising a processing device, the processing device to:
receive data to be stored in the storage array;
calculate a plurality of hashes corresponding to the data to be stored by utilizing a rolling hash algorithm on the data to be stored;
determine a first subset of the plurality of hashes corresponding to the data to be stored;
determine a second subset of the plurality of hashes of the first subset;
generate, in view of the second subset, a candidate placement list, wherein the candidate placement list comprises less than all of the plurality of solid state drives;
send the first subset of the plurality of hashes to one or more solid state drives, of the plurality of solid state drives, represented on the candidate placement list;
receive, from the one or more solid state drives represented on the candidate placement list, in response to sending the first subset of the plurality of hashes to the one or more solid state drives, characteristics corresponding to the one or more solid state drives represented on the candidate placement list; and
in view of the characteristics, identify one of the one or more solid state drives represented on the candidate placement list and send the data to the identified solid state drive.
Dependent claims: 2, 3, 4, 5, 6
7. A method comprising:
receiving data to be stored on a plurality of nodes of a storage array;
calculating, by a processing device, a plurality of hashes corresponding to the data;
determining a first subset of the plurality of hashes corresponding to the data by utilizing a rolling hash algorithm on the data to be stored;
determining a second subset of the plurality of hashes of the first subset;
generating, by the processing device, in view of the second subset, a candidate placement list, wherein the candidate placement list comprises less than all of the plurality of nodes;
sending, by the processing device, the first subset of the plurality of hashes to one or more nodes represented on the candidate placement list;
receiving, from the one or more nodes represented on the candidate placement list, in response to sending the first subset of the plurality of hashes to the one or more nodes, characteristics corresponding to the one or more nodes represented on the candidate placement list; and
in view of the characteristics, identifying one of the one or more nodes represented on the candidate placement list and sending the data to the identified node.
Dependent claims: 8, 9, 10, 11, 12, 13
14. A non-transitory computer readable storage medium storing instructions, which when executed, cause a processing device to:
receive data to be stored on a plurality of nodes of a storage array;
calculate a plurality of hashes corresponding to the data by utilizing a rolling hash algorithm on the data to be stored;
determine a first subset of the plurality of hashes corresponding to the data;
generate, in view of the first subset, a candidate placement list, wherein the candidate placement list comprises less than all of the plurality of nodes; and
send the first subset of the plurality of hashes to one or more nodes represented on the candidate placement list;
receive, from the one or more nodes, of the plurality of nodes represented on the candidate placement list, in response to sending the first subset of the plurality of hashes to the one or more nodes, characteristics corresponding to the one or more nodes represented on the candidate placement list;
in view of the characteristics, identify the identified node of the one or more nodes represented on the candidate placement list and send the data to an identified node of the candidate placement list.
Dependent claims: 15, 16, 17
Specification