Optimized segment cleaning technique
Abstract
An optimized segment cleaning technique is configured to efficiently clean one or more selected portions or segments of a storage array coupled to one or more nodes of a cluster. A bottom-up approach of the segment cleaning technique is configured to read all blocks of a segment to be cleaned (i.e., an “old” segment) to locate extents stored on the SSDs of the old segment and examine extent metadata to determine whether the extents are valid and, if so, relocate the valid extents to a segment being written (i.e., a “new” segment). A top-down approach of the segment cleaning technique obviates reading of the blocks of the old segment to locate the extents and, instead, examines the extent metadata to determine the valid extents of the old segment. A hybrid approach may extend the top-down approach to include only full stripe read operations needed for relocation and reconstruction of blocks as well as retrieval of valid extents from the stripes, while also avoiding any unnecessary read operations of the bottom-up approach.
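The difference between the bottom-up and top-down approaches can be illustrated with a minimal sketch. It is not taken from the patent: the `Segment` class, the dictionary used as extent metadata, and the function names are all hypothetical, and a simple key-to-segment-ID mapping is assumed to stand in for the extent metadata.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seg_id: int
    extents: list  # (key, data) pairs physically present, valid or stale


def bottom_up_valid(segment, extent_metadata):
    """Read every extent in the old segment, then check its metadata."""
    valid, reads = [], 0
    for key, _data in segment.extents:
        reads += 1  # one read per extent located in the segment
        if extent_metadata.get(key) == segment.seg_id:
            valid.append(key)
    return valid, reads


def top_down_valid(segment, extent_metadata):
    """Examine only the metadata; the old segment is never read."""
    valid = [k for k, sid in extent_metadata.items() if sid == segment.seg_id]
    return valid, 0


# Hypothetical data: extent "b" was rewritten into segment 2, so the copy
# left behind in segment 1 is stale.
old = Segment(seg_id=1, extents=[("a", b"A"), ("b", b"B"), ("c", b"C")])
extent_metadata = {"a": 1, "b": 2, "c": 1}

print(bottom_up_valid(old, extent_metadata))
print(top_down_valid(old, extent_metadata))
```

In this toy setup both functions report the same valid extents, but the bottom-up version counts three segment reads while the top-down version counts none. The hybrid approach's full stripe reads are not modeled here.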
628 Citations
20 Claims
1. A method comprising:
receiving a write request directed towards a logical unit (LUN), the write request having data and processed at a node of a cluster, the node having a processor, a memory and connected to a storage array of solid state drives (SSDs);
generating a first key from the data;
storing the data as a first extent in a first segment according to a log-structured layout, the first segment spanning a set of the SSDs, the first segment associated with a first segment identifier (ID), the first extent including the first key;
using an in-core metadata table to determine whether the first extent is valid when the first segment is above a capacity threshold;
reading the first extent from the first segment to determine whether the first extent is valid when the first segment is below the capacity threshold;
in response to determining that the first extent is valid, cleaning the first segment by copying the first extent from the first segment to a second segment, the second segment having the log-structured layout and associated with a second segment ID different than the first segment ID; and
updating an entry of the in-core metadata table to include the second segment ID, the entry including the first key.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
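The steps of claim 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed implementation: the `Store` class and its method names are hypothetical, a SHA-256 digest is assumed as the way a key is generated from the data, the capacity threshold is assumed to be measured in extents per segment, and a segment exactly at the threshold is treated as below it (the claim does not say).

```python
import hashlib


def make_key(data: bytes) -> str:
    # Assumption: a SHA-256 digest stands in for "generating a key from the data".
    return hashlib.sha256(data).hexdigest()


class Store:
    def __init__(self, capacity_threshold: int):
        # Assumed unit: number of extents held by a segment.
        self.capacity_threshold = capacity_threshold
        self.segments = {}  # segment ID -> {key: data}, a stand-in for on-SSD extents
        self.table = {}     # in-core metadata table: key -> segment ID

    def write(self, seg_id: int, data: bytes) -> str:
        key = make_key(data)
        self.segments.setdefault(seg_id, {})[key] = data
        self.table[key] = seg_id
        return key

    def is_valid(self, seg_id: int, key: str) -> bool:
        segment = self.segments[seg_id]
        if len(segment) > self.capacity_threshold:
            # Above the threshold: decide from the in-core table alone.
            return self.table.get(key) == seg_id
        # At or below the threshold: read the extent, then check the key
        # derived from what was read against the table.
        data = segment[key]
        return self.table.get(make_key(data)) == seg_id

    def clean(self, old_id: int, new_id: int) -> list:
        moved = []
        for key, data in list(self.segments[old_id].items()):
            if self.is_valid(old_id, key):
                self.segments.setdefault(new_id, {})[key] = data
                self.table[key] = new_id  # entry now names the second segment
                moved.append(key)
        del self.segments[old_id]  # the old segment is free for reuse
        return moved
```

For example, after writing three extents to segment 1 and then rewriting one of them into segment 3, `clean(1, 2)` copies only the two extents whose table entries still name segment 1 and updates those entries to segment 2.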
10. A method comprising:
receiving a write request directed towards a logical unit (LUN), the write request having data and processed at a node of a cluster, the node having a memory and connected to a storage array of solid state drives (SSDs);
associating the data with a first key;
storing the data as a first extent in a first segment according to a log-structured layout, the first segment spanning a set of the SSDs, the first segment associated with a first segment identifier (ID), the first extent including the first key;
scanning an in-core metadata table using the first segment ID to find an entry to determine whether the first extent is valid when the first segment is above a capacity threshold, the entry having the first segment ID and the first key;
reading the first extent from the first segment to determine whether the first extent is valid when the first segment is below the capacity threshold;
in response to determining that the first extent is valid, copying the first extent from the first segment to a location on a second segment, the second segment having the log-structured layout and associated with a second segment ID different than the first segment ID; and
updating the entry of the in-core metadata table to include the second segment ID and the location.
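Claim 10 differs from claim 1 mainly in scanning the table by segment ID and in recording the new location. A minimal sketch of just that table-scan path follows; the `Entry` fields, the list-based segment log, and the use of a list index as the "location" are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    key: str
    seg_id: int
    location: int  # assumed: index of the extent within its segment's log


def scan_entries(table, seg_id):
    """Scan the in-core table for entries that name the given segment."""
    return [entry for entry in table if entry.seg_id == seg_id]


def relocate(table, segments, old_id, new_id):
    """Copy each valid extent to the tail of the new segment and update its entry."""
    new_log = segments.setdefault(new_id, [])
    for entry in scan_entries(table, old_id):
        extent = segments[old_id][entry.location]
        new_log.append(extent)              # log-structured: append only
        entry.seg_id = new_id               # entry now holds the second segment ID
        entry.location = len(new_log) - 1   # and the location of the copy


# Hypothetical state: "k2" has no entry naming segment 1, so it is not valid there.
segments = {1: [("k1", b"a"), ("k2", b"b"), ("k3", b"c")]}
table = [Entry("k1", 1, 0), Entry("k3", 1, 2)]
relocate(table, segments, old_id=1, new_id=2)
```

After the call, segment 2 holds the extents for `k1` and `k3` in that order, and both entries name segment 2 with locations 0 and 1.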
11. A system comprising:
a storage system having a memory connected to a processor;
a storage array coupled to the storage system having one or more solid state drives (SSDs);
a storage I/O stack executing on the processor of the storage system, the storage I/O stack configured to:
receive a write request directed towards a logical unit (LUN), the write request having data;
generate a first key from the data;
store the data as a first extent in a first segment according to a log-structured layout, the first segment spanning a set of the SSDs, the first segment associated with a first segment identifier (ID), the first extent including the first key;
use an in-core metadata table to determine whether the first extent is valid when the first segment is above a capacity threshold;
read the first extent from the first segment to determine whether the first extent is valid when the first segment is below the capacity threshold;
in response to determining that the first extent is valid, clean the first segment by copying the first extent from the first segment to a second segment, the second segment having the log-structured layout and associated with a second segment ID different than the first segment ID; and
update an entry of the in-core metadata table to include the second segment ID, the entry including the first key.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
Specification