Global in-line extent-based deduplication
First Claim
1. A system comprising:
a central processing unit (CPU) of a node of a cluster having a plurality of nodes, each node coupled to one or more storage arrays of solid state drives (SSDs); and
a memory coupled to the CPU and configured to store a layered file system of a storage input/output (I/O) stack, the layered file system including a volume layer and an extent store layer configured to provide sequential log-structured layout of data and metadata on the SSDs, the data and metadata organized as variable-length extents of one or more logical units (LUNs) served by the nodes, the metadata including volume metadata mappings from offset ranges of a LUN to extent keys and extent metadata mappings of the extent keys to storage locations of the extents on the SSDs, wherein the extent store layer maintaining the extent metadata mappings is configured to determine whether a data extent is stored on a storage array of the cluster and, in response to determination that the data extent is stored on the storage array of the cluster, retrieve an extent key for the stored data extent from an extent metadata mapping having a storage location on a SSD for the stored data extent, and return the extent key for the stored data extent to the volume layer to enable selective inline de-duplication that obviates writing a duplicate copy of the data extent on the storage array, wherein a flag passed to the extent store layer enables the selective inline de-duplication such that a metadata extent is not de-duplicated.
Abstract
In one embodiment, a layered file system includes a volume layer and an extent store layer configured to provide sequential log-structured layout of data and metadata on solid state drives (SSDs) of one or more storage arrays. The data is organized as variable-length extents of one or more logical units (LUNs). The metadata includes volume metadata mappings from offset ranges of a LUN to extent keys and extent metadata mappings of the extent keys to storage locations of the extents on the SSDs. The extent store layer maintaining the extent metadata mappings determines whether an extent is stored on a storage array, and, in response to determination that the extent is stored on the storage array, returns an extent key for the stored extent to the volume layer to enable global inline de-duplication that obviates writing a duplicate copy of the extent on the storage array.
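The flow described above, together with the flag recited in the claims, can be illustrated with a minimal sketch. This is not the patented implementation: the class names `ExtentStore` and `VolumeLayer`, the `dedup` parameter, the in-memory list standing in for the log-structured SSD layout, and the use of a SHA-256 digest as the extent key are all assumptions made for illustration, since the text here does not say how extent keys are derived.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ExtentStore:
    """Toy extent store layer (hypothetical names throughout)."""
    # Extent metadata mapping: extent key -> storage location (index in the log).
    locations: dict = field(default_factory=dict)
    # Stand-in for the sequential log-structured layout on the SSDs.
    log: list = field(default_factory=list)

    def put(self, extent: bytes, dedup: bool = True) -> str:
        # Assumption: the key is a content hash; the source does not specify this.
        digest = hashlib.sha256(extent).hexdigest()
        if dedup:
            key = digest
            if key in self.locations:
                # Extent already stored: return its key, write nothing.
                return key
        else:
            # Flag cleared (e.g. a metadata extent): always store a fresh copy
            # under a key made unique by the log position.
            key = f"{digest}-{len(self.log)}"
        self.locations[key] = len(self.log)
        self.log.append(extent)
        return key


@dataclass
class VolumeLayer:
    """Toy volume layer holding the volume metadata mappings."""
    store: ExtentStore
    # Volume metadata mapping: (LUN, offset, length) -> extent key.
    mappings: dict = field(default_factory=dict)

    def write(self, lun: str, offset: int, data: bytes) -> str:
        key = self.store.put(data, dedup=True)
        self.mappings[(lun, offset, len(data))] = key
        return key


store = ExtentStore()
volume = VolumeLayer(store)
k1 = volume.write("lun0", 0, b"same payload")
k2 = volume.write("lun1", 4096, b"same payload")     # duplicate data extent
m1 = store.put(b"volume metadata page", dedup=False)  # metadata extent
m2 = store.put(b"volume metadata page", dedup=False)
```

In this sketch the two data writes share one extent key and one stored copy, even though they target different LUNs, while the two metadata writes each get their own key and their own copy. A real system would also need to handle key collisions and persistence, which the sketch ignores.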
65 Citations
20 Claims
1. A system comprising:
a central processing unit (CPU) of a node of a cluster having a plurality of nodes, each node coupled to one or more storage arrays of solid state drives (SSDs); and
a memory coupled to the CPU and configured to store a layered file system of a storage input/output (I/O) stack, the layered file system including a volume layer and an extent store layer configured to provide sequential log-structured layout of data and metadata on the SSDs, the data and metadata organized as variable-length extents of one or more logical units (LUNs) served by the nodes, the metadata including volume metadata mappings from offset ranges of a LUN to extent keys and extent metadata mappings of the extent keys to storage locations of the extents on the SSDs, wherein the extent store layer maintaining the extent metadata mappings is configured to determine whether a data extent is stored on a storage array of the cluster and, in response to determination that the data extent is stored on the storage array of the cluster, retrieve an extent key for the stored data extent from an extent metadata mapping having a storage location on a SSD for the stored data extent, and return the extent key for the stored data extent to the volume layer to enable selective inline de-duplication that obviates writing a duplicate copy of the data extent on the storage array, wherein a flag passed to the extent store layer enables the selective inline de-duplication such that a metadata extent is not de-duplicated.
Dependent claims: 2, 3, 4, 5, 6
7. A method comprising:
providing, by a node of a cluster, a layered file system, the layered file system including a volume layer and an extent store layer configured to provide sequential log-structured layout of data and metadata on solid state drives (SSDs) of one or more storage arrays of the cluster, the data and metadata organized as variable-length extents of one or more logical units (LUNs), the metadata including volume metadata mappings from offset ranges of a LUN to extent keys and extent metadata mappings of the extent keys to storage locations of the extents on the SSDs;
determining whether a data extent is stored on a storage array of the cluster; and
in response to determining that the data extent is stored on the storage array of the cluster, retrieving an extent key for the stored data extent from an extent metadata mapping having a storage location on a SSD for the stored data extent, and returning the extent key for the stored data extent to the volume layer to enable inline de-duplication that obviates writing a duplicate copy of the data extent on the storage array, wherein a flag passed to the extent store layer enables the selective inline de-duplication such that a metadata extent is not de-duplicated.
Dependent claims: 8, 9, 10, 11, 12, 13
14. A non-transitory computer readable medium including program instructions for execution on one or more processors, the program instructions when executed operable to:
provide a layered file system, the layered file system including a volume layer and an extent store layer configured to provide sequential log-structured layout of data and metadata on solid state drives (SSDs) of one or more storage arrays of a cluster, the data and metadata organized as variable-length extents of one or more logical units (LUNs), the metadata including volume metadata mappings from offset ranges of a LUN to extent keys and extent metadata mappings of the extent keys to storage locations of the extents on the SSDs;
determine whether a data extent is stored on a storage array of the cluster; and
in response to determination that the data extent is stored on the storage array of the cluster, retrieve an extent key for the stored data extent from an extent metadata mapping having a storage location on a SSD for the stored data extent, and return the extent key for the stored data extent to the volume layer to enable inline de-duplication that obviates writing a duplicate copy of the extent on the storage array, wherein a flag passed to the extent store layer enables the selective inline de-duplication such that a metadata extent is not de-duplicated.
Dependent claims: 15, 16, 17, 18, 19, 20
Specification