Data storage space recovery
First Claim
1. A computer-implemented method of managing storage of data objects to a data storage environment comprising a plurality of storage nodes, the method comprising:
receiving write requests for data objects;
storing the data objects at the storage nodes according to the write requests;
updating a storage manager catalog that maps data object identifications (DOIDs) for the data objects with actual storage locations of the data objects, wherein the DOID for a data object is calculated based on content of the data object;
wherein, if the data object is a revised version of a previously stored data object, the revised data object has a different DOID than the previously stored data object, the revised data object is stored at a different storage location than the previously stored data object and without overwriting the previously stored data object, and the storage manager catalog is updated to reflect that the previously stored data object has been superseded by the revised version by decreasing a count of a number of instances of the previously stored data object by one and increasing a count of a number of instances of the revised data object by one;
identifying a storage area to recover storage space, the storage area storing data objects that are indicated as stale or active in the storage manager catalog; and
recovering the storage space while fulfilling read and write requests;
wherein the process of recovering storage space is implemented as compaction requests to compact the data objects from the storage area to a shadow storage area, the compaction requests being interspersed with read and write requests to the storage nodes, and wherein read requests for the active data objects are fulfilled from the storage area and write requests for new data objects are fulfilled using the shadow storage area during the process of recovering storage space.
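The catalog behavior recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the class name `Catalog`, the use of SHA-256 to derive the DOID, the in-memory append-only list standing in for the storage nodes, and the rule that a count of zero means "stale" are all assumptions made for illustration.

```python
import hashlib
from typing import Optional


class Catalog:
    """Hypothetical storage manager catalog: DOID -> storage location and instance count."""

    def __init__(self):
        self.entries = {}  # doid -> {"location": int, "count": int}
        self.log = []      # append-only stand-in for storage; list index = location

    @staticmethod
    def doid(content: bytes) -> str:
        # The DOID is derived from the object's content (SHA-256 is an assumption).
        return hashlib.sha256(content).hexdigest()

    def write(self, content: bytes, supersedes: Optional[str] = None) -> str:
        new_id = self.doid(content)
        entry = self.entries.get(new_id)
        if entry is None:
            # New content is appended at a new location; nothing is overwritten.
            self.log.append(content)
            entry = {"location": len(self.log) - 1, "count": 0}
            self.entries[new_id] = entry
        entry["count"] += 1
        if supersedes is not None and supersedes != new_id:
            # The previous version loses one instance; its bytes stay in place.
            self.entries[supersedes]["count"] -= 1
        return new_id

    def is_stale(self, doid: str) -> bool:
        # Assumption: an object with no remaining instances is stale.
        return self.entries[doid]["count"] == 0


catalog = Catalog()
old = catalog.write(b"version 1")
new = catalog.write(b"version 2", supersedes=old)
```

After the second write, the old object is still physically present at its original location, but its count has dropped to zero, which is what would make its space a candidate for recovery under the assumption above.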
Abstract
Storage space is reclaimed by cleaning and compacting data objects in a system where data objects are stored immutably. A storage area whose space needs to be reclaimed is identified. Active and stale data objects stored in the storage area are identified, and only the active data objects are transferred from the storage area to a shadow storage area when recovering storage space. I/O operations can be fulfilled from the storage area and the shadow storage area. Compaction requests and I/O requests are throttled according to QoS parameters. Recovery of storage space does not cause a failure to meet performance requirements for any storage volume.
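The compaction flow summarized in the abstract can be sketched as follows. This is an illustrative sketch only: the function `compact`, the dictionaries standing in for the storage area, the shadow storage area, and the catalog counts, and the fixed `io_per_compaction` ratio (a simple stand-in for QoS-based throttling) are all hypothetical and not taken from the patent.

```python
from collections import deque


def compact(area, shadow, counts, io_queue, io_per_compaction=2):
    """Copy active objects from `area` to `shadow`, interleaving queued I/O.

    area, shadow: dict mapping DOID -> bytes
    counts: dict mapping DOID -> instance count (0 is treated as stale)
    io_queue: deque of ("read", doid) or ("write", doid, data) tuples
    Returns the list of values produced by the read requests, in order.
    """
    results = []

    def serve_io(limit):
        for _ in range(limit):
            if not io_queue:
                return
            op = io_queue.popleft()
            if op[0] == "read":
                doid = op[1]
                # Reads are served from the storage area when the object is there.
                results.append(area[doid] if doid in area else shadow.get(doid))
            else:
                _, doid, data = op
                # New writes go to the shadow storage area during recovery.
                shadow[doid] = data
                counts[doid] = counts.get(doid, 0) + 1

    for doid in list(area):
        serve_io(io_per_compaction)    # let pending I/O through first
        if counts.get(doid, 0) > 0:    # one compaction request: copy if active
            shadow[doid] = area[doid]

    while io_queue:                    # drain any I/O still pending
        serve_io(1)
    area.clear()                       # the whole storage area is now reusable
    return results


area = {"a": b"A", "b": b"B", "c": b"C"}
counts = {"a": 1, "b": 0, "c": 2}      # "b" is stale
shadow = {}
pending = deque([("read", "a"), ("write", "d", b"D"), ("read", "d")])
reads = compact(area, shadow, counts, pending, io_per_compaction=1)
```

In this run the stale object "b" is never copied, so its space is reclaimed when the storage area is cleared, while the read of "a" is answered from the storage area before compaction finishes.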
21 Claims
1. The computer-implemented method recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
20. A non-transitory computer-readable storage medium storing computer program modules for managing storage of data objects to a data storage environment comprising a plurality of storage nodes, the computer program modules executable to perform steps comprising:
receiving write requests for data objects;
storing the data objects at the storage nodes according to the write requests;
updating a storage manager catalog that maps data object identifications (DOIDs) for the data objects with actual storage locations of the data objects, wherein the DOID for a data object is calculated based on content of the data object;
wherein, if the data object is a revised version of a previously stored data object, the revised data object has a different DOID than the previously stored data object, the revised data object is stored at a different storage location than the previously stored data object and without overwriting the previously stored data object, and the storage manager catalog is updated to reflect that the previously stored data object has been superseded by the revised version by decreasing a count of a number of instances of the previously stored data object by one and increasing a count of a number of instances of the revised data object by one;
identifying a storage area to recover storage space, the storage area storing data objects that are indicated as stale or active in the storage manager catalog; and
recovering the storage space while fulfilling read and write requests;
wherein the process of recovering storage space is implemented as compaction requests to compact the data objects from the storage area to a shadow storage area, the compaction requests being interspersed with read and write requests to the storage nodes, and wherein read requests for the active data objects are fulfilled from the storage area and write requests for new data objects are fulfilled using the shadow storage area during the process of recovering storage space.
21. A data storage environment comprising:
a plurality of application nodes that send application read requests and application write requests for data objects;
a plurality of storage nodes in communication with the application nodes, the storage nodes for storing the data objects organized as storage volumes, the storage nodes comprising:
a storage manager catalog that maps data object identifications (DOIDs) for the data objects with actual storage locations of the data objects, wherein the DOID for a data object is calculated based on content of the data object;
wherein, if the data object is a revised version of a previously stored data object, the revised data object has a different DOID than the previously stored data object, the revised data object is stored at a different storage location than the previously stored data object and without overwriting the previously stored data object, and the storage manager catalog is updated to reflect that the previously stored data object has been superseded by the revised version by decreasing a count of a number of instances of the previously stored data object by one and increasing a count of a number of instances of the revised data object by one; and
a storage manager compaction module that identifies a storage area to recover storage space and recovers the storage space allocated for data objects that are indicated as stale in the storage manager catalog, wherein the process of recovering storage space is implemented as compaction requests to compact data objects stored in the storage area to a shadow storage area, the compaction requests being interspersed with read and write requests to the storage nodes, wherein read requests for the active data objects are fulfilled from the storage area and write requests for new data objects are fulfilled using the shadow storage area during the process of recovering storage space.
Specification