Tracing garbage collector for search trees under multi-version concurrency control
First Claim
Patent Images
1. A method for garbage collection in a storage system having a plurality of storage devices, the method comprising:
- identifying a plurality of storage chunks that are marked as full, the storage chunks storing a plurality of tree elements, the storage chunks being formed on one or more of the storage devices;
classifying each of the plurality of tree elements as either a live tree element or an unreferenced tree element by traversing a plurality of search trees to identify ones of the plurality of tree elements that are currently active, wherein only tree elements in storage chunks that are marked as full are classified as either a live tree element or an unreferenced tree element; and
identifying one or more of the plurality of storage chunks that include only unreferenced tree elements and no live tree elements, and reclaiming the identified one or more storage chunks.
13 Assignments
0 Petitions
Accused Products
Abstract
Structures and processes for garbage collection of search trees under Multi-Version Concurrency Control (MVCC). Such search trees may be used to store data within a distributed storage system. A process detects live search tree elements using tracing and then identify storage chunks having no live elements as garbage to be reclaimed. The process can be paused and resumed to reduce impact on other system processing.
-
Citations
18 Claims
-
1. A method for garbage collection in a storage system having a plurality of storage devices, the method comprising:
-
identifying a plurality of storage chunks that are marked as full, the storage chunks storing a plurality of tree elements, the storage chunks being formed on one or more of the storage devices; classifying each of the plurality of tree elements as either a live tree element or an unreferenced tree element by traversing a plurality of search trees to identify ones of the plurality of tree elements that are currently active, wherein only tree elements in storage chunks that are marked as full are classified as either a live tree element or an unreferenced tree element; and identifying one or more of the plurality of storage chunks that include only unreferenced tree elements and no live tree elements, and reclaiming the identified one or more storage chunks. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
-
-
10. A distributed storage system, comprising:
-
a plurality of storage devices; and one or more processors operatively coupled to the plurality of storage devices, the one or more processors being configured to; identify a plurality of storage chunks that are marked as full, the storage chunks storing a plurality of tree elements, the storage chunks being formed on one or more of the storage devices; classify each of the plurality of tree elements as either a live tree element or an unreferenced tree element by traversing a plurality of search trees to identify ones of the plurality of tree elements that are currently active, wherein only tree elements in the storage chunks that are marked as full are classified as either a live tree element or an unreferenced tree element; and identify one or more of the plurality of storage chunks that include only unreferenced tree elements and no live tree elements, and reclaim the identified one or more storage chunks. - View Dependent Claims (11, 12, 13, 14, 15, 16, 17, 18)
-
Specification