Tombstones for no longer relevant deduplication entries
Abstract
An implementation of the disclosure provides a system comprising a storage array that comprises a plurality of data blocks and a storage controller coupled to the storage array. The storage controller comprises a processing device to identify a canonical instance of a data block in a vector associated with a deduplication map. The vector represents a plurality of updates to the deduplication map over a determined time period. A deduplication reference representing duplicate data of the data block in the storage array is selected from the deduplication map. The deduplication reference is remapped in the deduplication map to point to the canonical instance. Based on the remapping, an entry in the deduplication map for the deduplication reference is updated with a record. Responsive to detecting that the entry is in a location associated with an original entry of the data block in the deduplication map, the entry with the record is deleted.
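The remapping step described in the abstract can be illustrated with a minimal sketch. It is a hypothetical model, not the implementation disclosed in the patent: the `Entry` class, its `seq`, `fingerprint`, and `block` fields, and the `"tombstone"` string used as the record are all assumptions made for illustration. The vector is modeled as a plain list of updates, and the canonical instance is taken to be the entry with the lowest sequence identifier for a given fingerprint.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    seq: int                      # position of the update in the vector (assumed)
    fingerprint: str              # content hash of the data block (assumed)
    block: int                    # address of the block this entry references
    record: Optional[str] = None  # set after remapping; modeled as a tombstone


def remap_to_canonical(vector):
    """Point every duplicate at the earliest entry with the same fingerprint."""
    canonical = {}  # fingerprint -> earliest Entry seen for that fingerprint
    for entry in sorted(vector, key=lambda e: e.seq):
        first = canonical.setdefault(entry.fingerprint, entry)
        if first is not entry:
            entry.block = first.block   # remap the reference to the canonical instance
            entry.record = "tombstone"  # update the entry with a record
    return canonical


vector = [
    Entry(seq=3, fingerprint="ab12", block=300),
    Entry(seq=1, fingerprint="ab12", block=100),
    Entry(seq=2, fingerprint="cd34", block=200),
]
canonical = remap_to_canonical(vector)
```

In this sketch the entry with `seq=3` duplicates the entry with `seq=1`, so it is remapped to block 100 and marked with a record, while the two earliest entries are left untouched.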
17 Claims
1. A system comprising:
a storage array comprising a plurality of data blocks; and
a storage controller coupled to the storage array, the storage controller comprising a processing device, the processing device to:
identify a canonical instance of a data block in a vector associated with a deduplication map, the vector represents a plurality of updates to the deduplication map over a determined time period;
select, from the deduplication map, a deduplication reference representing duplicate data of the data block in the storage array, wherein the canonical instance represents an earliest occurrence of the duplicate data of the data block in the vector associated with the deduplication map;
remap the deduplication reference in the deduplication map to point to the canonical instance;
update an entry in the deduplication map for the deduplication reference with a record based on the remapped deduplication reference; and
responsive to detecting that the entry is in a location associated with an original entry of the data block in the deduplication map, delete the entry with the record.
Dependent claims: 2, 3, 4, 5, 6.
7. A method comprising:
identifying, by a processing device, a canonical instance of a data block associated with a deduplication map;
selecting, by the processing device, a deduplication reference from the deduplication map, the deduplication reference represents duplicate data of the data block, wherein the canonical instance represents an earliest occurrence of the duplicate data of the data block in an identified vector for the deduplication map;
remapping the deduplication reference in the deduplication map to point to the canonical instance;
updating an entry in the deduplication map for the deduplication reference with a record based on the remapping;
determining, by the processing device, that a location of the entry corresponds to a vector associated with an original entry of the data block in the deduplication map, the vector represents a range of sequence identifiers associated with updates to the deduplication map; and
performing, in view of the determining, a trimming process to trim the entry with the record from the deduplication map.
Dependent claims: 8, 9, 10, 11.
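The determining and trimming steps of the method claim can be sketched as follows. This is an illustrative model under stated assumptions, not the claimed implementation: the deduplication map is a hypothetical `dict` keyed by location (a sequence identifier), each value is either `None` or a record, and the vector associated with the original entry is modeled as a Python `range` of sequence identifiers. The `trim` function name is likewise an assumption.

```python
def trim(dedup_map, original_vector):
    """Drop recorded entries whose location lies in the original entry's vector.

    dedup_map: dict mapping location (a sequence identifier) -> record or None
    original_vector: range of sequence identifiers holding the original entry
    """
    # Collect first, then delete, so the dict is not mutated while iterating.
    trimmed = [
        loc
        for loc, record in dedup_map.items()
        if record is not None and loc in original_vector
    ]
    for loc in trimmed:
        del dedup_map[loc]
    return trimmed


dedup_map = {5: None, 7: "tombstone", 12: "tombstone"}
removed = trim(dedup_map, range(0, 10))
```

Here only the entry at location 7 both carries a record and falls inside the range `0..9`, so it is the only one trimmed; the recorded entry at location 12 stays until its location corresponds to the original entry's vector.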
12. A non-transitory computer readable storage medium storing instructions, which when executed, cause a processing device to:
select, by the processing device, a canonical instance for a data block associated with a plurality of deduplication references in a deduplication map, the deduplication references represent duplicate data of the data block in a vector of the deduplication map;
remap the plurality of deduplication references in the vector to point to the canonical instance, wherein the canonical instance represents an earliest occurrence of the duplicate data of the data block in an identified vector for the deduplication map;
update each entry associated with the plurality of deduplication references with a record based on remapping the plurality of deduplication references; and
responsive to detecting that the location of an entry corresponds to an original entry of the data block in the deduplication map, trim entries from the deduplication map that are associated with each record.
Dependent claims: 13, 14, 15, 16, 17.
Specification