Deduplication in an extent-based architecture
First Claim
1. A computerized method for performing deduplication in an extent-based architecture including a storage server, the method comprising:
receiving, by the storage server, a request to remove duplicate data in the storage server;
accessing a log data container associated with a storage volume of the storage server, the log data container including a plurality of entries, wherein each entry is identified by an extent identifier in a data structure stored in a volume associated with the storage server; and
matching an entry in the log data container with another entry in the log data container, determining a donor extent and a recipient extent, and upon determining an external reference count associated with the recipient extent equals a first predetermined value, performing block sharing for the donor extent and the recipient extent, and upon determining the reference count of the donor extent equals a second predetermined value, freeing the donor extent.
Abstract
A request is received to remove duplicate data. A log data container associated with a storage volume in a storage server is accessed. The log data container includes a plurality of entries. Each entry is identified by an extent identifier in a data structure stored in a volume associated with the storage server. For each entry in the log data container, a determination is made whether the entry matches another entry in the log data container. If it does, a donor extent and a recipient extent are determined. If an external reference count associated with the recipient extent equals a first predetermined value, block sharing is performed for the donor extent and the recipient extent. A determination is then made whether the reference count of the donor extent equals a second predetermined value. If it does, the donor extent is freed.
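The control flow in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the `Extent` fields, the fingerprint-based matching, the choice of which entry is the donor, the concrete predetermined values, and the behavior of `share_blocks` are all hypothetical, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed values: the source only names a "first" and a "second"
# predetermined value without stating what they are.
FIRST_PREDETERMINED_VALUE = 1
SECOND_PREDETERMINED_VALUE = 0


@dataclass
class Extent:
    extent_id: int
    fingerprint: str            # hypothetical: used here to decide that two entries match
    ref_count: int = 1
    external_ref_count: int = 1
    freed: bool = False


def share_blocks(donor: Extent, recipient: Extent) -> None:
    # Assumption: block sharing is modeled as moving the donor's
    # references onto the recipient. The source does not define it.
    recipient.ref_count += donor.ref_count
    donor.ref_count = 0


def deduplicate(log_entries: List[int], extent_map: Dict[int, Extent]) -> List[int]:
    """Walk the log entries and return the IDs of the extents that were freed."""
    first_seen: Dict[str, int] = {}   # fingerprint -> extent ID of the first entry seen
    freed: List[int] = []
    for extent_id in log_entries:
        entry = extent_map[extent_id]
        match_id = first_seen.get(entry.fingerprint)
        if match_id is None:
            first_seen[entry.fingerprint] = extent_id
            continue
        # Assumption: the later entry is the donor, the earlier match the recipient.
        donor, recipient = entry, extent_map[match_id]
        if recipient.external_ref_count == FIRST_PREDETERMINED_VALUE:
            share_blocks(donor, recipient)
        if donor.ref_count == SECOND_PREDETERMINED_VALUE:
            donor.freed = True
            freed.append(donor.extent_id)
    return freed


extents = {
    1: Extent(1, "aa"),
    2: Extent(2, "bb"),
    3: Extent(3, "aa"),   # duplicate of extent 1
}
freed_ids = deduplicate([1, 2, 3], extents)
```

In this sketch the two checks are independent, as in the abstract: if the recipient's external reference count does not equal the first value, no sharing happens, the donor keeps its references, and it is therefore not freed.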
58 Citations
26 Claims
1. A computerized method for performing deduplication in an extent-based architecture including a storage server, the method comprising:
receiving, by the storage server, a request to remove duplicate data in the storage server;
accessing a log data container associated with a storage volume of the storage server, the log data container including a plurality of entries, wherein each entry is identified by an extent identifier in a data structure stored in a volume associated with the storage server; and
matching an entry in the log data container with another entry in the log data container, determining a donor extent and a recipient extent, and upon determining an external reference count associated with the recipient extent equals a first predetermined value, performing block sharing for the donor extent and the recipient extent, and upon determining the reference count of the donor extent equals a second predetermined value, freeing the donor extent.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A non-transitory computer-readable storage medium embodied with executable instructions that cause a processor to perform operations for deduplication in an extent-based architecture including a storage server, the operations comprising:
receiving a request to remove duplicate data in the storage server;
accessing a log data container associated with a storage volume of the storage server, the log data container including a plurality of entries, wherein each entry is identified by an extent identifier in a data structure stored in a volume associated with the storage server; and
matching an entry in the log data container with another entry in the log data container, determining a donor extent and a recipient extent, and upon determining an external reference count associated with the recipient extent equals a first predetermined value, performing block sharing for the donor extent and the recipient extent, and upon determining the reference count of the donor extent equals a second predetermined value, freeing the donor extent.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A computerized system comprising:
a processor coupled to a memory through a bus; and
instructions executed from the memory by the processor to cause the processor to
receive a request to remove duplicate data in the computerized system;
access a log data container associated with a storage volume of the storage server, the log data container including a plurality of entries, wherein each entry is identified by an extent identifier in a data structure stored in a volume associated with the storage server; and
match an entry in the log data container with another entry in the log data container, determine a donor extent and a recipient extent, and upon determining an external reference count associated with the recipient extent equals a first predetermined value, performing block sharing for the donor extent and the recipient extent, and upon determining the reference count of the donor extent equals a second predetermined value, freeing the donor extent.
Dependent claims: 18, 19, 20, 21
22. A computerized system comprising:
a storage server coupled to a storage device, the storage server operative to:
receive a request to remove duplicate data in a storage server;
access a log data container associated with a storage volume of the storage server, the log data container including a plurality of entries, wherein each entry is identified by an extent identifier in a data structure stored in a volume associated with the storage server; and
match an entry in the log data container with another entry in the log data container, determine a donor extent and a recipient extent, and upon determining an external reference count associated with the recipient extent equals a first predetermined value, performing block sharing for the donor extent and the recipient extent, and upon determining if the reference count of the donor extent equals a second predetermined value, freeing the donor extent.
23. A computerized method comprising:
receiving, by a storage server, a request to remove duplicate data in the storage server;
accessing a log data container associated with a storage volume of the storage server, the log data container including a plurality of entries, wherein each entry is identified by an extent identifier in a data structure stored in a volume associated with the storage server; and
matching an extent identifier associated with an entry to an extent identifier associated with another entry in the log data container, updating a reference count and a pointer identifier of the extent identifier associated with the entry, updating a reference count and a pointer identifier of the extent identifier associated with the another entry, freeing the extent identified by the extent identifier associated with the entry when the reference count of the extent identifier associated with the entry equals the predetermined value, and freeing the extent identified by the extent identifier associated with the another entry when the reference count of the extent identifier associated with the another entry equals the predetermined value.
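The update-and-free step recited in claim 23 differs from claim 1 in that both matched entries have their reference count and pointer identifier updated, and either extent may be freed. A minimal Python sketch follows. The claim does not say how the counts and pointers change or what the predetermined value is, so the update rule, the `ExtentRecord` fields, and `PREDETERMINED_VALUE` below are assumptions chosen only to make the flow concrete.

```python
from dataclasses import dataclass
from typing import Dict, List

PREDETERMINED_VALUE = 0  # assumed: an extent with no remaining references


@dataclass
class ExtentRecord:
    ref_count: int
    pointer_id: int  # hypothetical: identifies where the extent's data lives


def reconcile(extent_map: Dict[int, ExtentRecord], entry_id: int, other_id: int) -> List[int]:
    """Update both records of a matched pair and return the extent IDs freed."""
    entry, other = extent_map[entry_id], extent_map[other_id]

    # Assumed update rule: the entry hands one reference to the other entry,
    # and both records are pointed at one shared location (arbitrarily, the
    # lower of the two pointer IDs).
    shared_pointer = min(entry.pointer_id, other.pointer_id)
    entry.ref_count -= 1
    other.ref_count += 1
    entry.pointer_id = shared_pointer
    other.pointer_id = shared_pointer

    # Each extent is checked separately and freed when its reference count
    # equals the predetermined value.
    freed = [eid for eid in (entry_id, other_id)
             if extent_map[eid].ref_count == PREDETERMINED_VALUE]
    for eid in freed:
        del extent_map[eid]
    return freed


records = {10: ExtentRecord(ref_count=1, pointer_id=500),
           20: ExtentRecord(ref_count=1, pointer_id=300)}
freed_ids = reconcile(records, entry_id=10, other_id=20)
```

Under these assumptions only one extent of a matched pair reaches the predetermined value, but the check is applied to both records so that a different update rule could free either one.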
24. A non-transitory computer-readable storage medium embodied with executable instructions that cause a processor to perform operations comprising:
receiving a request to remove duplicate data in a storage server;
accessing a log data container associated with a storage volume of the storage server, the log data container including a plurality of entries, wherein each entry is identified by an extent identifier in a data structure stored in a volume associated with the storage server; and
matching an extent identifier associated with an entry to an extent identifier associated with another entry in the log data container, updating a reference count and a pointer identifier of the extent identifier associated with the entry, updating a reference count and a pointer identifier of the extent identifier associated with the another entry, freeing the extent identified by the extent identifier associated with the entry when the reference count of the extent identifier associated with the entry equals the predetermined value, and freeing the extent identified by the extent identifier associated with the another entry when the reference count of the extent identifier associated with the another entry equals the predetermined value.
25. A computerized system comprising:
a processor coupled to a memory through a bus; and
instructions executed from the memory by the processor to cause the processor to
receive a request to remove duplicate data in a storage server;
access a log data container associated with a storage volume of the storage server, the log data container including a plurality of entries, wherein each entry is identified by an extent identifier in a data structure stored in a volume associated with the storage server; and
match an extent identifier associated with an entry to an extent identifier associated with another entry in the log data container, update a reference count and a pointer identifier of the extent identifier associated with the entry, update a reference count and a pointer identifier of the extent identifier associated with the another entry, free the extent identified by the extent identifier associated with the entry when the reference count of the extent identifier associated with the entry equals the predetermined value, and free the extent identified by the extent identifier associated with the another entry when the reference count of the extent identifier associated with the another entry equals the predetermined value.
26. A computerized system comprising:
a storage server coupled to a storage device, the storage server operative to:
receive a request to remove duplicate data in a storage server;
access a log data container associated with a storage volume of the storage server, the log data container including a plurality of entries, wherein each entry is identified by an extent identifier in a data structure stored in a volume associated with the storage server; and
match an extent identifier associated with an entry to an extent identifier associated with another entry in the log data container, update a reference count and a pointer identifier of the extent identifier associated with the entry, update a reference count and a pointer identifier of the extent identifier associated with the another entry, free the extent identified by the extent identifier associated with the entry when the reference count of the extent identifier associated with the entry equals the predetermined value, and free the extent identified by the extent identifier associated with the another entry when the reference count of the extent identifier associated with the another entry equals the predetermined value.
Specification