Managing deduplicated data
Abstract
Facilitating deduplication of data in a computing system without managing access to reference count variables. A method embodiment commences upon detecting a first data unit and calculating a first checksum value. At a later time, a second data unit is received and the system calculates a second checksum value. If the second checksum value is the same as the first checksum value, then the first data unit and the second data unit are the same data and need not be duplicated. In such cases, an entry in the metadata points to the location of the first data unit that is already stored. Additional entries are made in the metadata to associate a Boolean usage state flag and a Boolean deletion state flag with the second checksum value. Periodic scans of the metadata are performed. When both Boolean flags are in a particular state, the deduplicated data is deleted.
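The write path described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `DedupStore`, `DedupEntry`, and `write` are hypothetical, SHA-256 is an assumed choice of checksum (the abstract says only "checksum"), and the initial flag values are an assumption.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class DedupEntry:
    """Hypothetical metadata record kept per checksum value."""
    location: int            # index of the stored data unit
    in_use: bool = True      # Boolean usage state flag (initial value assumed)
    deletion: bool = False   # Boolean deletion state flag (initial value assumed)


class DedupStore:
    """Toy store that keeps one physical copy per distinct checksum."""

    def __init__(self):
        self.physical = []   # stored data units, addressed by list index
        self.metadata = {}   # checksum -> DedupEntry

    def write(self, data: bytes) -> int:
        # SHA-256 is an assumption; any checksum function could stand in.
        checksum = hashlib.sha256(data).hexdigest()
        entry = self.metadata.get(checksum)
        if entry is None:
            # First time this content is seen: store it and record its location.
            self.physical.append(data)
            entry = DedupEntry(location=len(self.physical) - 1)
            self.metadata[checksum] = entry
        else:
            # Same checksum as an earlier unit: point at the existing copy
            # instead of storing the data a second time.
            entry.in_use = True
            entry.deletion = False
        return entry.location


store = DedupStore()
first = store.write(b"block A")
second = store.write(b"block A")   # duplicate content, no new physical copy
third = store.write(b"block B")
```

Note that no reference count is incremented on the duplicate write; the entry only carries the two Boolean flags, which is the point the abstract emphasizes.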
20 Claims
1. A method comprising:
generating a map entry in mapping metadata to map a data unit to a physical storage location, wherein the data unit is a deduplicated data unit and corresponds to a deduplicated entry in deduplication metadata, the mapping metadata and the deduplication metadata are different data structures, and a non-deletion state is set in the deduplication metadata for the data unit when the map entry exists in the mapping metadata for the data unit; and
performing a scan of the mapping metadata for the data unit, wherein an in-use state is set in the deduplication metadata for the data unit when the map entry is detected for the data unit in the mapping metadata, a not-in-use state and a deletion state are set for the data unit in the deduplication metadata when the map entry is not detected in the mapping metadata for the data unit, and the data unit is deleted from the physical storage location based at least in part on a result of the scan when the data unit is determined to correspond to the not-in-use state and the deletion state.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A computer program, embodied in a non-transitory computer readable medium, the non-transitory computer readable medium having stored thereon a sequence of instructions which, when stored in memory and executed by one or more processors, causes the one or more processors to perform a set of acts, the set of acts comprising:
generating a map entry in a mapping metadata to map a data unit to a physical storage location, wherein the data unit is a deduplicated data unit and corresponds to a deduplicated entry in deduplication metadata, the mapping metadata and the deduplication metadata are different data structures, and a non-deletion state is set in the deduplication metadata for the data unit when the map entry exists in the mapping metadata for the data unit; and
performing a scan of the mapping metadata for the data unit, wherein an in-use state is set in the deduplication metadata for the data unit when the map entry is detected for the data unit in the mapping metadata, a not-in-use state and a deletion state are set for the data unit in the deduplication metadata when the map entry is not detected in the mapping metadata for the data unit, and the data unit is deleted from the physical storage location based at least in part on a result of the scan when the data unit is determined to correspond to the not-in-use state and the delete state.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A system comprising:
a non-transitory storage medium having stored thereon a sequence of instructions; and
one or more processors that execute the sequence of instructions to cause the one or more processors to perform a set of acts, the set of acts comprising:
generating a map entry in a mapping metadata to map a data unit to a physical storage location, wherein the data unit is a deduplicated data unit and corresponds to a deduplicated entry in deduplication metadata, the mapping metadata and the deduplication metadata are different data structures, and a non-deletion state is set in the deduplication metadata for the data unit when the map entry exists in the mapping metadata for the data unit; and
performing a scan of the mapping metadata for the data unit, wherein an in-use state is set in the deduplication metadata for the data unit when the map entry is detected for the data unit in the mapping metadata, a not-in-use state and a deletion state are set for the data unit in the deduplication metadata when the map entry is not detected in the mapping metadata for the data unit, and the data unit is deleted from the physical storage location based at least in part on a result of the scan when the data unit is determined to correspond to the not-in-use state and the deletion state.
Dependent claims: 20
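The scan recited in the independent claims can be sketched roughly as follows. This is an illustrative reading under stated assumptions, not the claimed implementation: `DedupEntry`, `scan`, and the use of plain dictionaries for the mapping metadata, the deduplication metadata, and the physical store are all hypothetical, and the sketch collapses the state update and the deletion into a single pass for brevity.

```python
from dataclasses import dataclass


@dataclass
class DedupEntry:
    """Hypothetical deduplication-metadata record for one data unit."""
    location: int           # physical storage location of the data unit
    in_use: bool = True     # in-use / not-in-use state
    deletion: bool = False  # deletion / non-deletion state


def scan(mapping_meta: dict, dedup_meta: dict, physical: dict) -> list:
    """Scan the mapping metadata for every deduplicated data unit.

    mapping_meta: data unit id -> physical location (map entries)
    dedup_meta:   data unit id -> DedupEntry (a separate data structure)
    physical:     physical location -> stored bytes
    Returns the ids of the data units deleted by this scan.
    """
    deleted = []
    # Iterate over a snapshot because entries may be removed during the loop.
    for unit_id, entry in list(dedup_meta.items()):
        if unit_id in mapping_meta:
            # Map entry detected: in-use state and non-deletion state.
            entry.in_use = True
            entry.deletion = False
        else:
            # Map entry not detected: not-in-use state and deletion state.
            entry.in_use = False
            entry.deletion = True
        if not entry.in_use and entry.deletion:
            # Both states agree, so the data unit is removed from storage.
            physical.pop(entry.location, None)
            del dedup_meta[unit_id]
            deleted.append(unit_id)
    return deleted


mapping_meta = {"unit-1": 0}                      # only unit-1 is still mapped
dedup_meta = {"unit-1": DedupEntry(0), "unit-2": DedupEntry(1)}
physical = {0: b"block A", 1: b"block B"}
removed = scan(mapping_meta, dedup_meta, physical)
```

In this toy run only `unit-2` lacks a map entry, so it is the only unit removed. The decision relies on the presence or absence of map entries and the two state flags, with no reference count consulted.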
Specification