Managing data deduplication in storage systems
First Claim
1. A method for use in managing data deduplication in storage systems, the method comprising:
performing a data deduplication process by applying a deduplicating technique to data of a set of deduplication domains, wherein a set of data deduplication processes is scheduled for the set of deduplication domains, wherein each data deduplication process of the set of data deduplication processes iterates over data of a deduplication domain of the set of deduplication domains based on a respective priority associated with each deduplication domain, wherein each data deduplication process of the set of data deduplication processes iterates over data concurrently with other data deduplication processes, wherein a priority associated with a deduplication domain indicates an amount of time after which a next iteration is scheduled for the deduplication domain;
evaluating characteristics of data deduplication performed on each deduplication domain of the set of deduplication domains by the respective data deduplication process during a previous iteration, wherein evaluating the characteristics of data deduplication includes determining the rate at which data of each deduplication domain is deduplicated and a probability of determining duplicate data blocks in each deduplication domain during a next iteration; and
based on the evaluation, effecting execution of the respective data deduplication process for each deduplication domain, wherein effecting execution of the respective data deduplication process for each deduplication domain includes updating the respective priority associated with each deduplication domain for scheduling the next iteration, wherein updating a priority associated with a deduplication domain includes changing an amount of time after which a next iteration is scheduled for the deduplication domain.
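The evaluate-and-update step recited in claim 1 can be illustrated with a minimal Python sketch. The claim does not prescribe a formula, so everything below is an assumption made for illustration: the names IterationStats, Domain, and update_priority, the use of an averaged historical rate as a stand-in for the probability of finding duplicates in the next iteration, the equal weighting of the two measures, and the interval bounds.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical bounds on the rescheduling delay (not taken from the patent).
MIN_INTERVAL_S = 60.0
MAX_INTERVAL_S = 86400.0


@dataclass
class IterationStats:
    blocks_scanned: int       # blocks examined during one iteration
    blocks_deduplicated: int  # blocks found to be duplicates in that iteration


@dataclass
class Domain:
    name: str
    interval_s: float  # the "priority": seconds until the next iteration


def dedup_rate(stats: IterationStats) -> float:
    """Fraction of scanned blocks that were deduplicated (0.0 if none scanned)."""
    if stats.blocks_scanned == 0:
        return 0.0
    return stats.blocks_deduplicated / stats.blocks_scanned


def estimate_probability(history: List[IterationStats]) -> float:
    """Assumed proxy for the chance of finding duplicates next time:
    the mean deduplication rate over the recorded iterations."""
    if not history:
        return 0.0
    return sum(dedup_rate(s) for s in history) / len(history)


def update_priority(domain: Domain, history: List[IterationStats]) -> float:
    """Change the delay before the domain's next iteration.

    A high score shortens the delay (down to half); a low score lengthens it
    (up to double). The result is clamped to the hypothetical bounds above.
    """
    if not history:
        return domain.interval_s  # nothing to evaluate yet
    rate = dedup_rate(history[-1])            # previous iteration only
    probability = estimate_probability(history)
    score = 0.5 * rate + 0.5 * probability    # assumed equal weighting, in [0, 1]
    factor = 2.0 - 1.5 * score                # score 0 -> 2.0x, score 1 -> 0.5x
    new_interval = domain.interval_s * factor
    domain.interval_s = min(MAX_INTERVAL_S, max(MIN_INTERVAL_S, new_interval))
    return domain.interval_s
```

Under these assumptions, a domain whose last iteration deduplicated every scanned block is revisited twice as soon, while a domain that yielded no duplicates waits twice as long before its next iteration.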
Abstract
A method is used in managing data deduplication in storage systems. A data deduplication process is performed by applying a deduplicating technique to data of a deduplication domain. The data deduplication process is scheduled based on a priority. Characteristics of data deduplication performed by the data deduplication process are evaluated. Based on the evaluation, execution of the data deduplication process is effected.
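The loop summarized in the abstract (schedule by priority, deduplicate, evaluate, adjust) can be sketched as a small simulated-time scheduler in Python. This is an illustrative sketch only: run_schedule, example_dedup, and example_adjust are hypothetical names, the halve-or-double rule is an assumed policy, and the domains are processed one at a time here for simplicity, whereas the claims describe the processes iterating concurrently.

```python
import heapq
from typing import Callable, Dict, List, Tuple


def run_schedule(
    intervals: Dict[str, float],
    dedup_fn: Callable[[str], float],
    adjust_fn: Callable[[float, float], float],
    horizon_s: float,
) -> List[Tuple[float, str]]:
    """Simulate priority-based scheduling up to horizon_s seconds.

    intervals maps each domain name to its current delay (its priority).
    dedup_fn(name) stands in for one deduplication iteration and returns the
    fraction of blocks found to be duplicates.
    adjust_fn(interval, fraction) returns the new delay for that domain.
    Returns the (time, domain) pairs in the order the iterations ran.
    """
    intervals = dict(intervals)  # work on a copy; the caller's dict is untouched
    # Min-heap keyed by the time at which each domain is next due.
    heap = [(delay, name) for name, delay in intervals.items()]
    heapq.heapify(heap)
    log: List[Tuple[float, str]] = []
    while heap and heap[0][0] <= horizon_s:
        now, name = heapq.heappop(heap)
        fraction = dedup_fn(name)                 # perform one iteration
        log.append((now, name))
        intervals[name] = adjust_fn(intervals[name], fraction)  # evaluate, update
        heapq.heappush(heap, (now + intervals[name], name))     # schedule next
    return log


def example_dedup(name: str) -> float:
    """Hypothetical workload: domain "A" is duplicate-rich, others are not."""
    return 0.8 if name == "A" else 0.0


def example_adjust(interval: float, fraction: float) -> float:
    """Assumed policy: halve the delay when most blocks were duplicates,
    otherwise double it, within the range 5 to 40 seconds."""
    if fraction > 0.5:
        return max(5.0, interval / 2)
    return min(40.0, interval * 2)


if __name__ == "__main__":
    runs = run_schedule({"A": 10.0, "B": 10.0}, example_dedup, example_adjust, 40.0)
    for when, name in runs:
        print(f"t={when:>5.1f}s  domain {name}")
```

With both domains starting at a 10-second delay, the duplicate-rich domain "A" is revisited every 5 seconds after its first iteration, while "B" backs off and runs only twice within the 40-second horizon.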
16 Claims
9. A system for use in managing data deduplication in storage systems, the system comprising:
a processor configured to:
perform a data deduplication process by applying a deduplicating technique to data of a set of deduplication domains, wherein a set of data deduplication processes is scheduled for the set of deduplication domains, wherein each data deduplication process of the set of data deduplication processes iterates over data of a deduplication domain of the set of deduplication domains based on a respective priority associated with each deduplication domain, wherein each data deduplication process of the set of data deduplication processes iterates over data concurrently with other data deduplication processes, wherein a priority associated with a deduplication domain indicates an amount of time after which a next iteration is scheduled for the deduplication domain;
evaluate characteristics of data deduplication performed on each deduplication domain of the set of deduplication domains by the respective data deduplication process during a previous iteration, wherein evaluating the characteristics of data deduplication includes determining the rate at which data of each deduplication domain is deduplicated and a probability of determining duplicate data blocks in each deduplication domain during a next iteration; and
effect, based on the evaluation, execution of the respective data deduplication process for each deduplication domain, wherein effecting execution of the respective data deduplication process for each deduplication domain includes updating the respective priority associated with each deduplication domain for scheduling the next iteration, wherein updating a priority associated with a deduplication domain includes changing an amount of time after which a next iteration is scheduled for the deduplication domain.
Specification