Enhanced reliability in deduplication technology over storage clouds
Abstract
Methods and systems for enhancing reliability in deduplication over storage clouds are provided. A method includes: determining a weight for each of a plurality of duplicate files based on parameters associated with a respective storage device of each of the plurality of duplicate files; and designating one of the plurality of duplicate files as a master copy based on the determined weight.
16 Claims
1. A method of file deduplication implemented in a computer infrastructure comprising a combination of hardware and software, the method comprising:
performing, by a computer processor, a file deduplication process comprising:
determining, by the computer processor, a weight for each of a plurality of duplicate files, wherein the weight is based on:
(i) parameters associated with a respective storage device of each of the plurality of duplicate files and (ii) a respective weighting factor associated with each one of the parameters; and
obtaining numerical values for the each one of the parameters and the respective weighting factors; and
designating, by the computer processor, one of the plurality of duplicate files as a master copy based on the determined weight.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
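As a rough illustration of the weighting recited in claim 1, the sketch below assumes that the weight is a weighted sum of the numerical parameter values and that the copy with the highest weight becomes the master. The claim itself specifies neither the combining function nor the selection rule. The parameter names (`uptime`, `raid_level`, `free_space`), their values, the weighting factors, and the file paths are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DuplicateFile:
    path: str
    # Numerical values of the parameters of the storage device holding this copy
    # (hypothetical parameter names, normalized to the range 0..1).
    device_params: dict[str, float]


def weight(copy: DuplicateFile, factors: dict[str, float]) -> float:
    # Assumed combining rule: sum of (parameter value * weighting factor).
    return sum(factors[name] * value for name, value in copy.device_params.items())


def designate_master(copies: list[DuplicateFile], factors: dict[str, float]) -> DuplicateFile:
    # Assumed selection rule: the copy with the highest weight is the master.
    return max(copies, key=lambda c: weight(c, factors))


# Hypothetical weighting factors, one per parameter.
FACTORS = {"uptime": 0.5, "raid_level": 0.3, "free_space": 0.2}

copies = [
    DuplicateFile("deviceA/report.doc", {"uptime": 0.9, "raid_level": 0.5, "free_space": 0.4}),
    DuplicateFile("deviceB/report.doc", {"uptime": 0.6, "raid_level": 1.0, "free_space": 0.8}),
]

master = designate_master(copies, FACTORS)
print(master.path)
```

With these made-up numbers the copy on `deviceB` scores higher, so it is designated as the master copy.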
11. A system comprising:
one or more computer processors;
one or more computer readable hardware storage device;
program instructions stored on the one or more computer readable hardware storage device for execution by at least one of the one or more processors, the program instructions comprising:
program instructions to identify duplicate files stored at different storage devices;
program instructions to determine a weight for each one of the duplicate files based on: (i) parameters associated with the storage devices and (ii) weighting factors defined for the parameters; and
program instructions to designate one of the duplicate files as a master copy based on the determined weights.
Dependent claims: 12, 13
14. A computer program product comprising:
one or more computer readable hardware storage device and program instructions stored on the one or more computer readable hardware storage device, the program instructions comprising:
program instructions to determine a hash value for each of a plurality of files;
program instructions to determine a set of duplicate files based on the hash values; and
program instructions to deduplicate the set of duplicate files, wherein the deduplicating comprises:
determining a weight for each one of the duplicate files, wherein the weight is based on parameters associated with storage devices;
designating a master copy of the set based on the weight of each one of the duplicate files; and
nominating remaining files in the set, other than the master copy, for deletion.
Dependent claims: 15, 16
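The sequence in claim 14 (hash each file, group files with equal hashes, pick a master, nominate the rest for deletion) can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 is chosen as the hash function, file contents are held in memory, the per-copy weights are supplied precomputed, and the highest weight wins. None of these choices is fixed by the claim, and the paths and weights are hypothetical.

```python
import hashlib
from collections import defaultdict


def find_duplicate_sets(files: dict[str, bytes]) -> list[list[str]]:
    # Group paths by the hash of their content (SHA-256 is an assumption).
    by_hash: dict[str, list[str]] = defaultdict(list)
    for path, content in files.items():
        by_hash[hashlib.sha256(content).hexdigest()].append(path)
    # Only groups with more than one member are sets of duplicates.
    return [paths for paths in by_hash.values() if len(paths) > 1]


def deduplicate(dup_set: list[str], weights: dict[str, float]) -> tuple[str, list[str]]:
    # Assumed rule: the copy with the highest weight becomes the master.
    master = max(dup_set, key=lambda p: weights[p])
    # Every other copy is only nominated for deletion, not deleted here.
    nominated = [p for p in dup_set if p != master]
    return master, nominated


# Hypothetical in-memory files on three storage devices.
files = {
    "dev1/a.txt": b"hello",
    "dev2/a_copy.txt": b"hello",
    "dev3/b.txt": b"other",
}
# Hypothetical precomputed weights derived from storage-device parameters.
weights = {"dev1/a.txt": 0.68, "dev2/a_copy.txt": 0.76}

plans = [deduplicate(s, weights) for s in find_duplicate_sets(files)]
print(plans)
```

Returning the nominated paths instead of deleting them mirrors the claim wording, which only nominates the remaining files for deletion.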
Specification