Limiting deduplication based on predetermined criteria
First Claim
1. A method, comprising:
- receiving, by a processor, data for deduplication, wherein a deduplication ratio is a factor by which storage requirements of the received data are to be reduced and a data deduplication threshold is a selected amount of the received data that is deduplicated to determine whether the deduplication ratio is achievable for the received data, and wherein the deduplication ratio is set such that an attempt is made to reduce storage requirements of the data by at least a factor of 20;
- determining whether the received data has been quiescent at least for a period of time indicated in a data quiescence measure, wherein the period of time indicated in the data quiescence measure is at least a plurality of days;
- in response to determining that the received data has been quiescent for at least the period of time indicated in the data quiescence measure, performing:
  - deduplicating the selected amount of the received data to generate an amount of deduplicated data;
  - determining whether the generated amount of deduplicated data exceeds the data deduplication threshold, wherein the data duplication threshold is set to be at least 100 gigabytes;
  - in response to determining that the generated amount of deduplicated data exceeds the data deduplication threshold, determining whether the generated amount of deduplicated data has achieved the deduplication ratio; and
  - in response to determining that the generated amount of deduplicated data has not achieved the deduplication ratio, discontinuing the deduplicating of the received data and switching to a different set of data for deduplication; and
- in response to determining that the received data has not been quiescent for at least the period of time indicated in the data quiescence measure, receiving additional data for deduplication, and wherein deduplication of the data is abandoned when user specified deduplication parameters including the deduplication ratio, the data quiescence measure, and the data duplication threshold are not satisfied.
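The decision flow recited in this claim can be sketched in Python as follows. This is only an illustrative reading of the claim language, not the patented implementation: the names `DedupParams`, `Decision`, and `decide` are hypothetical, the "generated amount of deduplicated data" is assumed to mean the number of input bytes processed so far, and the achieved ratio is assumed to be input bytes divided by bytes actually stored.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    WAIT_FOR_MORE_DATA = "wait"    # data is not yet quiescent
    CONTINUE = "continue"          # keep deduplicating this data set
    SWITCH_DATA_SET = "switch"     # discontinue and move to other data


@dataclass(frozen=True)
class DedupParams:
    """User-specified parameters (hypothetical container)."""
    ratio: float = 20.0                  # reduce storage by at least a factor of 20
    quiescence_days: float = 2.0         # assumed value for "a plurality of days"
    threshold_bytes: int = 100 * 10**9   # at least 100 gigabytes


def decide(params: DedupParams, days_quiescent: float,
           bytes_processed: int, bytes_stored: int) -> Decision:
    # Step 1: data that has changed too recently is not deduplicated yet.
    if days_quiescent < params.quiescence_days:
        return Decision.WAIT_FOR_MORE_DATA
    # Step 2: the ratio is only evaluated once the threshold is exceeded.
    if bytes_processed <= params.threshold_bytes:
        return Decision.CONTINUE
    # Step 3: compare the achieved ratio against the target ratio.
    achieved = bytes_processed / bytes_stored if bytes_stored else float("inf")
    if achieved < params.ratio:
        return Decision.SWITCH_DATA_SET
    return Decision.CONTINUE
```

Under these assumptions, 200 GB of quiescent input that still occupies 40 GB after deduplication has achieved a ratio of only 5, so `decide` returns `Decision.SWITCH_DATA_SET`.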
Abstract
Data for deduplication is received. The received data is deduplicated if selected conditions corresponding to the deduplication are satisfied, wherein the selected conditions include a deduplication ratio, a data deduplication threshold, and a data quiescence measure. Deduplication of the received data is discontinued if the selected conditions corresponding to the deduplication are not satisfied.
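The text above does not say how the deduplication ratio is measured. As a minimal sketch, assuming fixed-size chunking with SHA-256 fingerprints (a common technique, chosen here only for illustration), the ratio could be computed as the original size divided by the total size of the unique chunks. The function name `dedup_ratio` and the 4096-byte chunk size are assumptions.

```python
import hashlib


def dedup_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Return original size divided by the total size of unique chunks."""
    if not data:
        return 1.0  # nothing to reduce
    unique_sizes = {}
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # Identical chunks share a fingerprint and are counted once.
        unique_sizes.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    return len(data) / sum(unique_sizes.values())
```

For example, forty identical 4096-byte chunks collapse to a single stored chunk, giving a ratio of 40, which would satisfy a target factor of 20.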
16 Claims
1. A method, as recited in full under First Claim above.
Dependent claims: 2, 9, 13
3. A system, comprising:
- a memory; and
- a processor coupled to the memory, wherein the processor performs operations, the operations comprising:
  - receiving data for deduplication, wherein a deduplication ratio is a factor by which storage requirements of the received data are to be reduced and a data deduplication threshold is a selected amount of the received data that is deduplicated to determine whether the deduplication ratio is achievable for the received data, and wherein the deduplication ratio is set such that an attempt is made to reduce storage requirements of the data by at least a factor of 20;
  - determining whether the received data has been quiescent at least for a period of time indicated in a data quiescence measure, wherein the period of time indicated in the data quiescence measure is at least a plurality of days;
  - in response to determining that the received data has been quiescent for at least the period of time indicated in the data quiescence measure, performing:
    - deduplicating the selected amount of the received data to generate an amount of deduplicated data;
    - determining whether the generated amount of deduplicated data exceeds the data deduplication threshold, wherein the data duplication threshold is set to be at least 100 gigabytes;
    - in response to determining that the generated amount of deduplicated data exceeds the data deduplication threshold, determining whether the generated amount of deduplicated data has achieved the deduplication ratio; and
    - in response to determining that the generated amount of deduplicated data has not achieved the deduplication ratio, discontinuing the deduplicating of the received data and switching to a different set of data for deduplication; and
  - in response to determining that the received data has not been quiescent for at least the period of time indicated in the data quiescence measure, receiving additional data for deduplication, and wherein deduplication of the data is abandoned when user specified deduplication parameters including the deduplication ratio, the data quiescence measure, and the data duplication threshold are not satisfied.
Dependent claims: 4, 10, 14
5. An article of manufacture including code, wherein the code when executed on a processor performs operations, the operations comprising:
- receiving, by the processor, data for deduplication, wherein a deduplication ratio is a factor by which storage requirements of the received data are to be reduced and a data deduplication threshold is a selected amount of the received data that is deduplicated to determine whether the deduplication ratio is achievable for the received data, and wherein the deduplication ratio is set such that an attempt is made to reduce storage requirements of the data by at least a factor of 20;
- determining, by the processor, whether the received data has been quiescent at least for a period of time indicated in a data quiescence measure, wherein the period of time indicated in the data quiescence measure is at least a plurality of days;
- in response to determining, by the processor, that the received data has been quiescent for at least the period of time indicated in the data quiescence measure, performing:
  - deduplicating the selected amount of the received data to generate an amount of deduplicated data;
  - determining whether the generated amount of deduplicated data exceeds the data deduplication threshold, wherein the data duplication threshold is set to be at least 100 gigabytes;
  - in response to determining that the generated amount of deduplicated data exceeds the data deduplication threshold, determining whether the generated amount of deduplicated data has achieved the deduplication ratio; and
  - in response to determining that the generated amount of deduplicated data has not achieved the deduplication ratio, discontinuing the deduplicating of the received data and switching to a different set of data for deduplication; and
- in response to determining, by the processor, that the received data has not been quiescent for at least the period of time indicated in the data quiescence measure, receiving additional data for deduplication, and wherein deduplication of the data is abandoned when user specified deduplication parameters including the deduplication ratio, the data quiescence measure, and the data duplication threshold are not satisfied.
Dependent claims: 6, 11, 15
7. A method for deploying computing infrastructure, comprising integrating machine-readable code into a machine, wherein the code in combination with the machine is capable of performing:
- receiving data for deduplication, wherein a deduplication ratio is a factor by which storage requirements of the received data are to be reduced and a data deduplication threshold is a selected amount of the received data that is deduplicated to determine whether the deduplication ratio is achievable for the received data, and wherein the deduplication ratio is set such that an attempt is made to reduce storage requirements of the data by at least a factor of 20;
- determining whether the received data has been quiescent at least for a period of time indicated in a data quiescence measure, wherein the period of time indicated in the data quiescence measure is at least a plurality of days;
- in response to determining that the received data has been quiescent for at least the period of time indicated in the data quiescence measure, performing:
  - deduplicating the selected amount of the received data to generate an amount of deduplicated data;
  - determining whether the generated amount of deduplicated data exceeds the data deduplication threshold, wherein the data duplication threshold is set to be at least 100 gigabytes;
  - in response to determining that the generated amount of deduplicated data exceeds the data deduplication threshold, determining whether the generated amount of deduplicated data has achieved the deduplication ratio; and
  - in response to determining that the generated amount of deduplicated data has not achieved the deduplication ratio, discontinuing the deduplicating of the received data and switching to a different set of data for deduplication; and
- in response to determining that the received data has not been quiescent for at least the period of time indicated in the data quiescence measure, receiving additional data for deduplication, and wherein deduplication of the data is abandoned when user specified deduplication parameters including the deduplication ratio, the data quiescence measure, and the data duplication threshold are not satisfied.
Dependent claims: 8, 12, 16
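Each independent claim begins with a quiescence test, but the claims do not state how quiescence is detected. One possible sketch, assuming that "quiescent" means the data has not been modified for the configured number of days and that a file's modification time is an acceptable proxy, is shown below. The helper names `quiescent_since` and `is_quiescent` are hypothetical.

```python
import os
import time

SECONDS_PER_DAY = 86_400


def quiescent_since(last_modified: float, quiescence_days: float,
                    now: float) -> bool:
    """True if at least `quiescence_days` have passed since the last change."""
    return (now - last_modified) >= quiescence_days * SECONDS_PER_DAY


def is_quiescent(path: str, quiescence_days: float, now=None) -> bool:
    """Apply the test to a file, using its modification time (an assumption)."""
    now = time.time() if now is None else now
    return quiescent_since(os.stat(path).st_mtime, quiescence_days, now)
```

Passing `now` explicitly keeps the check deterministic, which is convenient when the policy is evaluated in batch over many data sets.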
Specification