Performance of asynchronous replication in HSM integrated storage systems
Abstract
A mechanism is provided in a data processing system for asynchronous replication in a hierarchical storage management integrated storage system. For a given file to be replicated from a primary storage system to a remote storage system, the primary storage system accesses the remote storage system to determine file existence and migration status at the remote storage system for the given file. Responsive to the primary storage system determining that the given file exists and has been migrated from first tier storage to second tier storage at the remote storage system, the primary storage system determines a first performance penalty for file recall and a second performance penalty for sending excess data from the primary storage system to the remote storage system. Responsive to the primary storage system determining that the first performance penalty is greater than the second performance penalty, the primary storage system sends whole file data for the given file to the remote storage system to replicate the given file at the remote storage system.
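The decision described in the abstract can be sketched as a small function. This is an illustrative sketch only, not the patented implementation: the names `RemoteStatus`, `Action`, and `choose_action` are hypothetical, the two penalties are assumed to be precomputed estimates in seconds, and the `NOT_COVERED` outcome simply marks the cases for which the abstract does not state an action.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    SEND_WHOLE_FILE = "send_whole_file"
    # The abstract does not say what happens in the remaining cases,
    # so this sketch only flags them instead of guessing.
    NOT_COVERED = "not_covered_by_abstract"


@dataclass
class RemoteStatus:
    """Hypothetical result of querying the remote storage system."""
    exists: bool    # the file is present at the remote storage system
    migrated: bool  # the file was moved from first tier to second tier


def choose_action(status: RemoteStatus,
                  recall_penalty: float,
                  whole_file_penalty: float) -> Action:
    """Return SEND_WHOLE_FILE only when all three stated conditions hold."""
    if (status.exists and status.migrated
            and recall_penalty > whole_file_penalty):
        return Action.SEND_WHOLE_FILE
    return Action.NOT_COVERED
```

For example, `choose_action(RemoteStatus(True, True), 120.0, 30.0)` selects the whole-file transfer, because recalling the migrated file is estimated to cost more than resending it.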
21 Citations
19 Claims
1. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a computing device, causes the computing device to:
- for a given file to be replicated from a primary storage system to a remote storage system, access the remote storage system to determine file existence and migration status at the remote storage system for the given file;
- responsive to the primary storage system determining that the given file exists at the remote storage system and has been migrated from first tier storage to second tier storage at the remote storage system, determine a first performance penalty for recall of the given file from the second tier storage to the first tier storage at the remote storage system and a second performance penalty for sending whole file data of the given file from the primary storage system to the remote storage system, wherein the primary storage system determines the first performance penalty as follows:
((T_mnt + T_seek + S_fold/R_taperd) + (T_hashpri * (S_fnew/S_hashblk)) + (T_hashrem * (S_fold/S_hashblk)) + ((S_hash * (S_fold/S_hashblk) / R_net))),
wherein T_mnt represents a time it would take to mount the second tier storage at the remote site, wherein T_seek represents a time it would take to seek the given file in the second tier storage, wherein S_fold represents a size of the given file in the second tier storage, wherein R_taperd represents a read throughput of the second tier storage, wherein T_hashpri represents an average time to calculate a hash of a block size at the primary storage system, wherein S_fnew represents an entire file size of the given file at the primary storage system, wherein S_hashblk represents block size of the given file at the primary storage system, wherein T_hashrem represents average time to calculate a hash of a block size at the remote storage system, wherein S_hash represents size of the calculated hash, and wherein R_net represents network transfer throughput obtained from past replications; and
- responsive to the primary storage system determining that the first performance penalty is greater than the second performance penalty, send whole file data for the given file from the primary storage system to the remote storage system to replicate the given file at the remote storage system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
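The first performance penalty formula in claim 1 can be evaluated term by term, as in the sketch below. The field names mirror the claim's symbols. The units (seconds, bytes, bytes per second) and all sample values are assumptions chosen for illustration; the claim does not specify them. `PenaltyInputs` and `first_penalty` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PenaltyInputs:
    t_mnt: float      # time to mount the second tier storage (s)
    t_seek: float     # time to seek the file in the second tier storage (s)
    s_fold: float     # size of the file in the second tier storage (bytes)
    r_taperd: float   # read throughput of the second tier storage (bytes/s)
    t_hashpri: float  # average time to hash one block at the primary (s)
    s_fnew: float     # entire file size at the primary (bytes)
    s_hashblk: float  # hash block size (bytes)
    t_hashrem: float  # average time to hash one block at the remote (s)
    s_hash: float     # size of one calculated hash (bytes)
    r_net: float      # network throughput from past replications (bytes/s)


def first_penalty(p: PenaltyInputs) -> float:
    """Evaluate the claim's formula as the sum of its four terms."""
    recall = p.t_mnt + p.t_seek + p.s_fold / p.r_taperd
    hash_primary = p.t_hashpri * (p.s_fnew / p.s_hashblk)
    hash_remote = p.t_hashrem * (p.s_fold / p.s_hashblk)
    hash_transfer = (p.s_hash * (p.s_fold / p.s_hashblk)) / p.r_net
    return recall + hash_primary + hash_remote + hash_transfer


MIB = 1024 * 1024
# Assumed sample values, not taken from the patent.
sample = PenaltyInputs(
    t_mnt=60.0, t_seek=30.0,
    s_fold=1024 * MIB, r_taperd=128 * MIB,
    t_hashpri=0.25, s_fnew=2048 * MIB, s_hashblk=4 * MIB,
    t_hashrem=0.25, s_hash=32.0, r_net=64 * MIB,
)
penalty_seconds = first_penalty(sample)
```

With these assumed inputs the recall term is 98 s, hashing costs 128 s at the primary and 64 s at the remote, and the hash transfer is negligible, for a total of roughly 290 s. The claim then compares this value against the second performance penalty, whose formula is not given in this section.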
11. A method, in a data processing system, for asynchronous replication in a hierarchical storage management integrated storage system, the method comprising:
- for a given file to be replicated from a primary storage system to a remote storage system, accessing the remote storage system to determine file existence and migration status at the remote storage system for the given file;
- responsive to the primary storage system determining that the given file exists at the remote storage system and has been migrated from first tier storage to second tier storage at the remote storage system, determining a first performance penalty for recall of the given file from the second tier storage to the first tier storage at the remote storage system and a second performance penalty for sending whole file data of the given file from the primary storage system to the remote storage system, wherein the primary storage system determines the first performance penalty as follows:
((T_mnt + T_seek + S_fold/R_taperd) + (T_hashpri * (S_fnew/S_hashblk)) + (T_hashrem * (S_fold/S_hashblk)) + ((S_hash * (S_fold/S_hashblk) / R_net))),
wherein T_mnt represents a time it would take to mount the second tier storage at the remote site, wherein T_seek represents a time it would take to seek the given file in the second tier storage, wherein S_fold represents a size of the given file in the second tier storage, wherein R_taperd represents a read throughput of the second tier storage, wherein T_hashpri represents an average time to calculate a hash of a block size at the primary storage system, wherein S_fnew represents an entire file size of the given file at the primary storage system, wherein S_hashblk represents block size of the given file at the primary storage system, wherein T_hashrem represents average time to calculate a hash of a block size at the remote storage system, wherein S_hash represents size of the calculated hash, and wherein R_net represents network transfer throughput obtained from past replications; and
- responsive to the primary storage system determining that the first performance penalty is greater than the second performance penalty, sending whole file data for the given file from the primary storage system to the remote storage system to replicate the given file at the remote storage system.
Dependent claims: 12, 13, 14, 15
16. An apparatus comprising:
- a processor; and
- a memory coupled to the processor, wherein the memory comprises instructions which, when executed by the processor, cause the processor to:
- for a given file to be replicated from a primary storage system to a remote storage system, access the remote storage system to determine file existence and migration status at the remote storage system for the given file;
- responsive to the primary storage system determining that the given file exists at the remote storage system and has been migrated from first tier storage to second tier storage at the remote storage system, determine a first performance penalty for recall of the given file from the second tier storage to the first tier storage at the remote storage system and a second performance penalty for sending whole file data of the given file from the primary storage system to the remote storage system, wherein the primary storage system determines the first performance penalty as follows:
((T_mnt + T_seek + S_fold/R_taperd) + (T_hashpri * (S_fnew/S_hashblk)) + (T_hashrem * (S_fold/S_hashblk)) + ((S_hash * (S_fold/S_hashblk) / R_net))),
wherein T_mnt represents a time it would take to mount the second tier storage at the remote site, wherein T_seek represents a time it would take to seek the given file in the second tier storage, wherein S_fold represents a size of the given file in the second tier storage, wherein R_taperd represents a read throughput of the second tier storage, wherein T_hashpri represents an average time to calculate a hash of a block size at the primary storage system, wherein S_fnew represents an entire file size of the given file at the primary storage system, wherein S_hashblk represents block size of the given file at the primary storage system, wherein T_hashrem represents average time to calculate a hash of a block size at the remote storage system, wherein S_hash represents size of the calculated hash, and wherein R_net represents network transfer throughput obtained from past replications; and
- responsive to the primary storage system determining that the first performance penalty is greater than the second performance penalty, send whole file data for the given file from the primary storage system to the remote storage system to replicate the given file at the remote storage system.
Dependent claims: 17, 18, 19
Specification