Confirming data consistency in a data storage environment
First Claim
1. A method for confirming the validity of replicated data at a data storage site, the method comprising:
a) utilizing a hash function in the form of a computer algorithm executable by a computer processor, computing a first hash value based on first data stored on a computer readable storage medium at a first data storage site, the first hash value being in a digital format suitable for transmission over a computer network and being smaller in electronic data transmission size than the first data;
b) utilizing the same hash function, computing a second hash value based on second data stored on a computer readable storage medium at a second data storage site remotely connected by a computer network with the first data storage site, the first data having been previously replicated over the computer network from the first data storage site to the second data storage site as the second data, and the second hash value being in a digital format suitable for transmission over a computer network and being smaller in electronic data transmission size than the second data;
c) transmitting at least one of the first or second hash values via the computer network for comparing with the other of the first or second hash values;
d) digitally comparing the first and second hash values to determine the likelihood as to whether the second data is a valid replication of the first data;
e) utilizing a second hash function in the form of a computer algorithm executable by a computer processor, computing a third hash value based on the first data, the third hash value being in a digital format suitable for transmission over a computer network and being smaller in electronic data transmission size than the first data;
f) utilizing the second hash function, computing a fourth hash value based on the second data, the fourth hash value being in a digital format suitable for transmission over a computer network and being smaller in electronic data transmission size than the second data; and
g) digitally comparing the third and fourth hash values to determine the likelihood as to whether the second data is a valid replication of the first data;
wherein a mismatch between the first and second hash values or between the third and fourth hash values indicates that at least one of the first or second data storage sites includes invalid data.
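As an illustration of steps a) through g), the following is a minimal Python sketch, not the patented implementation. It assumes SHA-256 as the first hash function and BLAKE2b as the second; the claim does not name any particular hash functions, and the helper names `digest_pair` and `is_probably_valid_replica` are hypothetical.

```python
import hashlib


def digest_pair(data: bytes) -> tuple[bytes, bytes]:
    """Hash `data` with two different hash functions.

    Assumption: SHA-256 and BLAKE2b stand in for the claim's
    "hash function" and "second hash function".
    """
    return hashlib.sha256(data).digest(), hashlib.blake2b(data).digest()


def is_probably_valid_replica(first_site: tuple[bytes, bytes],
                              second_site: tuple[bytes, bytes]) -> bool:
    """Compare first vs. second and third vs. fourth hash values.

    A mismatch in either pair indicates invalid data at one of the sites.
    A match only makes a valid replication likely, not certain.
    """
    first_hash, third_hash = first_site
    second_hash, fourth_hash = second_site
    return first_hash == second_hash and third_hash == fourth_hash


# Illustrative data: the "second data" is a replica of the "first data".
first_data = b"block 0 contents" * 1024
second_data = bytes(first_data)
corrupted_data = b"X" + first_data[1:]

replica_ok = is_probably_valid_replica(digest_pair(first_data),
                                       digest_pair(second_data))
corruption_detected = not is_probably_valid_replica(digest_pair(first_data),
                                                    digest_pair(corrupted_data))
```

Only the digests (32 and 64 bytes here) would need to cross the network, which is far less than the 16 KiB of data they summarize.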
Abstract
A method for confirming replicated data at a data site, including utilizing a hash function, computing a first hash value based on first data at a first data site and utilizing the same hash function, computing a second hash value based on second data at a second data site, wherein the first data had previously been replicated from the first data site to the second data site as the second data. The method also includes comparing the first and second hash values to determine whether the second data is a valid replication of the first data. In additional embodiments, the first data may be modified based on seed data prior to computing the first hash value and the second data may be modified based on the same seed data prior to computing the second hash value. The process can be repeated to increase reliability of the results.
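The seed-based embodiment can be sketched as follows. This is a hedged example: the abstract does not say how the data is "modified based on seed data", so the sketch assumes the seed is simply prefixed to the data before hashing, uses SHA-256, and repeats the check with a fresh random seed each round. The names `seeded_digest` and `verify_with_seeds` are hypothetical.

```python
import hashlib
import secrets


def seeded_digest(data: bytes, seed: bytes) -> bytes:
    """Hash the data after modifying it with the seed.

    Assumption: the modification is modeled as prefixing the seed.
    """
    return hashlib.sha256(seed + data).digest()


def verify_with_seeds(first_data: bytes, second_data: bytes,
                      rounds: int = 3) -> bool:
    """Repeat the comparison `rounds` times, each with a new shared seed."""
    for _ in range(rounds):
        seed = secrets.token_bytes(16)  # both sites must use the same seed
        if seeded_digest(first_data, seed) != seeded_digest(second_data, seed):
            return False  # mismatch: at least one site holds invalid data
    return True  # every round matched: the replica is likely valid
```

Because the seed changes each round, the same data yields a different hash value every time, so a hash value computed in an earlier round cannot simply be reused.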
16 Claims
1. A method for confirming the validity of replicated data at a data storage site, the method comprising:
a) utilizing a hash function in the form of a computer algorithm executable by a computer processor, computing a first hash value based on first data stored on a computer readable storage medium at a first data storage site, the first hash value being in a digital format suitable for transmission over a computer network and being smaller in electronic data transmission size than the first data;
b) utilizing the same hash function, computing a second hash value based on second data stored on a computer readable storage medium at a second data storage site remotely connected by a computer network with the first data storage site, the first data having been previously replicated over the computer network from the first data storage site to the second data storage site as the second data, and the second hash value being in a digital format suitable for transmission over a computer network and being smaller in electronic data transmission size than the second data;
c) transmitting at least one of the first or second hash values via the computer network for comparing with the other of the first or second hash values;
d) digitally comparing the first and second hash values to determine the likelihood as to whether the second data is a valid replication of the first data;
e) utilizing a second hash function in the form of a computer algorithm executable by a computer processor, computing a third hash value based on the first data, the third hash value being in a digital format suitable for transmission over a computer network and being smaller in electronic data transmission size than the first data;
f) utilizing the second hash function, computing a fourth hash value based on the second data, the fourth hash value being in a digital format suitable for transmission over a computer network and being smaller in electronic data transmission size than the second data; and
g) digitally comparing the third and fourth hash values to determine the likelihood as to whether the second data is a valid replication of the first data;
wherein a mismatch between the first and second hash values or between the third and fourth hash values indicates that at least one of the first or second data storage sites includes invalid data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 16
11. An information handling system comprising:
a first data storage site comprising a computer readable storage medium and a computer processor, the first data storage site computing:
  a first hash value based on first data stored on the computer readable storage medium at the first data storage site, utilizing a first hash function in the form of a computer algorithm executable by the computer processor, the first hash value being in a digital format suitable for transmission over a computer network and being smaller in electronic data transmission size than the first data; and
  a second hash value based on the first data, utilizing a second hash function in the form of a computer algorithm executable by a computer processor, the second hash value being in a digital format suitable for transmission over a computer network and being smaller in electronic data transmission size than the first data; and
a second data storage site remotely connected by a computer network with the first data storage site and comprising data replicated over the computer network from the first data storage site, the second data storage site computing:
  a third hash value based on second data stored on a computer readable storage medium at the second data storage site, utilizing the first hash function, the third hash value being in a digital format suitable for transmission over a computer network and being smaller in electronic data transmission size than the second data; and
  a fourth hash value based on the second data, utilizing the second hash function, the fourth hash value being in a digital format suitable for transmission over a computer network and being smaller in electronic data transmission size than the second data;
wherein at least one of the first data storage site and second data storage site transmits at least one of its computed hash values via the computer network to the other of the first data storage site and second data storage site for digital comparison of the first hash value with the third hash value and digital comparison of the second hash value with the fourth hash value to determine the likelihood as to whether the second data is a valid replication of the first data, wherein a mismatch between the first and third hash values or between the second and fourth hash values indicates that at least one of the first or second data storage sites includes invalid data.
Dependent claims: 12
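The system of claim 11 can be modeled in a few lines of Python. This is a rough sketch under stated assumptions: the class `StorageSite`, its method names, and the JSON message layout are hypothetical; SHA-256 and SHA3-256 stand in for the first and second hash functions; and the network transfer is simulated by passing a string between two objects.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class StorageSite:
    """Hypothetical model of one data storage site."""
    name: str
    data: bytes

    def compute_hashes(self) -> dict:
        # Assumption: SHA-256 and SHA3-256 play the two hash functions.
        return {
            "hash_fn_1": hashlib.sha256(self.data).hexdigest(),
            "hash_fn_2": hashlib.sha3_256(self.data).hexdigest(),
        }

    def export_hashes(self) -> str:
        """Serialize the hash values into a compact, transmittable form."""
        return json.dumps(self.compute_hashes())

    def check_against(self, message: str) -> bool:
        """Compare locally computed hash values with the received ones."""
        remote = json.loads(message)
        local = self.compute_hashes()
        return all(local[key] == remote.get(key) for key in local)


primary = StorageSite("primary", b"volume contents " * 2048)
replica = StorageSite("replica", bytes(primary.data))
damaged = StorageSite("damaged", primary.data[:-1] + b"?")

message = primary.export_hashes()      # stands in for the network transfer
replica_matches = replica.check_against(message)
damaged_matches = damaged.check_against(message)
```

Either site may be the one that transmits; the comparison is symmetric, so the sketch arbitrarily has the first site send its hash values to the second.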
13. A method for confirming the validity of replicated data at a data storage site, the method comprising:
a) utilizing a hash function in the form of a computer algorithm executable by a computer processor, computing a first hash value based on a selected portion of first data stored on a computer readable storage medium at a first data storage site, the first hash value being in a digital format suitable for transmission over a computer network and being smaller in electronic data transmission size than the selected portion of first data;
b) utilizing the same hash function, computing a second hash value based on a selected portion of second data stored on a computer readable storage medium at a second data storage site remotely connected by a computer network with the first data storage site, the first data having been previously replicated over the computer network from the first data storage site to the second data storage site as the second data, the selected portion of second data corresponding to the selected portion of first data, and the second hash value being in a digital format suitable for transmission over a computer network and being smaller in electronic data transmission size than the selected portion of second data;
c) transmitting at least one of the first or second hash values via the computer network for comparing with the other of the first or second hash values;
d) digitally comparing the first and second hash values to determine the likelihood as to whether the selected portion of second data is a valid replication of the selected portion of first data, wherein a mismatch between the first and second hash values indicates that at least one of the first or second data storage sites includes invalid data; and
e) repeating steps a) through d) a plurality of times, each time utilizing a different selected portion of the first data and corresponding selected portion of the second data than in a previous time, the results substantially representative of the likelihood as to whether the second data as a whole is a valid replication of the first data as a whole.
Dependent claims: 14, 15
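The portion-sampling approach of claim 13 might look like the following Python sketch. It is illustrative only: the claim does not say how portions are selected, so the sketch assumes fixed-size chunks chosen at random without repetition, with SHA-256 as the hash function. The names `portion_digest` and `sample_check` and the parameters `chunk_size`, `samples`, and `rng_seed` are hypothetical.

```python
import hashlib
import random


def portion_digest(data: bytes, offset: int, length: int) -> bytes:
    """Hash only the selected portion data[offset:offset + length]."""
    return hashlib.sha256(data[offset:offset + length]).digest()


def sample_check(first_data: bytes, second_data: bytes,
                 chunk_size: int = 4096, samples: int = 8,
                 rng_seed: int = 0) -> list:
    """Compare hashes of corresponding portions at both sites.

    Returns the offsets of portions whose hash values differ.
    An empty list means every sampled portion matched.
    """
    n_chunks = max(1, -(-len(first_data) // chunk_size))  # ceiling division
    rng = random.Random(rng_seed)
    # A different portion on every repetition: sample without replacement.
    chosen = rng.sample(range(n_chunks), k=min(samples, n_chunks))
    mismatched = []
    for index in chosen:
        offset = index * chunk_size
        if (portion_digest(first_data, offset, chunk_size)
                != portion_digest(second_data, offset, chunk_size)):
            mismatched.append(offset)
    return mismatched
```

Sampling fewer portions than the data contains trades certainty for speed: a clean result then only suggests, rather than establishes, that the data as a whole was replicated correctly.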
Specification