System and method for providing high availability data
First Claim
1. A computer-implemented data storage system comprising:
host mapping logic configured to map responsibility for storing a plurality of data sets to individual ones of a plurality of hosts which cooperate to implement a data storage system;
a hardware processor;
data set replication logic executed by the hardware processor configured to execute instructions stored in memory, the data set replication logic configured to:
obtain a first version of a data set to be written;
select a first subset of the plurality of hosts to write the first version of the data set;
write a first plurality of copies of the first version of a data set at the first subset of the plurality of hosts, wherein the first plurality of copies respectively include a version history of the first version of the data set;
obtain a second version of the data set to be written, wherein the second version of the data set comprises one or more updates to the data set inconsistent with at least a portion of the first version of the data set;
select a second subset of the plurality of hosts to write the second version of the data set, wherein the first and second subsets of the plurality of hosts include at least one different host;
write a second plurality of copies of the second version of the data set at the second subset of the plurality of hosts, wherein the second plurality of copies respectively include another version history of the second version of the data set;
data set retrieval logic executed by a hardware processor configured to execute instructions stored in memory, the data set retrieval logic configured to be responsive to a request to provide a single copy of the data set by reading a third plurality of copies of the data set at a third subset of the plurality of hosts, wherein the third plurality of copies include at least one copy of the first version of the data set and at least one copy of the second version of the data set, wherein the third subset of the plurality of hosts has at least one host in common with the first subset of the plurality of hosts and at least one host in common with the second subset of the plurality of hosts, and wherein the at least one host in common with the first subset of the plurality of hosts is not a member of the second subset of the plurality of hosts; and
an evaluation component configured to provide a single copy of the data set by:
reading the third plurality of copies of the data set; and
evaluating the version history and the other version history to reconcile the first version of the data set and the second version of the data set read from the third plurality of copies into the single copy of the data set;
wherein the evaluation component is configured to be invoked after the third plurality of copies of the data set is read.
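As an illustration of the "evaluating the version history and the other version history" step, the following is a minimal sketch, not the claimed implementation. It assumes each version history can be modeled as a map from a host identifier to an update counter; that representation, and the names `VersionHistory` and `compare`, are hypothetical and do not come from the claim text.

```python
from typing import Dict

# Assumed shape of a version history: host id -> number of updates seen from that host.
VersionHistory = Dict[str, int]


def compare(first: VersionHistory, second: VersionHistory) -> str:
    """Classify how two version histories relate to each other."""
    first_covers = all(first.get(h, 0) >= n for h, n in second.items())
    second_covers = all(second.get(h, 0) >= n for h, n in first.items())
    if first_covers and second_covers:
        return "equal"            # same version, nothing to reconcile
    if first_covers:
        return "first_descends"   # first already includes every update in second
    if second_covers:
        return "second_descends"  # second already includes every update in first
    return "conflict"             # neither includes the other: versions are inconsistent


# Two versions that each carry an update the other lacks must be reconciled.
print(compare({"A": 1, "B": 1}, {"A": 1, "C": 1}))
```

Under this assumption, only the "conflict" outcome requires merging the two versions; in the other cases one copy can simply be returned as the single copy.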
Abstract
A computer-implemented data processing system and method writes a first plurality of copies of a data set at a first plurality of hosts and reads a second plurality of copies of the data set at a second plurality of hosts. The first and second pluralities of copies may be overlapping and the first and second pluralities of hosts may be overlapping. A hashing function may be used to select the first and second pluralities of hosts. Version histories for each of the first copies of the data set may also be written at the first plurality of hosts and read at the second plurality of hosts. The version histories for the second copies of the data set may be compared, and causality between the second copies of the data set may be evaluated based on the version histories for the second copies of the data set.
41 Claims
5. A computer-implemented data processing method comprising:
writing a first plurality of copies of a first version of a data set at a first plurality of hosts, including writing a version history for each of the first plurality of copies of the first version of the data set;
writing a second plurality of copies of a second version of the data set at a second plurality of hosts, including writing another version history for each of the second plurality of copies of the second version of the data set, wherein the second version of the data set comprises one or more updates to the data set that is inconsistent with the first version of the data set and wherein the first and second plurality of hosts include at least one different host;
responding to a request to provide a single copy of the data set by reading a third plurality of copies of the data set at a third plurality of hosts, wherein the third plurality of copies include at least one copy of the first version of the data set and at least one copy of the second version of the data set, wherein the third plurality of hosts has at least one host in common with the first plurality of hosts and at least one host in common with the second plurality of hosts, and wherein the at least one host in common with the first plurality of hosts is not a member of the second plurality of hosts;
reconciling the first version of the data set and the second version of the data set into the single copy of the data set according to an evaluation of the version history and the other version history read from the third plurality of copies of the data set; and
providing the single copy of the data set.
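The steps of this method can be walked through with a small in-memory simulation. This is a sketch under stated assumptions, not the patented implementation: hosts are plain objects, a data set is a set of items, a version history is a map from host name to counter, and reconciliation is a set union. The names `Host`, `write`, `read`, `reconcile` and the key `"cart"` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, List, Tuple

History = Dict[str, int]                 # assumed: host name -> update counter
Copy = Tuple[FrozenSet[str], History]    # one stored copy: (data set, version history)


@dataclass
class Host:
    name: str
    store: Dict[str, Copy] = field(default_factory=dict)


def write(hosts: Iterable[Host], key: str, data: Iterable[str], history: History) -> None:
    """Write one copy of the data set, with its version history, at each host."""
    for host in hosts:
        host.store[key] = (frozenset(data), dict(history))


def read(hosts: Iterable[Host], key: str) -> List[Copy]:
    """Read whatever copy each host currently holds for the key."""
    return [host.store[key] for host in hosts if key in host.store]


def covers(a: History, b: History) -> bool:
    return all(a.get(k, 0) >= v for k, v in b.items())


def reconcile(copies: List[Copy]) -> Copy:
    """Discard superseded copies, then merge the rest (merge by union is an assumption)."""
    live = [(d, h) for d, h in copies
            if not any(covers(o, h) and o != h for _, o in copies)]
    data = frozenset().union(*(d for d, _ in live))
    history: History = {}
    for _, h in live:
        for k, v in h.items():
            history[k] = max(history.get(k, 0), v)
    return data, history


a, b, c, d = (Host(n) for n in "ABCD")
write([a, b], "cart", {"book"}, {"A": 1})   # first version at the first plurality of hosts
write([c, d], "cart", {"pen"}, {"C": 1})    # inconsistent second version at different hosts
copies = read([b, c], "cart")               # third plurality overlaps both write sets
single_copy, merged_history = reconcile(copies)
print(sorted(single_copy), merged_history)
```

Here host B is shared with the first write set but is not in the second, and host C is shared with the second, which mirrors the overlap condition recited in the claim.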
33. A computer-implemented data processing method comprising:
generating a hash value based on a hash key and a hash function, the hash key being associated with a data set and being applied as input to the hash function;
writing a first plurality of copies of a first version of the data set at a first subset of a plurality of hosts, including writing a version history for each of the first plurality of copies of the first version of the data set, wherein the first subset of the plurality of hosts being selected to write the data set based on the hash value;
writing a second plurality of copies of a second version of the data set at a second subset of the plurality of hosts, including writing another version history for each of the second plurality of copies of the second version of the data set, wherein the second version of the data set comprises one or more updates to the data set that is inconsistent with the first version of the data set and wherein the first and second subsets of the plurality of hosts include at least one different host;
responsive to a request to recall a copy of the data set, reading a third plurality of copies of the data set at a third subset of the plurality of hosts, wherein the third plurality of copies include at least one copy of the first version of the data set and at least one copy of the second version of the data set, wherein the third subset of the plurality of hosts has at least one host in common with the first subset of the plurality of hosts and at least one host in common with the second subset of the plurality of hosts, wherein the at least one host in common with the first subset of the plurality of hosts is not a member of the second subset of the plurality of hosts, and wherein the first subset of the plurality of hosts for writing the data set and the third subset of the plurality of hosts for reading the data set are independently determined; and
after reading, reconciling the first version of the data set and the second version of the data set into the single copy of the data set according to an evaluation of the version history and the other version history read from the third plurality of copies of the data set.
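One way to picture hash-based selection of a subset of hosts, with the write subset and the read subset determined independently, is sketched below. The claim does not specify a hash function or a placement scheme; the use of SHA-256, the ring ordering, the `unavailable` parameter, and the names `hash_value` and `select_hosts` are all assumptions made for illustration.

```python
import hashlib
from bisect import bisect_right
from typing import AbstractSet, List, Sequence


def hash_value(hash_key: str) -> int:
    """Apply the hash key as input to a hash function (SHA-256 is an assumed choice)."""
    return int.from_bytes(hashlib.sha256(hash_key.encode("utf-8")).digest(), "big")


def select_hosts(hash_key: str, hosts: Sequence[str], n: int,
                 unavailable: AbstractSet[str] = frozenset()) -> List[str]:
    """Pick up to n hosts for a key by walking a hash-ordered ring of hosts.

    Each caller passes its own view of unavailable hosts, so a writer and a
    reader can arrive at different subsets for the same hash value.
    """
    ring = sorted(hosts, key=hash_value)
    positions = [hash_value(h) for h in ring]
    start = bisect_right(positions, hash_value(hash_key)) % len(ring)
    chosen: List[str] = []
    for i in range(len(ring)):
        host = ring[(start + i) % len(ring)]
        if host not in unavailable:
            chosen.append(host)
        if len(chosen) == n:
            break
    return chosen


hosts = ["host-a", "host-b", "host-c", "host-d", "host-e"]   # hypothetical host names
write_subset = select_hosts("cart-42", hosts, 3)
# The reader believes the first write host is down, so it walks one step further.
read_subset = select_hosts("cart-42", hosts, 3, unavailable={write_subset[0]})
print(write_subset, read_subset)
```

In this sketch the two subsets differ by one host yet still share two, so a read can still encounter copies written earlier even though the subsets were computed separately.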
Specification