Multi-node fault-tolerant timestamp generation
First Claim
1. A method for determining a safe recovery time value after a failure of a first node in a computer system, wherein said first node is one of a plurality of nodes that has access to a database, wherein a plurality of logical clocks are associated with said plurality of nodes, the method comprising the steps of:
- prior to said failure, said first node maintaining a first logical clock of said plurality of logical clocks to assign time values to changes made to said database by said first node;
after said failure, reading a most recent log timestamp value from a log file associated with said first node;
determining, based on said most recent log timestamp value, a recovery timestamp value that is at least as recent as any time value recorded in said database by said first node prior to said failure; and
recovering said first node using said recovery timestamp value.
2 Assignments
0 Petitions
Accused Products
Abstract
Techniques for determining a safe recovery time value after a failure of a first node in a computer system are described. According to the techniques, every node in a multi-node parallel database system maintains a logical clock for generating timestamps. The logical clocks are synchronized by attaching a current timestamp to every message that is sent by a node. When a node receives an incoming timestamp that is greater than the value indicated by the associated logical clock, it sets the associated logical clock forward to at least the value of the timestamp. When a node fails, a recovery node calculates a "safe" logical clock value to use in recovering the crashed node. In calculating the "safe" logical clock value, the recovery node searches specific areas of the database to locate and recover a most recent timestamp value associated with the crashed node. The recovery node then compares its current logical clock time value with the most recent crash node timestamp value to determine which timestamp is most recent. If the most recent crash node timestamp value is more recent than the recovery node'"'"'s current logical clock time value, the recovery node'"'"'s logical clock is updated to be at least as recent as the most recent crash node timestamp value. The recovery node then recovers the crashed node as its logical clock is guaranteed to be at least as recent as any timestamp value that was previously written to the database by the crashed node prior to failure.
118 Citations
54 Claims
-
1. A method for determining a safe recovery time value after a failure of a first node in a computer system, wherein said first node is one of a plurality of nodes that has access to a database, wherein a plurality of logical clocks are associated with said plurality of nodes, the method comprising the steps of:
-
prior to said failure, said first node maintaining a first logical clock of said plurality of logical clocks to assign time values to changes made to said database by said first node; after said failure, reading a most recent log timestamp value from a log file associated with said first node; determining, based on said most recent log timestamp value, a recovery timestamp value that is at least as recent as any time value recorded in said database by said first node prior to said failure; and recovering said first node using said recovery timestamp value. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
-
-
12. A computer-readable medium having stored thereon sequences of instructions for determining a safe recovery time value after a failure of a first node in a computer system, wherein said first node is one of a plurality of nodes that has access to a database, wherein a plurality of logical clocks are associated with said plurality of nodes, the sequences of instructions including instructions for performing the steps of:
-
prior to said failure, said first node maintaining a first logical clock of said plurality of logical clocks to assign time values to changes made to said database by said first node; after said failure, reading a most recent log timestamp value from a log file associated with said first node; determining, based on said most recent log timestamp value, a recovery timestamp value that is at least as recent as any time value recorded in said database by said first node prior to said failure; and
recovering said first node using said recovery timestamp value. - View Dependent Claims (13, 14, 15, 16, 17, 18, 19)
-
-
20. A computer system for determining a safe recovery time value after a failure of a first node in a computer system, wherein said first node is one of a plurality of nodes that has access to a database, wherein a plurality of logical clocks are associated with said plurality of nodes, the computer system comprising:
-
a memory; one or more processors coupled to the memory; and a set of computer instructions contained in the memory, the set of computer instructions including computer instructions which when executed by the one or more processors, cause the one or more processors to perform the steps of; prior to said failure, said first node maintaining a first logical clock of said plurality of logical clocks to assign time values to changes made to said database by said first node; after said failure, reading a most recent log timestamp value from a log file associated with said first node; determining, based on said most recent log timestamp value, a recovery timestamp value that is at least as recent as any time value recorded in said database by said first node prior to said failure; and recovering said first node using said recovery timestamp value. - View Dependent Claims (21, 22, 23, 24, 25, 26, 27, 28, 29, 30)
-
-
31. A method for determining a safe recovery time value after a failure of a first node in a computer system, wherein said first node is one of a plurality of nodes that has access to a database, wherein a plurality of logical clocks are associated with said plurality of nodes, the method comprising the steps of:
-
prior to said failure, said first node maintaining a first logical clock of said plurality of logical clocks to assign time values to changes made to said database by said first node; after said failure, reading a first possibly most recent timestamp value associated with said first node; reading a second possibly most recent timestamp value associated with said first node; determining, based on said first possibly most recent timestamp value and said second possibly most recent timestamp value, a recovery timestamp value that is at least as recent as any time value recorded in said database by said first node prior to said failure; and recovering said first node using said recovery timestamp value. - View Dependent Claims (33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43)
-
-
32. A computer-readable medium having stored thereon sequences of instructions for determining a safe recovery time value after a failure of a first node in a computer system, wherein said first node is one of a plurality of nodes that has access to a database, wherein a plurality of logical clocks are associated with said plurality of nodes, the sequences of instructions including instructions for performing the steps of:
-
prior to said failure, said first node maintaining a first logical clock of said plurality of logical clocks to assign time values to changes made to said database by said first node; after said failure, reading a first possibly most recent timestamp value associated with said first node; reading a second possibly most recent timestamp value associated with said first node; determining, based on said first possibly most recent timestamp value and said second possibly most recent timestamp value, a recovery timestamp value that is at least as recent as any time value recorded in said database by said first node prior to said failure; and recovering said first node using said recovery timestamp value. - View Dependent Claims (44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54)
-
Specification