MAIN-MEMORY DATABASE CHECKPOINTING
Abstract
The present invention extends to methods, systems, and computer program products for main-memory database checkpointing. Embodiments of the invention use a transaction log as an interface between online threads and a checkpoint subsystem. Using the transaction log as an interface reduces synchronization overhead between threads and the checkpoint subsystem. Transactions can be assigned to files and storage space can be reserved in a lock-free manner to reduce the overhead of checkpointing online transactions. Meta-data independent data files and delta files can be collapsed and merged to reduce storage overhead. Checkpoints can be updated incrementally such that only changes made since the last checkpoint (and not all data) are flushed to disk. Checkpoint I/O is sequential, helping ensure higher performance of physical I/O layers. During recovery, checkpoint files can be loaded into memory in parallel for multiple devices.
20 Claims
1. At a computer system, the computer system including one or more processors, system memory, and durable storage, the computer system maintaining an in-memory database in system memory, a method for updating a checkpoint for the in-memory database, the method comprising:
committing a transaction, the results of the transaction modifying the content of the in-memory database, the transaction having a timestamp, the timestamp indicating an associated time the transaction was committed relative to other transactions;
generating checkpoint data for the transaction from the results of the transaction, the checkpoint data including versions of one or more inserted portions of data inserted into the in-memory database and including identifiers for one or more deleted portions of data deleted from the in-memory database;
appending the checkpoint data to a checkpoint, including:
determining that the timestamp is within a specified timestamp range for a data file, the data file configured to store any inserted portions of data inserted into the in-memory database within the specified timestamp range;
appending the one or more inserted portions of data to the data file;
for each of the one or more deleted portions of data:
identifying a corresponding insert operation that inserted the deleted portion of data into the in-memory database;
locating a timestamp for a transaction that included the corresponding insert operation;
determining that the located timestamp is within a second specified time range for a delta file;
appending the identifier for the deleted portion of data to the delta file, the delta file configured to store identifiers for any deleted portions of data deleted from the in-memory database during the second specified time range.
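For illustration only, the appending steps of claim 1 might be sketched as follows. This is a minimal sketch, not the patented implementation: the names `FilePair`, `Checkpoint`, and `insert_ts` are hypothetical, files are modeled as in-memory lists, and each delta file is assumed to share the timestamp range of its matched data file.

```python
from dataclasses import dataclass, field


@dataclass
class FilePair:
    """Hypothetical data file plus matched delta file for timestamps in [lo, hi)."""
    lo: int
    hi: int
    data: list = field(default_factory=list)   # (row_id, insert_ts, payload) tuples
    delta: list = field(default_factory=list)  # identifiers of deleted rows

    def covers(self, ts: int) -> bool:
        return self.lo <= ts < self.hi


class Checkpoint:
    def __init__(self, ranges):
        # One data/delta pair per timestamp range (assumed non-overlapping).
        self.pairs = [FilePair(lo, hi) for lo, hi in ranges]
        # Assumption: an index from row identifier to the timestamp of the
        # transaction that inserted it, used to locate the right delta file.
        self.insert_ts = {}

    def _pair_for(self, ts: int) -> FilePair:
        for pair in self.pairs:
            if pair.covers(ts):
                return pair
        raise KeyError(f"no file pair covers timestamp {ts}")

    def append(self, commit_ts, inserted, deleted_ids):
        # Inserted rows go to the data file whose range contains the commit timestamp.
        target = self._pair_for(commit_ts)
        for row_id, payload in inserted:
            target.data.append((row_id, commit_ts, payload))
            self.insert_ts[row_id] = commit_ts
        # Each deleted identifier goes to the delta file whose range contains
        # the timestamp of the transaction that originally inserted the row.
        for row_id in deleted_ids:
            origin_ts = self.insert_ts[row_id]
            self._pair_for(origin_ts).delta.append(row_id)
```

In this sketch, a transaction committed at timestamp 150 that deletes a row inserted at timestamp 10 records the deletion in the delta file for the range containing 10, not the range containing 150.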
9. At a computer system, the computer system including one or more processors, system memory, and durable storage, the computer system maintaining an in-memory database in system memory, the computer system maintaining a sequential checkpoint for the in-memory database, the checkpoint including a set of temporally ordered checkpoint files, the temporally ordered checkpoint files representing the effects of one or more committed transactions on the in-memory database, the temporally ordered checkpoint files including one or more data files and one or more matched delta files, each data file in the one or more data files matched to a corresponding delta file in the one or more delta files, each matched data file and delta file assigned a timestamp range within the temporal ordering, each data file configured to store inserted portions of data inserted into the in-memory database during an assigned timestamp range, each delta file configured to store identifiers for deleted portions of data deleted from the in-memory database during an assigned timestamp range, a method for managing the storage resources consumed by the checkpoint files, the method comprising:
determining that the storage resources consumed by one or more data files and one or more matched delta files can be reduced based on one or more of: the contents of the one or more data files and the one or more matched delta files and the assigned timestamp ranges for the one or more data files and one or more matched delta files;
reducing the consumed storage resources for a data file by combining inserted portions of data contained in the data file with contents of at least one other checkpoint file, including one or more of:
(a) collapsing the contents of the data file by:
locating identifiers for deleted portions of data in the matched delta file that correspond to inserted portions of data in the data file; and
removing inserted portions of data corresponding to the located identifiers from the data file; and
(b) merging the data file with another data file by:
merging the inserted portions of data in the data file with inserted portions of data in the other data file, the assigned timestamp range for other data file temporally adjacent to the assigned timestamp range for the data file within the temporal ordering.
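For illustration only, the collapse and merge operations of claim 9 might be sketched as follows. This is a minimal sketch under stated assumptions: `DataFile`, `collapse`, and `merge` are hypothetical names, files are modeled as in-memory lists, rows are `(row_id, payload)` tuples, and two ranges are treated as temporally adjacent when one ends where the other begins.

```python
from typing import NamedTuple


class DataFile(NamedTuple):
    """Hypothetical data file covering commit timestamps in [lo, hi)."""
    lo: int
    hi: int
    rows: list  # (row_id, payload) tuples


def collapse(data_file: DataFile, delta_ids) -> DataFile:
    """Drop rows whose identifiers appear in the matched delta file."""
    deleted = set(delta_ids)
    kept = [row for row in data_file.rows if row[0] not in deleted]
    return DataFile(data_file.lo, data_file.hi, kept)


def merge(first: DataFile, second: DataFile) -> DataFile:
    """Combine two data files with temporally adjacent timestamp ranges."""
    if first.hi != second.lo:
        raise ValueError("timestamp ranges are not adjacent")
    return DataFile(first.lo, second.hi, first.rows + second.rows)


# Usage: collapse each file against its delta file, then merge the results.
a = collapse(DataFile(0, 100, [(1, "a"), (2, "b")]), delta_ids=[1])
b = collapse(DataFile(100, 200, [(3, "c")]), delta_ids=[])
merged = merge(a, b)
```

Collapsing before merging keeps the sketch simple, since the merged file then needs no delta identifiers carried over; the claim itself permits either operation on its own.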
17. At a computer system, the computer system including one or more processors, system memory, and durable storage, the computer system storing a transaction log and a checkpoint for an in-memory database in the durable storage, the checkpoint including a set of temporally ordered checkpoint files up to a specified timestamp for the in-memory database, the temporally ordered checkpoint files representing the effects of one or more committed transactions on the in-memory database, the temporally ordered checkpoint files including one or more data files and one or more matched delta files, each data file in the one or more data files matched to a corresponding delta file in the one or more delta files, each matched data file and delta file assigned a timestamp range within the temporal ordering, each data file configured to store inserted portions of data inserted into the in-memory database during an assigned timestamp range, each delta file configured to store identifiers for deleted portions of data deleted from the in-memory database during an assigned timestamp range, the transaction log including log records for one or more additional transactions that occurred after the specified time stamp, a method for reestablishing a state of the in-memory data that reflects a most recently committed transaction in the transaction log, the method comprising:
identifying the location of each of the one or more data files and each of the one or more delta files within the durable storage;
processing each of the one or more data files, including:
locating identifiers for deleted portions of data in the matched delta file that correspond to inserted portions of data in the data file;
filtering the data file by skipping inserted portions of data corresponding to the located identifiers from the delta file, filtering the data file leaving unfiltered rows to be loaded into system memory;
inserting the unfiltered portions of inserted data into the in-memory database; and
subsequent to processing each of the one or more data files, replaying the transaction log from the specified timestamp to the end of the transaction log to realize the effects of the one or more additional transactions on the in-memory database.
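For illustration only, the recovery steps of claim 17 might be sketched as follows. This is a minimal sketch, not the patented implementation: the function `recover`, the tuple layouts, and the `"insert"`/`"delete"` log record format are all hypothetical, the database is modeled as a dictionary, and files are processed sequentially rather than in parallel.

```python
def recover(file_pairs, log, checkpoint_ts):
    """Rebuild an in-memory table from checkpoint files, then replay the log.

    file_pairs:    iterable of (data_rows, delta_ids), where data_rows is a
                   list of (row_id, payload) and delta_ids lists deleted row ids
    log:           iterable of (ts, op, row_id, payload) records in commit order
    checkpoint_ts: timestamp up to which the checkpoint files are complete
    """
    db = {}
    # Load each data file, skipping rows named in its matched delta file.
    for data_rows, delta_ids in file_pairs:
        deleted = set(delta_ids)
        for row_id, payload in data_rows:
            if row_id not in deleted:
                db[row_id] = payload
    # Replay only log records newer than the checkpoint timestamp.
    for ts, op, row_id, payload in log:
        if ts <= checkpoint_ts:
            continue
        if op == "insert":
            db[row_id] = payload
        elif op == "delete":
            db.pop(row_id, None)
    return db


# Usage with two file pairs and a short log tail.
pairs = [([(1, "a"), (2, "b")], [1]), ([(3, "c")], [])]
tail = [(90, "insert", 9, "old"), (120, "insert", 4, "d"), (130, "delete", 2, None)]
state = recover(pairs, tail, checkpoint_ts=100)
```

Because each data file is filtered only against its own matched delta file in this sketch, the file pairs could in principle be loaded independently of one another, which is consistent with the parallel loading mentioned in the abstract.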
Specification