SYSTEM AND METHOD FOR BUILDING A POINT-IN-TIME SNAPSHOT OF AN EVENTUALLY-CONSISTENT DATA STORE
Abstract
A method and system for building a point-in-time snapshot of an eventually-consistent data store. The data store includes key-value pairs stored on a plurality of storage nodes. In one embodiment, the data store is implemented as an Apache® Cassandra database running in the “cloud.” The data store includes a journaling mechanism that stores journals (i.e., inconsistent snapshots) of the data store on each node at various intervals. In Cassandra, these snapshots are sorted string tables that may be copied to a back-up storage location. A cluster of processing nodes may retrieve and resolve the inconsistent snapshots to generate a point-in-time snapshot of the data store corresponding to a lagging consistency point. In addition, the point-in-time snapshot may be updated as any new inconsistent snapshots are generated by the data store such that the lagging consistency point associated with the updated point-in-time snapshot is more recent.
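The update step described in the abstract can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes each row carries a write timestamp and that the most recent write wins, a rule the abstract does not state. The row layout and the name `fold_snapshot` are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical row layout: (key, value, write_timestamp).
Row = Tuple[str, str, int]
# Point-in-time snapshot: key -> (value, write_timestamp).
PointInTime = Dict[str, Tuple[str, int]]


def fold_snapshot(pit: PointInTime, snapshot: Iterable[Row]) -> PointInTime:
    """Return a new point-in-time snapshot that also reflects `snapshot`.

    Assumption: the row with the larger timestamp wins (last write wins).
    """
    updated = dict(pit)  # leave the previous snapshot untouched
    for key, value, ts in snapshot:
        current = updated.get(key)
        if current is None or ts > current[1]:
            updated[key] = (value, ts)
    return updated


# Two nodes disagree about "user:1"; the newer write is kept.
node_a = [("user:1", "alice", 10), ("user:2", "bob", 12)]
node_b = [("user:1", "alicia", 15)]
pit = fold_snapshot(fold_snapshot({}, node_a), node_b)

# A later inconsistent snapshot moves the consistency point forward.
later = [("user:2", "robert", 20)]
pit_updated = fold_snapshot(pit, later)
```

Because each call returns a new mapping, the earlier point-in-time snapshot stays available while the updated one reflects the more recent writes.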
20 Claims
1. A computer-implemented method for building a point-in-time snapshot of an eventually-consistent data store distributed among a plurality of nodes connected by a network, the method comprising:
receiving a plurality of inconsistent snapshots, wherein each inconsistent snapshot includes one or more rows of key-value pairs associated with the data store and reflects the contents of at least a portion of the data store stored on a particular node of the plurality of nodes; and
generating the point-in-time snapshot by resolving the rows of key-value pairs to remove any inconsistent values, wherein the point-in-time snapshot includes a subset of the key-value pairs included in the plurality of inconsistent snapshots.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system for building a point-in-time snapshot of an eventually-consistent data store, comprising:
a plurality of nodes connected by a network and storing the data store; and
a processing node connected to the data store via the network and configured to:
receive a plurality of inconsistent snapshots, wherein each inconsistent snapshot includes one or more rows of key-value pairs associated with the data store and reflects the contents of at least a portion of the data store stored on a particular node of the plurality of nodes, and
generate the point-in-time snapshot by resolving the rows of key-value pairs to remove any inconsistent values, wherein the point-in-time snapshot includes a subset of the key-value pairs included in the plurality of inconsistent snapshots.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A computer-readable storage medium including instructions that, when executed by a processing unit, cause the processing unit to perform an operation for building a point-in-time snapshot of an eventually-consistent data store, the operation comprising:
receiving a plurality of inconsistent snapshots, wherein each inconsistent snapshot includes one or more rows of key-value pairs associated with the data store and reflects the contents of at least a portion of the data store stored on a particular node of the plurality of nodes;
generating a sorted table including each row of key-value pairs from the plurality of inconsistent snapshots; and
generating the point-in-time snapshot by resolving the rows of key-value pairs in the sorted table to remove any inconsistent values, wherein the point-in-time snapshot includes a subset of the rows of key-value pairs included in the sorted table such that each unique key is associated with a single row that is selected from all rows in the sorted table associated with that particular key.
Dependent claims: 18, 19, 20
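The sorted-table variant recited in claim 17 might look like the sketch below. It is illustrative only: the claim does not say how the single row per key is selected, so the code assumes a hypothetical `(key, value, write_timestamp)` row layout and picks the row with the largest timestamp. The function names are likewise invented for this example.

```python
import heapq
from itertools import groupby
from operator import itemgetter
from typing import Iterable, List, Tuple

# Hypothetical row layout: (key, value, write_timestamp).
Row = Tuple[str, str, int]

by_key = itemgetter(0)
by_timestamp = itemgetter(2)


def build_sorted_table(snapshots: Iterable[Iterable[Row]]) -> List[Row]:
    """Merge every row from every snapshot into one table ordered by key."""
    # Each snapshot is sorted first so that heapq.merge sees ordered inputs.
    ordered = [sorted(snapshot, key=by_key) for snapshot in snapshots]
    return list(heapq.merge(*ordered, key=by_key))


def resolve_sorted_table(table: Iterable[Row]) -> List[Row]:
    """Keep exactly one row per unique key.

    Assumption: the row with the largest timestamp is the one selected.
    """
    return [max(rows, key=by_timestamp) for _, rows in groupby(table, key=by_key)]


node_a = [("k1", "old", 1), ("k2", "x", 5)]
node_b = [("k3", "y", 2), ("k1", "new", 3)]

table = build_sorted_table([node_a, node_b])
point_in_time = resolve_sorted_table(table)
```

Sorting first places all rows for a given key next to each other, so the resolution pass can stream through the table once and hold only one key's rows in memory at a time.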
Specification