Method and system for efficiently replicating data in non-relational databases

  • US 8,554,724 B2
  • Filed: 08/17/2012
  • Issued: 10/08/2013
  • Est. Priority Date: 02/09/2010
  • Status: Active Grant
First Claim

1. A method of compacting a distributed database having a plurality of instances, wherein each instance stores data on one or more server computers and each server computer has memory and one or more processors, the method comprising:

  • identifying a first instance of the distributed database from the plurality of instances;

    selecting a set of one or more row identifiers that identify rows in the distributed database, wherein each respective row in the distributed database has a respective base value and a respective set of zero or more respective deltas, and wherein each respective delta:

    specifies a change to the respective base value;

    includes a respective sequence identifier that specifies an order in which the respective delta is applied to the respective base value to compute a current value for the respective row; and

    specifies a respective instance where the respective delta was created;

    identifying a plurality of other instances of the distributed database, wherein the plurality of other instances are selected from the plurality of instances and each of the other instances is distinct from the first instance;

    selecting a compaction horizon for the selected set of one or more row identifiers, wherein the compaction horizon is a sequence identifier, and wherein the compaction horizon satisfies:

    all deltas that (i) were created at the first instance, (ii) are for rows corresponding to row identifiers in the selected set of one or more row identifiers, and (iii) have sequence identifiers less than or equal to the compaction horizon, have been transmitted to and acknowledged by all of the other instances that maintain data for the corresponding row identifiers; and

    all deltas that (i) were created at instances in the plurality of other instances, (ii) are for rows corresponding to row identifiers in the selected set of one or more row identifiers, and (iii) have sequence identifiers less than or equal to the compaction horizon, have been received at the first instance; and

    for each respective row identifier in at least a subset of the selected one or more row identifiers:

    identifying a non-empty set of deltas for the respective row identifier, wherein each of the deltas in the identified non-empty set has a sequence identifier less than or equal to the compaction horizon;

    applying, in sequence, each of the deltas in the identified non-empty set, to the respective base value for the respective row identifier, thereby updating the respective base value with the changes specified by the deltas in the identified set; and

    deleting the deltas in the identified non-empty set.
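
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes deltas are additive integer changes, and that the first instance tracks, for each other instance, two watermarks: the highest sequence identifier up to which its own deltas have been acknowledged, and the highest up to which that instance's deltas have been received. The names `Delta`, `Row`, `select_horizon`, and `compact` are hypothetical, and taking the minimum watermark is only one way to pick a horizon that satisfies both conditions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Delta:
    seq: int      # sequence identifier: order in which the delta is applied
    origin: str   # instance where the delta was created
    change: int   # change to the base value (assumed additive here)


@dataclass
class Row:
    base: int
    deltas: list = field(default_factory=list)


def select_horizon(acked_by_peer, received_from_peer):
    """Pick a horizon that satisfies both conditions of the claim.

    acked_by_peer[p]: every local delta with seq <= this value has been
        transmitted to and acknowledged by peer p (assumed bookkeeping).
    received_from_peer[p]: every delta created at peer p with seq <= this
        value has been received locally (assumed bookkeeping).
    """
    return min(min(acked_by_peer.values()), min(received_from_peer.values()))


def compact(rows, row_ids, horizon):
    """Fold deltas at or below the horizon into each row's base value."""
    for row_id in row_ids:
        row = rows[row_id]
        eligible = sorted(
            (d for d in row.deltas if d.seq <= horizon), key=lambda d: d.seq
        )
        if not eligible:
            continue  # the claim only compacts rows with a non-empty set
        for delta in eligible:  # apply in sequence order
            row.base += delta.change
        # delete the deltas that were just applied
        row.deltas = [d for d in row.deltas if d.seq > horizon]


rows = {
    "r1": Row(10, [Delta(1, "A", 5), Delta(2, "B", -3), Delta(7, "A", 100)]),
    "r2": Row(0, [Delta(9, "B", 1)]),
}
horizon = select_horizon({"B": 4, "C": 6}, {"B": 5, "C": 3})
compact(rows, ["r1", "r2"], horizon)
```

In this example the horizon is the smallest of the four watermarks, so only the two deltas of `r1` at or below it are folded into the base value; the later delta on `r1` and the only delta on `r2` stay in place until a later compaction pass.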
