System and method for measuring consistency within a distributed storage system
First Claim
1. A computer-implemented method for evaluating the consistency of read operations performed in a distributed storage system, wherein the method comprises:
- for each of multiple write operations performed within the distributed storage system:
inserting a key identifying a replicated data object modified by the write operation into a particular memory-efficient set of multiple memory-efficient sets assigned to different time periods, wherein said particular memory-efficient set is assigned to a time period inclusive of a time at which the write operation was performed;
for each given read operation of multiple read operations performed in the distributed storage system:
determining a last-modified time specifying when a value retrieved from a given replicated data object during the given read operation was last modified;
determining a particular time period in which a most recent write operation was performed on the given replicated data object, the particular time period being a time period assigned to a memory-efficient set in which the key identifying the given replicated data object has been inserted; and
if the last-modified time is older than the particular time period, determining that the value retrieved is inconsistent with a most recent value written to the particular replicated data object, wherein if the last-modified time is not older than the particular time period, the value retrieved is not determined to be inconsistent; and
generating a consistency metric based on a quantity of instances in which retrieved values are determined to be inconsistent.
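The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the names `ConsistencyTracker`, `period_of`, and `PERIOD_SECONDS` are hypothetical, the 60-second period width is arbitrary, an ordinary Python `set` stands in for each memory-efficient set, and the consistency metric is assumed to be the fraction of reads found inconsistent.

```python
from collections import defaultdict

PERIOD_SECONDS = 60  # hypothetical width of each time period


def period_of(timestamp: float) -> int:
    """Map a timestamp to the index of the time period that contains it."""
    return int(timestamp // PERIOD_SECONDS)


class ConsistencyTracker:
    """Illustrative only: a plain set stands in for each memory-efficient set."""

    def __init__(self):
        self.sets = defaultdict(set)  # period index -> keys written in that period
        self.reads = 0
        self.inconsistent = 0

    def record_write(self, key: str, write_time: float) -> None:
        # Insert the key into the set assigned to the period of the write.
        self.sets[period_of(write_time)].add(key)

    def latest_write_period(self, key: str):
        # Most recent period whose set contains the key, or None if never written.
        periods = [p for p, keys in self.sets.items() if key in keys]
        return max(periods) if periods else None

    def record_read(self, key: str, last_modified: float) -> bool:
        """Return True if the value read is judged inconsistent."""
        self.reads += 1
        latest = self.latest_write_period(key)
        # "Older than the period" is taken to mean: modified before the period began.
        stale = latest is not None and period_of(last_modified) < latest
        if stale:
            self.inconsistent += 1
        return stale

    def inconsistency_rate(self) -> float:
        # Assumed metric: share of observed reads that were inconsistent.
        return self.inconsistent / self.reads if self.reads else 0.0


tracker = ConsistencyTracker()
tracker.record_write("object-42", write_time=10)    # falls in period 0
tracker.record_write("object-42", write_time=130)   # falls in period 2
tracker.record_read("object-42", last_modified=10)  # value predates period 2
tracker.record_read("object-42", last_modified=130) # value is from period 2
```

Because the comparison is made at the granularity of a period, a read that returns a value written earlier in the same period as the latest write is not flagged, which matches the "not older than the particular time period" branch of the claim.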
Abstract
Embodiments may include a consistency measurement component that utilizes memory-efficient sets (e.g., Bloom filters) assigned to different time periods for tracking when different write operations are performed on replicated data objects within a distributed data store. The consistency measurement component may evaluate whether read operations directed to the distributed data store are inconsistent. To do so, the consistency measurement component may determine, for a given read operation, the age of the value read from a given replicated data object (e.g., by evaluating a “last-modified” timestamp). The consistency measurement component may identify a memory-efficient set that includes the key of that replicated data object in order to determine when the replicated data object was last written to. If the value read was last modified before the time at which the replicated data object was last written to, the consistency measurement component may determine that the read operation was inconsistent.
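The abstract names Bloom filters as one example of a memory-efficient set. The following Python sketch shows how one filter per time period might be used to find the most recent period in which a key was written. It is a sketch only: the class `BloomFilter`, the helper `most_recent_period`, and the parameters `num_bits` and `num_hashes` are hypothetical, and the hashing scheme (salted SHA-256 from the standard library's `hashlib`) is an assumption chosen for simplicity.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, but false positives are possible."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def most_recent_period(filters_by_period: dict, key: str):
    """Scan periods newest first; return the first whose filter reports the key."""
    for period in sorted(filters_by_period, reverse=True):
        if key in filters_by_period[period]:
            return period
    return None


# One filter per time period (period indices here are hypothetical).
filters_by_period = {0: BloomFilter(), 1: BloomFilter()}
filters_by_period[1].add("object-42")  # the key was written during period 1
period = most_recent_period(filters_by_period, "object-42")
```

One consequence of this design choice is worth noting: because a Bloom filter can report a key it never received, a lookup could attribute a write to a later period than the true one, so a measurement built this way may occasionally overstate inconsistency. It cannot miss a write that was actually recorded.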
19 Citations
31 Claims
1. A computer-implemented method for evaluating the consistency of read operations performed in a distributed storage system, wherein the method comprises:
for each of multiple write operations performed within the distributed storage system:
inserting a key identifying a replicated data object modified by the write operation into a particular memory-efficient set of multiple memory-efficient sets assigned to different time periods, wherein said particular memory-efficient set is assigned to a time period inclusive of a time at which the write operation was performed;
for each given read operation of multiple read operations performed in the distributed storage system:
determining a last-modified time specifying when a value retrieved from a given replicated data object during the given read operation was last modified;
determining a particular time period in which a most recent write operation was performed on the given replicated data object, the particular time period being a time period assigned to a memory-efficient set in which the key identifying the given replicated data object has been inserted; and
if the last-modified time is older than the particular time period, determining that the value retrieved is inconsistent with a most recent value written to the particular replicated data object, wherein if the last-modified time is not older than the particular time period, the value retrieved is not determined to be inconsistent; and
generating a consistency metric based on a quantity of instances in which retrieved values are determined to be inconsistent.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A non-transitory, computer-readable storage medium, storing program instructions computer-executable on a computer to implement a consistency measurement component configured to:
generate multiple memory-efficient sets populated with keys identifying replicated data objects of a distributed storage system, each key inserted into a memory-efficient set assigned to a time period in which a replicated data object identified by that key was written to;
for a read operation that retrieved a value from a given replicated data object in the distributed storage system, determine a last-modified time specifying when the retrieved value was last modified;
identify a particular memory-efficient set in which a key identifying the given replicated data object has been inserted; and
in response to determining the last-modified time is older than the time period to which the identified memory-efficient set is assigned, generate an indication that specifies the value retrieved during the read operation is inconsistent with a most recent value written to the given replicated data object.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21
22. A system, comprising:
a memory comprising multiple memory-efficient sets assigned to different time periods; and
one or more processors coupled to the memory, wherein the memory comprises program instructions executable by the one or more processors to implement a consistency measurement component configured to:
populate the memory-efficient sets with keys identifying replicated data objects of a distributed storage system, each key inserted into a memory-efficient set assigned to a time period in which a replicated data object identified by that key was written to;
for a read operation that retrieved a value from a given replicated data object in the distributed storage system, determine a last-modified time specifying when the retrieved value was last modified;
identify a particular memory-efficient set in which a key identifying the given replicated data object has been inserted; and
in response to determining the last-modified time is older than the time period to which the identified memory-efficient set is assigned, generate an indication that specifies the value retrieved during the read operation is inconsistent with a most recent value written to the given replicated data object.
Dependent claims: 23, 24, 25, 26, 27, 28, 29, 30, 31
Specification