Distributed data store with an orderstamp to ensure progress
First Claim
1. A method for addressing inconsistency and ensuring progress in a distributed data store involving one or more computers, comprising the steps of:
- labeling an entry with an orderstamp, wherein said orderstamp comprises an approximate timestamp comprising a serial identifier, wherein the approximate timestamp comprises an approximate time that the entry originated on an originating computer, and wherein the serial identifier comprises an identifier unique to the originating computer such that each orderstamp is globally unique;
recording in a storage for each entry in a computer the latest orderstamp among orderstamps for insert and delete operations for that entry that have been processed by the computer;
recording for each entry in the computer whether an operation corresponding to the latest orderstamp is an insert or a delete;
labeling a query with an orderstamp; and
when processing a query on the computer, identifying entries that are in a subset specified by the query, that are covered by the computer, that have latest orderstamp before the orderstamp of the query, and that have operation type insert corresponding to the latest orderstamp.
3 Assignments
0 Petitions
Accused Products
Abstract
A distributed data store labels operations with globally unique identifiers that contain approximate timestamps. The labels are used to address causes of inconsistency in the distributed data store while ensuring progress. A first mode is provided that stores the latest label for each entry is useful if re-inserts and deletes are rare. Another mode is provided that stores a history of labels for each entry can be used if there are many re-inserts and deletes. A further mode is provided that stores a history of labels for queries can report updates to query answers as inserts and deletes settle across the distributed data store.
34 Citations
12 Claims
-
1. A method for addressing inconsistency and ensuring progress in a distributed data store involving one or more computers, comprising the steps of:
-
labeling an entry with an orderstamp, wherein said orderstamp comprises an approximate timestamp comprising a serial identifier, wherein the approximate timestamp comprises an approximate time that the entry originated on an originating computer, and wherein the serial identifier comprises an identifier unique to the originating computer such that each orderstamp is globally unique; recording in a storage for each entry in a computer the latest orderstamp among orderstamps for insert and delete operations for that entry that have been processed by the computer; recording for each entry in the computer whether an operation corresponding to the latest orderstamp is an insert or a delete; labeling a query with an orderstamp; and when processing a query on the computer, identifying entries that are in a subset specified by the query, that are covered by the computer, that have latest orderstamp before the orderstamp of the query, and that have operation type insert corresponding to the latest orderstamp.
-
-
2. A method for addressing inconsistency and ensuring progress in a distributed data store involving one or more computers, comprising the steps of:
-
labeling an insert with an orderstamp, said orderstamp in turn comprising an approximate timestamp that includes a serial identifier, wherein the approximate timestamp comprises an approximate time that the insert originated on an originating computer, and wherein the serial identifier comprises an identifier unique to the originating computer such that each orderstamp is globally unique; labeling a delete with an orderstamp, said orderstamp in turn comprising an approximate timestamp that includes a serial identifier, wherein the approximate timestamp comprises an approximate time that the delete originated on an originating computer, and wherein the serial identifier comprises an identifier unique to the originating computer such that each orderstamp is globally unique; recording for each entry in a computer a latest orderstamp and among orderstamps for insert and delete operations for that entry that have been processed by the computer; recording for each entry in the computer whether an operation corresponding to the latest orderstamp is an insert or a delete; labeling a query with an orderstamp; and when processing a query on a computer, identifying entries that are in a subset specified by the query, that are covered by the computer, that have latest orderstamp before the orderstamp of the query, and that have operation type insert corresponding to the latest orderstamp. - View Dependent Claims (3)
-
-
4. A method for addressing inconsistency and ensuring progress in a distributed data store involving one or more computers, comprising the steps of:
-
labeling an insert with an orderstamp, said orderstamp in turn comprising an approximate timestamp that includes a serial identifier, wherein the approximate timestamp comprises an approximate time that the insert originated on an originating computer, and wherein the serial identifier comprises an identifier unique to the originating computer such that each orderstamp is globally unique; labeling a delete with an orderstamp, said orderstamp in turn comprising an approximate timestamp that includes a serial identifier, wherein the approximate timestamp comprises an approximate time that the delete originated on an originating computer, and wherein the serial identifier comprises an identifier unique to the originating computer such that each orderstamp is globally unique; recording for each entry in a computer a history of inserts and deletes and for that entry that have been processed by the computer and corresponding orderstamps; labeling a query with an orderstamp; when processing a query on a computer, identifying entries that are in a subset specified by the query, that are covered by the computer, that have in the history for the entry an orderstamp before the orderstamp of the query, and that have in the history for the entry latest orderstamp before the orderstamp of the query corresponding to an insert.
-
-
5. A method for addressing inconsistency and ensuring progress in a distributed data store involving one or more computers, comprising the steps of:
-
labeling an insert with an orderstamp, said orderstamp in turn comprising an approximate timestamp that includes a serial identifier, wherein the approximate timestamp comprises an approximate time that the insert originated on an originating computer, and wherein the serial identifier comprises an identifier unique to the originating computer such that each orderstamp is globally unique; labeling a delete with an orderstamp, said orderstamp in turn comprising an approximate timestamp that includes a serial identifier, wherein the approximate timestamp comprises an approximate time that the delete originated on an originating computer, and wherein the serial identifier comprises an identifier unique to the originating computer such that each orderstamp is globally unique; recording for each entry in a computer a history of inserts and deletes and for that entry that have been processed by the computer and corresponding orderstamps; labeling a query with an orderstamp; recording for a computer a history of queries processed by the computer and the corresponding orderstamps; when processing a query on a computer, identifying entries that are in a subset specified by the query, that are covered by the computer, that have in the history for the entry an orderstamp before the orderstamp of the query, and that have in the history for the entry latest orderstamp before the orderstamp of the query corresponding to an insert. - View Dependent Claims (6)
-
-
7. A distributed data store comprising:
-
one or more computers having storage, wherein each of the computers comprises a set of modes of operation, each mode using orderstamps, said orderstamp in turn comprising an approximate timestamp that includes a serial identifier, wherein the approximate timestamp comprises an approximate time that an entry originated on an originating computer, and wherein the serial identifier comprises an identifier unique to the originating computer such that each orderstamp is globally unique; means for labeling an entry with an orderstamp; means for recording in the storage for each entry in each of the computers a latest orderstamp and among orderstamps for insert and delete operations for that entry that have been processed by the computer; means for recording for each entry in each of the computers whether an operation corresponding to the latest orderstamp is an insert or a delete; means for labeling a query in each of the computers with an orderstamp; and when processing a query on each of the computers, means for identifying entries that are in a subset specified by the query, that are covered by the computer, that have latest orderstamp before the orderstamp of the query, and that have operation type insert corresponding to the latest orderstamp. - View Dependent Claims (8)
-
-
9. A method for a computer reclaiming storage in one or more computer-related devices periodically, continuously using a low-priority thread, or when storage is needed, comprising the steps of:
-
determining if memory time exceeds settling time, in which case there is no inconsistency due to settling; determining if a computer records a time up to which it has reclaimed storage (cut time), wherein said computer recognizes and reports operations that arrive with orderstamps that have time earlier than said cut time, said orderstamp comprising an approximate timestamp that includes a serial identifier wherein the approximate timestamp comprises an approximate time that the insert originated on an originating computer, and wherein the serial identifier comprises an identifier unique to the originating computer such that each orderstamp is globally unique; determining if a computer processes a query when said cut time of said computer is after a query ceiling, wherein said computer reports as possible sources of inconsistency due to settling any entries that are in a subset specified by said query and have an earliest orderstamp in an entry history after a query ceiling; and determining if a computer processes an insert or delete when said cut time of said computer is later than a time of an orderstamp of an insert or delete operation, wherein said computer includes said operation in a history for an entry only if said entry has no history or if said entry history includes an orderstamp before an orderstamp of the insert or delete operation being processed.
-
-
10. A distributed data store, comprising:
-
one or more computers having storage, wherein each of the computers comprises a set of modes of operation, each mode using orderstamps, said orderstamp in turn comprising an approximate timestamp that includes a serial identifier, wherein the approximate timestamp comprises an approximate time that an entry originated on an originating computer, and wherein the serial identifier comprises an identifier unique to the originating computer such that each orderstamp is globally unique, wherein the orderstamp is used to address any of the following concerns; inconsistency due to duplicate operations; inconsistency due to order of operations; inconsistency due to synchronization, query ceilings; and inconsistency due to settling; wherein said modes comprise; means for labeling an insert with an orderstamp; means for labeling a delete with an orderstamp; means for recording in the storage for each entry in a computer a latest orderstamp and among orderstamps for insert and delete operations for that entry that have been processed by the computer; means for recording for each entry in the computer whether an operation corresponding to the latest orderstamp is an insert or a delete; means for labeling a query with an orderstamp; and when processing a query on a computer, means for identifying entries that are in a subset specified by the query, that are covered by the computer, that have latest orderstamp before the orderstamp of the query, and that have operation type insert corresponding to the latest orderstamp.
-
-
11. A distributed data store, comprising:
-
one or more computers having storage, wherein each of the computers comprises a set of modes of operation, each mode using orderstamps, said orderstamp in turn comprising an approximate timestamp that includes a serial identifier, wherein the approximate timestamp comprises an approximate time that an entry originated on an originating computer, and wherein the serial identifier comprises an identifier unique to the originating computer such that each orderstamp is globally unique, wherein the orderstamp is used to address any of the following concerns; inconsistency due to duplicate operations; inconsistency due to order of operations; inconsistency due to synchronization, query ceilings; and inconsistency due to settling; wherein said modes comprise; means for labeling an insert with an orderstamp; means for labeling a delete with an orderstamp; means for recording in the storage for each entry in a computer a history of inserts and deletes and for that entry that have been processed by the computer and corresponding orderstamps; means for labeling a query with an orderstamp; and when processing a query on a computer, means for identifying entries that are in a subset specified by the query, that are covered by the computer, that have in the history for the entry an orderstamp before the orderstamp of the query, and that have in the history for the entry latest orderstamp before the orderstamp of the query corresponding to an insert.
-
-
12. A distributed data store, comprising:
-
one or more computers having storage, wherein each of the computers comprises a set of modes of operation, each mode using orderstamps, said orderstamp in turn comprising an approximate timestamp that includes a serial identifier, wherein the approximate timestamp comprises an approximate time that an entry originated on an originating computer, and wherein the serial identifier comprises an identifier unique to the originating computer such that each orderstamp is globally unique, wherein the orderstamp is used to address any of the following concerns; inconsistency due to duplicate operations; inconsistency due to order of operations; inconsistency due to synchronization, query ceilings; and inconsistency due to settling; wherein said modes comprise; means for labeling an insert with an orderstamp; means for labeling a delete with an orderstamp; means for recording in the storage for each entry in a computer a history of inserts and deletes and for that entry that have been processed by the computer and corresponding orderstamps; means for labeling a query with an orderstamp; means for recording for a computer a history of queries processed by the computer and the corresponding orderstamps; and when processing a query on a computer, means for identifying entries that are in a subset specified by the query, that are covered by the computer, that have in the history for the entry an orderstamp before the orderstamp of the query, and that have in the history for the entry latest orderstamp before the orderstamp of the query corresponding to an insert.
-
Specification