Data management platform
First Claim
1. A computer-implemented method comprising:
maintaining, by a first computing member of a distributed data management system, a data region comprising a plurality of data entry keys and respective data entry values;
maintaining, by the first computing member of the distributed data management system, a plurality of log entries in an event file of an operation log for the data region maintained by the first computing member of the distributed data management system, wherein the event file stores log entries representing respective requests to create or update a respective data entry of the data region, wherein each log entry in the event file has a unique event identifier;
generating, by the first computing member in operational memory, a first index that stores a mapping between data entry values and respective data entry keys;
generating, by the first computing member in non-operational memory, an index reference file that stores a mapping between data entry values occurring in the first index and respective unique event identifiers of log entries representing respective requests to create or update data entry keys to have one of the data entry values occurring in the first index;
receiving, by a second computing member of the distributed data management system, a request to generate a second index in operational memory from the index reference file stored in non-operational memory; and
in response to the request, generating, by the second computing member in operational memory, a second index that stores a mapping between data entry values stored in the index reference file and respective unique event identifiers of log entries stored in the index reference file, and wherein each log entry represented in the second index can be used by the second computing member to restore a mapping between a particular data entry value and one or more data entry keys that was previously represented in the first index.
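The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not taken from the specification: the function names, the use of a JSON string to stand in for the index reference file in non-operational memory, the choice to reference only the latest log entry per key, and the assumption that the second member can read the same event file are all hypothetical choices made for illustration.

```python
import json
from collections import defaultdict


def apply_put(region, event_file, key, value):
    """Log a create/update request under a unique event identifier, then apply it."""
    event_id = len(event_file) + 1  # hypothetical id scheme: sequential integers
    event_file[event_id] = (key, value)
    region[key] = value
    return event_id


def build_first_index(region):
    """First index (operational memory): data entry value -> set of data entry keys."""
    index = defaultdict(set)
    for key, value in region.items():
        index[value].add(key)
    return dict(index)


def build_index_reference(event_file, region):
    """Index reference file: data entry value -> event identifiers.

    Assumption: only the most recent log entry for each key is referenced,
    so stale updates never leak into a restored index.
    """
    latest = {}
    for event_id in sorted(event_file):
        key, _ = event_file[event_id]
        latest[key] = event_id
    reference = defaultdict(list)
    for key, event_id in latest.items():
        reference[region[key]].append(event_id)
    # A JSON string stands in for a file written to non-operational memory.
    return json.dumps({value: sorted(ids) for value, ids in reference.items()})


def build_second_index(reference_text):
    """Second index (operational memory): data entry value -> event identifiers."""
    return {value: list(ids) for value, ids in json.loads(reference_text).items()}


def restore_keys(second_index, event_file, value):
    """Follow event identifiers back to log entries to recover the keys for a value."""
    return {event_file[event_id][0] for event_id in second_index.get(value, [])}


# First member: maintain the region and the event file.
region, event_file = {}, {}
apply_put(region, event_file, "k1", "red")
apply_put(region, event_file, "k2", "blue")
apply_put(region, event_file, "k1", "blue")  # update supersedes the first entry
apply_put(region, event_file, "k3", "red")

first_index = build_first_index(region)
reference_text = build_index_reference(event_file, region)

# Second member: rebuild from the reference file, then restore value -> keys.
second_index = build_second_index(reference_text)
restored = {value: restore_keys(second_index, event_file, value) for value in second_index}
```

In this sketch the second index holds only event identifiers, so it can be built without scanning the data region; the value-to-key mapping is recovered by reading the referenced log entries.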
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributed data management. One of the methods includes maintaining, by a first member in a distributed data management system having multiple computing members installed on multiple respective computers, a first garbage collection version vector that includes, for each member in the distributed data management system, a garbage collection version that represents a number of garbage collection processes performed by the member on a respective copy of a replicated data region maintained by the member in the data management system. If the first garbage collection version vector is different than a second garbage collection version vector received from a different provider member, a first replication process is performed that is different than a second replication process that is performed when the first garbage collection version vector matches the second garbage collection version vector.
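The version-vector comparison described in the abstract can be sketched in Python as follows. This is an illustrative assumption, not the patented implementation: the vector is modeled as a dictionary from a hypothetical member identifier to a garbage collection count, and the two replication processes are represented only by placeholder labels because the abstract does not say what each process does.

```python
def record_garbage_collection(vector, member_id):
    """Return a copy of the vector with the member's garbage collection version incremented."""
    updated = dict(vector)
    updated[member_id] = updated.get(member_id, 0) + 1
    return updated


def choose_replication_process(local_vector, provider_vector):
    """Pick a replication process by comparing garbage collection version vectors.

    The labels are placeholders: "first_process" is used when the vectors
    differ, "second_process" when they match, mirroring the abstract's wording.
    """
    if local_vector != provider_vector:
        return "first_process"
    return "second_process"


# Two members that start with identical garbage collection histories.
local = {"member-a": 2, "member-b": 0}
provider = {"member-a": 2, "member-b": 0}
when_matching = choose_replication_process(local, provider)

# The provider runs one more garbage collection pass on its copy of the region.
provider = record_garbage_collection(provider, "member-b")
when_different = choose_replication_process(local, provider)
```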
18 Claims
1. A computer-implemented method comprising:
maintaining, by a first computing member of a distributed data management system, a data region comprising a plurality of data entry keys and respective data entry values;
maintaining, by the first computing member of the distributed data management system, a plurality of log entries in an event file of an operation log for the data region maintained by the first computing member of the distributed data management system, wherein the event file stores log entries representing respective requests to create or update a respective data entry of the data region, wherein each log entry in the event file has a unique event identifier;
generating, by the first computing member in operational memory, a first index that stores a mapping between data entry values and respective data entry keys;
generating, by the first computing member in non-operational memory, an index reference file that stores a mapping between data entry values occurring in the first index and respective unique event identifiers of log entries representing respective requests to create or update data entry keys to have one of the data entry values occurring in the first index;
receiving, by a second computing member of the distributed data management system, a request to generate a second index in operational memory from the index reference file stored in non-operational memory; and
in response to the request, generating, by the second computing member in operational memory, a second index that stores a mapping between data entry values stored in the index reference file and respective unique event identifiers of log entries stored in the index reference file, and wherein each log entry represented in the second index can be used by the second computing member to restore a mapping between a particular data entry value and one or more data entry keys that was previously represented in the first index.
Dependent claims: 2, 3, 4, 5, 6.
7. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
maintaining, by a first computing member of a distributed data management system, a data region comprising a plurality of data entry keys and respective data entry values;
maintaining, by the first computing member of the distributed data management system, a plurality of log entries in an event file of an operation log for the data region maintained by the first computing member of the distributed data management system, wherein the event file stores log entries representing respective requests to create or update a respective data entry of the data region, wherein each log entry in the event file has a unique event identifier;
generating, by the first computing member in operational memory, a first index that stores a mapping between data entry values and respective data entry keys;
generating, by the first computing member in non-operational memory, an index reference file that stores a mapping between data entry values occurring in the first index and respective unique event identifiers of log entries representing respective requests to create or update data entry keys to have one of the data entry values occurring in the first index;
receiving, by a second computing member of the distributed data management system, a request to generate a second index in operational memory from the index reference file stored in non-operational memory; and
in response to the request, generating, by the second computing member in operational memory, a second index that stores a mapping between data entry values stored in the index reference file and respective unique event identifiers of log entries stored in the index reference file, and wherein each log entry represented in the second index can be used by the second computing member to restore a mapping between a particular data entry value and one or more data entry keys that was previously represented in the first index.
Dependent claims: 8, 9, 10, 11, 12.
13. A computer program product, encoded on one or more non-transitory computer storage media, comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
maintaining, by the first computing member of the distributed data management system, a plurality of log entries in an event file of an operation log for the data region maintained by the first computing member of the distributed data management system, wherein the event file stores log entries representing respective requests to create or update a respective data entry of the data region, wherein each log entry in the event file has a unique event identifier;
generating, by the first computing member in operational memory, a first index that stores a mapping between data entry values and respective data entry keys;
generating, by the first computing member in non-operational memory, an index reference file that stores a mapping between data entry values occurring in the first index and respective unique event identifiers of log entries representing respective requests to create or update data entry keys to have one of the data entry values occurring in the first index;
receiving, by a second computing member of the distributed data management system, a request to generate a second index in operational memory from the index reference file stored in non-operational memory; and
in response to the request, generating, by the second computing member in operational memory, a second index that stores a mapping between data entry values stored in the index reference file and respective unique event identifiers of log entries stored in the index reference file, and wherein each log entry represented in the second index can be used by the second computing member to restore a mapping between a particular data entry value and one or more data entry keys that was previously represented in the first index.
Dependent claims: 14, 15, 16, 17, 18.
Specification