Front end bloom filters in distributed databases
First Claim
1. A computing apparatus comprising:
one or more computer readable storage media;
a processing system operatively coupled with the one or more computer readable storage media; and
program instructions stored on the one or more computer readable storage media, that when executed by the processing system, direct the processing system to at least:
provide an interface to a database service that hosts at least a data store spanning a plurality of storage elements distributed with respect to each other;
receive, in the interface, lookup requests issued by requesting entities to determine if target keys indicated by the lookup requests are presently stored by the data store;
process the lookup requests with at least a first bloom filter to determine presence statuses comprising absence or potential presence of the target keys in the data store, wherein the first bloom filter is initialized by at least performing a hashing process on data stored into the data store;
based at least on determining the target keys are absent from the data store, indicate to the requesting entities the absence as the presence statuses responsive to the lookup requests; and
based at least on determining the target keys are potentially present in the data store, process the target keys with one or more second bloom filters corresponding individually to the plurality of storage elements to determine which one or more storage elements among the plurality of storage elements potentially store the target keys, issue one or more queries to the one or more storage elements to determine confirmed presence statuses of the target keys, and indicate the confirmed presence statuses to the requesting entities.
Abstract
Systems, methods, apparatuses, and software for distributed database systems in computing environments are provided herein. In one example, a method of operating a database system is provided that includes providing an interface to a database service that hosts at least a data store across a plurality of storage elements distributed with respect to each other, and receiving, in the interface, lookup requests to determine if first keys indicated by the lookup requests are present in the data store. The method includes processing the lookup requests with at least a bloom filter initialized with second keys associated with the data store to determine presence statuses of the first keys with respect to the data store, and indicating the presence statuses responsive to the lookup requests.
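The method summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `BloomFilter` class, the `presence_status` helper, the filter size, the hash construction (salted SHA-256), and the sample keys are all assumptions chosen for illustration. The sketch initializes a filter with keys already in a data store and then answers lookups with one of two presence statuses.

```python
import hashlib


class BloomFilter:
    """Illustrative bloom filter; the bit array is held in a Python int."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def presence_status(bloom, key):
    # A bloom filter has no false negatives, so a miss means the key is absent.
    # A hit only means the key may be present (false positives are possible).
    return "potentially present" if bloom.might_contain(key) else "absent"


# Hypothetical keys already stored in the data store (the "second keys").
stored_keys = ["user:1", "user:2", "user:3"]
front_end_filter = BloomFilter()
for k in stored_keys:
    front_end_filter.add(k)

# Lookup requests carrying the "first keys".
statuses = {k: presence_status(front_end_filter, k) for k in ["user:2", "user:99"]}
```

A stored key such as `user:2` always reports "potentially present". A key that was never stored usually reports "absent", but may occasionally report "potentially present" because of hash collisions.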
20 Claims
1. A computing apparatus comprising:
one or more computer readable storage media;
a processing system operatively coupled with the one or more computer readable storage media; and
program instructions stored on the one or more computer readable storage media, that when executed by the processing system, direct the processing system to at least:
provide an interface to a database service that hosts at least a data store spanning a plurality of storage elements distributed with respect to each other;
receive, in the interface, lookup requests issued by requesting entities to determine if target keys indicated by the lookup requests are presently stored by the data store;
process the lookup requests with at least a first bloom filter to determine presence statuses comprising absence or potential presence of the target keys in the data store, wherein the first bloom filter is initialized by at least performing a hashing process on data stored into the data store;
based at least on determining the target keys are absent from the data store, indicate to the requesting entities the absence as the presence statuses responsive to the lookup requests; and
based at least on determining the target keys are potentially present in the data store, process the target keys with one or more second bloom filters corresponding individually to the plurality of storage elements to determine which one or more storage elements among the plurality of storage elements potentially store the target keys, issue one or more queries to the one or more storage elements to determine confirmed presence statuses of the target keys, and indicate the confirmed presence statuses to the requesting entities.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
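The two-stage lookup recited in claim 1 can be sketched as follows. This is a hedged illustration rather than the claimed apparatus: `BloomFilter`, `StorageElement`, `FrontEnd`, the element names, and the sample keys are hypothetical, and each storage element is modeled as an in-memory dictionary. A first filter covers the whole data store, one second filter exists per storage element, and only elements whose second filter reports a possible match are queried.

```python
import hashlib


class BloomFilter:
    """Illustrative bloom filter backed by a Python int used as a bit array."""

    def __init__(self, num_bits=2048, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class StorageElement:
    """Hypothetical stand-in for one distributed storage element."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.queries = 0  # counts authoritative queries received

    def contains(self, key):
        self.queries += 1
        return key in self.data


class FrontEnd:
    """Interface that screens lookups before any storage element is queried."""

    def __init__(self, elements):
        self.elements = elements
        self.first_filter = BloomFilter()
        self.second_filters = {e.name: BloomFilter() for e in elements}
        # Initialize both tiers by hashing the keys already stored.
        for e in elements:
            for key in e.data:
                self.first_filter.add(key)
                self.second_filters[e.name].add(key)

    def lookup(self, key):
        # Stage 1: a miss in the first filter is a definitive absence.
        if not self.first_filter.might_contain(key):
            return "absent"
        # Stage 2: narrow the search to elements that may hold the key.
        candidates = [e for e in self.elements
                      if self.second_filters[e.name].might_contain(key)]
        # Confirm with real queries, since filter hits can be false positives.
        if any(e.contains(key) for e in candidates):
            return "present"
        return "absent"


elements = [
    StorageElement("element-a", {"order:7": "widget"}),
    StorageElement("element-b", {"order:8": "gadget"}),
]
front_end = FrontEnd(elements)
hit = front_end.lookup("order:7")
miss = front_end.lookup("order:404")
```

In this sketch `hit` is "present" and `miss` is "absent". In the common case the miss is answered by the first filter alone, so no storage element is contacted for it.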
9. A method of operating a distributed database system, the method comprising:
providing an interface to a database service that hosts at least a data store spanning a plurality of storage elements distributed with respect to each other;
receiving, in the interface, lookup requests issued by requesting entities to determine if target keys indicated by the lookup requests are presently stored by the data store;
processing the lookup requests with at least a first bloom filter to determine presence statuses comprising absence or potential presence of the target keys in the data store, wherein the bloom filter is initialized by at least performing a hashing process on data stored into the data store;
based at least on determining the target keys are absent from the data store, indicating to the requesting entities the absence as the presence statuses responsive to the lookup requests; and
based at least on determining the target keys are potentially present in the data store, processing the target keys with one or more second bloom filters corresponding individually to the plurality of storage elements to determine which one or more storage elements among the plurality of storage elements potentially store the target keys, issuing one or more queries to the one or more storage elements to determine confirmed presence statuses of the target keys, and indicating the confirmed presence statuses to the requesting entities.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A computing apparatus comprising:
one or more computer readable storage media;
a processing system operatively coupled with the one or more computer readable storage media; and
program instructions stored on the one or more computer readable storage media, that when executed by the processing system, direct the processing system to at least:
provide an interface to a database service that distributes one or more databases over a plurality of data centers, the interface configured to receive requests issued by requesting entities for key lookups among the one or more databases;
responsive to the requests for key lookups, employ at least one primary bloom filter cached locally to the interface to determine if data keys indicated by the requests for key lookups are potentially present in or absent from the one or more databases distributed over the plurality of data centers, wherein the at least one primary bloom filter is initialized using current data keys corresponding to data presently stored in the one or more databases;
based at least on determining the data keys are absent from the one or more databases, indicate to the requesting entities presence statuses responsive to the requests for key lookups indicating absences of the data keys with respect to the one or more databases; and
based at least on determining the data keys are potentially present in the one or more databases, process the data keys with one or more secondary bloom filters corresponding individually to the plurality of data centers to determine which one or more data centers among the plurality of data centers potentially store the data keys, issue one or more queries to the one or more data centers to determine confirmed presence statuses of the data keys, and indicate the confirmed presence statuses to the requesting entities.
Dependent claims: 18, 19, 20
Specification