HASH JOIN USING COLLABORATIVE PARALLEL FILTERING IN INTELLIGENT STORAGE WITH OFFLOADED BLOOM FILTERS
First Claim
1. A method comprising:
a database server sending to a data storage system:
a) a request for data, the request identifying one or more data units stored in the data storage subsystem, wherein the one or more data units are data units in which the data storage system stores data for a first table; and
b) metadata describing one or more characteristics of a second table;
wherein the request is a communication that, when interpreted by the data storage system, causes the data storage system to retrieve said one or more data units from storage;
wherein the metadata is metadata that, when interpreted by the data storage system, causes the data storage system to generate filtered data based upon the retrieved one or more data units and the one or more characteristics of the second table, as described in the metadata; and
in response to the request, the database server receiving the filtered data from the data storage system;
wherein the method is performed by one or more computing devices.
Abstract
Processing resources at a storage system for a database server are utilized to perform aspects of a join operation that would conventionally be performed by the database server. When requesting a range of data units from a storage system, the database server includes join metadata describing aspects of the join operation for which the data is being requested. The join metadata may be, for instance, a Bloom filter. The storage system reads the requested data from disk as normal. However, prior to sending the requested data back to the database server, the storage system analyzes the raw data based on the join metadata, removing a certain amount of data that is guaranteed to be irrelevant to the join operation. The storage system then returns filtered data to the database server. The database system thereby avoids the unnecessary transfer of certain data between the storage system and the database server.
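The filtering described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `BloomFilter`, the functions `build_join_metadata` and `storage_filter`, the filter sizes, and the sample tables are all hypothetical names and values chosen for illustration. The sketch assumes the database server builds a Bloom filter over the join keys of the second table and that the storage system tests each row of the first table against it before returning data.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; the sizes are illustrative assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def build_join_metadata(second_table_rows, join_column):
    """Database-server side: summarize the second table's join keys."""
    bloom = BloomFilter()
    for row in second_table_rows:
        bloom.add(row[join_column])
    return bloom


def storage_filter(data_units, join_column, bloom):
    """Storage side: drop rows whose key is guaranteed not to join."""
    return [row for unit in data_units for row in unit
            if bloom.might_contain(row[join_column])]


# Hypothetical sample data: two data units holding rows of the first table.
departments = [{"dept_id": 10}, {"dept_id": 20}]
employee_units = [
    [{"emp": "a", "dept_id": 10}, {"emp": "b", "dept_id": 99}],
    [{"emp": "c", "dept_id": 20}, {"emp": "d", "dept_id": 77}],
]
bloom = build_join_metadata(departments, "dept_id")
filtered = storage_filter(employee_units, "dept_id", bloom)
```

A Bloom filter never reports a present key as absent, so every row that can participate in the join survives the filter. It may occasionally let a non-matching row through, which is why the database server would still perform the actual join on the filtered data.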
46 Claims
1. A method comprising:
a database server sending to a data storage system:
a) a request for data, the request identifying one or more data units stored in the data storage subsystem, wherein the one or more data units are data units in which the data storage system stores data for a first table; and
b) metadata describing one or more characteristics of a second table;
wherein the request is a communication that, when interpreted by the data storage system, causes the data storage system to retrieve said one or more data units from storage;
wherein the metadata is metadata that, when interpreted by the data storage system, causes the data storage system to generate filtered data based upon the retrieved one or more data units and the one or more characteristics of the second table, as described in the metadata; and
in response to the request, the database server receiving the filtered data from the data storage system;
wherein the method is performed by one or more computing devices.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method for performing a join operation, the method comprising:
a database server sending to a data storage system:
a) a request for data, the request identifying one or more data units stored in the data storage subsystem, wherein the one or more data units are data units in which the data storage system stores data for a first table; and
b) one or more join filtering conditions;
wherein the request is a communication that, when interpreted by the data storage system, causes:
the data storage system to retrieve said one or more data units from storage;
the data storage system to generate filtered data by applying the one or more join filtering conditions to the retrieved one or more data units; and
in response to the request, the database server receiving the filtered data from the data storage system;
performing the join operation based on the filtered data;
wherein the method is performed by one or more computing devices.
Dependent claims: 11, 12, 13, 14, 15
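The flow recited in claim 10 can be sketched end to end as follows. This is only an illustration under stated assumptions. `DataRequest`, `StorageSystem`, `join_on_server`, and the sample rows are hypothetical names. The join filtering condition is assumed here to be an exact set of join keys taken from the second table, which is just one possible encoding and not one the claim prescribes.

```python
from dataclasses import dataclass


@dataclass
class DataRequest:
    unit_ids: list           # identifies data units holding first-table rows
    join_column: str         # column the join filtering condition applies to
    allowed_keys: frozenset  # assumed encoding of the join filtering condition


class StorageSystem:
    """Hypothetical storage system that can filter before responding."""

    def __init__(self, units):
        self.units = units  # maps unit id -> list of row dicts

    def handle(self, request):
        filtered = []
        for unit_id in request.unit_ids:
            for row in self.units[unit_id]:  # retrieve the data unit
                if row[request.join_column] in request.allowed_keys:
                    filtered.append(row)     # keep rows that can still join
        return filtered


def join_on_server(storage, unit_ids, second_table, join_column):
    # Build side of a hash join over the second table.
    build = {}
    for row in second_table:
        build.setdefault(row[join_column], []).append(row)
    # Send the request together with the join filtering condition.
    request = DataRequest(unit_ids, join_column, frozenset(build))
    filtered = storage.handle(request)
    # Probe side: perform the join on the filtered data only.
    return [{**probe, **match}
            for probe in filtered
            for match in build[probe[join_column]]]


storage = StorageSystem({
    0: [{"emp": "a", "dept_id": 10}, {"emp": "b", "dept_id": 99}],
    1: [{"emp": "c", "dept_id": 20}],
})
second_table = [{"dept_id": 10, "name": "Ops"}, {"dept_id": 20, "name": "R&D"}]
result = join_on_server(storage, [0, 1], second_table, "dept_id")
```

In this sketch the row with `dept_id` 99 never leaves the storage system, so the database server probes only the two rows that actually join.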
16. A method comprising a data storage system performing the steps of:
receiving a request to retrieve data, wherein the request identifies one or more locations of one or more data units in which the requested data is stored at the data storage system;
receiving metadata describing one or more filtering conditions for a join operation to be performed with respect to the requested data;
in response to the request, reading the one or more data units from the one or more locations;
based on the one or more join filtering conditions, filtering data from the one or more data units, thereby generating filtered data;
responding to the request, wherein the response includes the filtered data;
wherein the method is performed by one or more computing devices.
Dependent claims: 17, 18, 19, 20, 21
22. A method comprising a data storage system performing the steps of:
receiving a request for data from a database server;
wherein the request includes join data indicating an operation to join rows of data stored in data units at the data storage system;
in response to the request:
reading a plurality of data units from storage;
generating response data, wherein generating the response data comprises generating the response data by at least filtering the plurality of data units, based on the join data, to exclude one or more rows that are not to be joined for the indicated operation;
sending the response data to the database server.
Dependent claims: 23
24. One or more storage media storing instructions which, when executed by one or more computing devices, cause performance of:
a database server sending to a data storage system:
a) a request for data, the request identifying one or more data units stored in the data storage subsystem, wherein the one or more data units are data units in which the data storage system stores data for a first table; and
b) metadata describing one or more characteristics of a second table;
wherein the request is a communication that, when interpreted by the data storage system, causes the data storage system to retrieve said one or more data units from storage;
wherein the metadata is metadata that, when interpreted by the data storage system, causes the data storage system to generate filtered data based upon the retrieved one or more data units and the one or more characteristics of the second table, as described in the metadata; and
in response to the request, the database server receiving the filtered data from the data storage system;
wherein the method is performed by one or more computing devices.
Dependent claims: 25, 26, 27, 28, 29, 30, 31, 32
33. One or more storage media storing instructions which, when executed by one or more computing devices, cause performance of a join operation, wherein performance of the join operation comprises:
a database server sending to a data storage system:
a) a request for data, the request identifying one or more data units stored in the data storage subsystem, wherein the one or more data units are data units in which the data storage system stores data for a first table; and
b) one or more join filtering conditions;
wherein the request is a communication that, when interpreted by the data storage system, causes:
the data storage system to retrieve said one or more data units from storage;
the data storage system to generate filtered data by applying the one or more join filtering conditions to the retrieved one or more data units; and
in response to the request, the database server receiving the filtered data from the data storage system;
performing the join operation based on the filtered data.
Dependent claims: 34, 35, 36, 37, 38
39. One or more storage media storing instructions which, when executed by one or more computing devices, cause performance of, at a data storage system:
receiving a request to retrieve data, wherein the request identifies one or more locations of one or more data units in which the requested data is stored at the data storage system;
receiving metadata describing one or more filtering conditions for a join operation to be performed with respect to the requested data;
in response to the request, reading the one or more data units from the one or more locations;
based on the one or more join filtering conditions, filtering data from the one or more data units, thereby generating filtered data;
responding to the request, wherein the response includes the filtered data.
Dependent claims: 40, 41, 42, 43, 44
45. One or more storage media storing instructions which, when executed by one or more computing devices, cause performance of, at a data storage system:
receiving a request for data from a database server;
wherein the request includes join data indicating an operation to join rows of data stored in data units at the data storage system;
in response to the request:
reading a plurality of data units from storage;
generating response data, wherein generating the response data comprises generating the response data by at least filtering the plurality of data units, based on the join data, to exclude one or more rows that are not to be joined for the indicated operation;
sending the response data to the database server.
Dependent claims: 46
Specification