Filtering queried data on data stores
Abstract
A data set may be distributed over many data stores, and a query may be distributively evaluated by several data stores with the results combined to form a query result (e.g., utilizing a MapReduce framework). However, such architectures may violate security principles by performing sophisticated processing, including the execution of arbitrary code, on the same machines that store the data. Instead of processing queries, a data store may be configured only to receive requests specifying one or more filtering criteria, and to provide the data items satisfying the filtering criteria. A compute node may apply a query by generating a request including one or more filter criteria, providing the request to a data node, and applying the remainder of the query (including sophisticated processing, and potentially the execution of arbitrary code) to the data items provided by the data node, thereby improving the security and efficiency of query processing.
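The restriction described in the abstract, a data store that accepts only filtering criteria and never executable query logic, can be illustrated with a minimal sketch. Everything named here (`handle_request`, `RejectedRequest`, the JSON request shape, and the `eq`/`lt`/`gt` operator names) is a hypothetical choice made for illustration and is not taken from the patent; the sketch also assumes that a record must satisfy every listed filter to be returned.

```python
import json
import operator

# Hypothetical whitelist: the data node understands only these comparisons.
ALLOWED_OPS = {"eq": operator.eq, "lt": operator.lt, "gt": operator.gt}


class RejectedRequest(ValueError):
    """Raised when a request contains anything other than plain filters."""


def handle_request(raw_request: str, records: list[dict]) -> list[dict]:
    """Data-node side: return the records matching every filter in the request.

    The request is declarative JSON; no code from the caller is ever executed.
    """
    request = json.loads(raw_request)
    if not isinstance(request, dict) or set(request) != {"filters"}:
        raise RejectedRequest("only a 'filters' list is accepted")
    if not isinstance(request["filters"], list):
        raise RejectedRequest("'filters' must be a list")

    predicates = []
    for f in request["filters"]:
        well_formed = isinstance(f, dict) and set(f) == {"attribute", "op", "value"}
        if not well_formed or f["op"] not in ALLOWED_OPS:
            raise RejectedRequest(f"unsupported filter: {f!r}")
        predicates.append((f["attribute"], ALLOWED_OPS[f["op"]], f["value"]))

    # A record lacking a filtered attribute is treated as not matching.
    return [
        r for r in records
        if all(attr in r and op(r[attr], value) for attr, op, value in predicates)
    ]


# Illustrative data and request (hypothetical values).
records = [{"id": 1, "size": 4}, {"id": 2, "size": 9}]
matched = handle_request(
    '{"filters": [{"attribute": "size", "op": "gt", "value": 5}]}', records
)
```

In this sketch, any further processing of `matched` would run on the compute node, so the machine holding the data never evaluates caller-supplied logic.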
20 Claims
1. A method of fulfilling queries targeting a data set comprising a set of records that are stored in a data store accessible to a computer having a processor, the method comprising:
- executing, on the processor, instructions that cause the computer to:
receive a query targeting the data set, wherein the query specifies a set of selected attributes of the data set and a computation to be applied only to a portion of the data set that matches at least one filter criterion;
partition the query into a filter portion that filters the data set into a filtered data subset according to the at least one filter criterion, and a computation portion specifying at least one computation to be performed only on the filtered data subset of the data store;
according to the filter portion of the query, generate a filtering request to retrieve a filtered data subset comprising a first portion of the data set satisfying the at least one filter criterion and excluding a second portion of the data set not satisfying the at least one filter criterion, according to the at least one filter criterion distinguishing the first portion from the second portion of the data set;
send the generated filtering request to the data store to cause the data store to return the selected attributes of all records satisfying the filter criterion and exclude all records not satisfying the filter criterion; and
responsive to receiving the selected attributes of the filtered data subset from the data store responsive to the filtering request, apply the computation portion of the query to the filtered data subset.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A device that applies queries to a data set comprising a set of records that are stored by a remote data store accessible through a remote server, the device comprising:
- a processor, and a memory storing instructions that, when executed by the processor, cause the device to:
receive a query targeting the data set, wherein the query specifies a set of selected attributes on the data set and a computation to be applied only to a portion of the data set that matches at least one filter criterion;
partition the query into a filter portion that filters the data set into a filtered data subset that satisfies the at least one filter criterion, and a computation portion specifying at least one computation to be performed only on the filtered data subset;
according to the filter portion of the query, generate a filtering request to retrieve a filtered data subset comprising a first portion of the data set satisfying the at least one filter criterion and excluding a second portion of the data set not satisfying the at least one filter criterion, according to the at least one filter criterion distinguishing the first portion from the second portion of the data set;
send the generated filtering request to the remote data store to return the selected attributes of all records satisfying the filter criterion and excluding all records not satisfying the filter criterion; and
responsive to receiving the selected attributes of the filtered data subset from the remote data store responsive to the filtering request, apply the computation portion of the query to the filtered data subset.
Dependent claims: 15, 16
17. A memory device storing instructions that, when executed on a processor of a computer having access to a data store, cause the device to apply queries to a data set stored by the data store, by:
receive a query targeting the data set and specifying a set of selected attributes of the data set and a computation to be applied only to a portion of the data set that matches at least one filter criterion;
partition the query into a filter portion that filters the data set into a filtered data subset according to the at least one filter criterion, and a computation portion specifying at least one computation to be performed only on the filtered data subset of the data store;
according to the filter portion of the query, generate a filtering request to retrieve a filtered data subset comprising a first portion of the data set satisfying the at least one filter criterion and excluding a second portion of the data set not satisfying the at least one filter criterion, according to the at least one filter criterion distinguishing the first portion from the second portion of the data set;
send the generated filtering request to the data store to cause the data store to return the selected attributes of all records satisfying the filter criterion and exclude all records not satisfying the filter criterion; and
responsive to receiving the selected attributes of the filtered data subset from the data store, apply the computation portion of the query only to the filtered data subset.
Dependent claims: 18, 19, 20
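The sequence recited in the independent claims (receive a query, partition it, generate and send a filtering request, then apply the computation to the returned subset) can be sketched from the compute node's side as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the class and function names (`Query`, `FilterCriterion`, `FilteringRequest`, `InMemoryDataStore`, `partition`, `fulfill`) are hypothetical, the data store is simulated in memory, and a record is assumed to belong to the filtered subset only when it satisfies every criterion.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable

Record = dict[str, Any]

# Hypothetical comparison operators a filter criterion may use.
_OPS = {"==": operator.eq, "<": operator.lt, ">": operator.gt}


@dataclass(frozen=True)
class FilterCriterion:
    attribute: str
    op: str  # one of the keys of _OPS
    value: Any


@dataclass(frozen=True)
class Query:
    selected_attributes: tuple[str, ...]
    criteria: tuple[FilterCriterion, ...]
    computation: Callable[[list[Record]], Any]  # runs only on the compute node


@dataclass(frozen=True)
class FilteringRequest:
    selected_attributes: tuple[str, ...]
    criteria: tuple[FilterCriterion, ...]


def partition(query: Query) -> tuple[FilteringRequest, Callable[[list[Record]], Any]]:
    """Split a query into its filter portion and its computation portion."""
    request = FilteringRequest(query.selected_attributes, query.criteria)
    return request, query.computation


class InMemoryDataStore:
    """Stand-in for a data store that can only filter and project records."""

    def __init__(self, records: list[Record]) -> None:
        self._records = list(records)

    def filter(self, request: FilteringRequest) -> list[Record]:
        subset = []
        for record in self._records:
            if all(_OPS[c.op](record[c.attribute], c.value) for c in request.criteria):
                # Return only the selected attributes of matching records.
                subset.append({a: record[a] for a in request.selected_attributes})
        return subset


def fulfill(query: Query, store: InMemoryDataStore) -> Any:
    """Compute-node side: send the filter portion, then compute locally."""
    request, computation = partition(query)
    filtered_subset = store.filter(request)
    return computation(filtered_subset)


# Illustrative usage with made-up records.
store = InMemoryDataStore([
    {"name": "a", "region": "EU", "amount": 10},
    {"name": "b", "region": "US", "amount": 20},
    {"name": "c", "region": "EU", "amount": 5},
])
query = Query(
    selected_attributes=("amount",),
    criteria=(FilterCriterion("region", "==", "EU"),),
    computation=lambda rows: sum(r["amount"] for r in rows),
)
total = fulfill(query, store)
```

Here the store sees only the attribute list and the criterion on `region`; the summation passed as `computation` never leaves the compute node.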
Specification