Using overlapping partitions of data for query optimization
First Claim
1. A method for executing queries that specify data from a set of data that has been partitioned into a plurality of partitions based on a first key, the method comprising the computer implemented steps of:
- receiving a query that includes a reference to a second key, wherein said second key is not part of said first key but has a predetermined correlation with said first key;
selecting a subset of said plurality of partitions to scan based on said reference to said second key and said predetermined correlation with said first key; and
executing said query by scanning only those partitions of said plurality of partitions that belong to said subset of partitions.
2 Assignments
0 Petitions
Accused Products
Abstract
A method and apparatus for executing queries on a set of data that has been partitioned into a plurality of partitions based on a partitioning key is provided. A query is received that includes a reference to a second key. The second key is not part of the partitioning key but has a predetermined correlation with the partitioning key. This second key is referred to as an overlapping partition key. A subset of the plurality of partitions is selected to be scanned based on the reference to the second key and the predetermined correlation with the partitioning key. The query is then executed by scanning only those partitions of the plurality of partitions that belong to the subset of partitions. The overlapping partition key provides for reduced query execution time even when the partitioning key is not directly involved in the query. Specifically, the overlapping partition key permits a partial table scan in situations that would require a fill table scan with partitioning alone.
-
Citations
37 Claims
-
1. A method for executing queries that specify data from a set of data that has been partitioned into a plurality of partitions based on a first key, the method comprising the computer implemented steps of:
-
receiving a query that includes a reference to a second key, wherein said second key is not part of said first key but has a predetermined correlation with said first key; selecting a subset of said plurality of partitions to scan based on said reference to said second key and said predetermined correlation with said first key; and executing said query by scanning only those partitions of said plurality of partitions that belong to said subset of partitions. - View Dependent Claims (2, 3, 4, 5, 6, 25)
-
-
7. A method of producing query plans for executing queries on a set of data that has been partitioned into a plurality of partitions based upon a first key, the method comprising the computer implemented steps of:
-
receiving a query; selecting a subset of partitions to scan from said plurality of partitions, the selection being performed by if said query includes a reference to a second key and does not refer to said first key, then determining said subset of partitions to scan based on said reference to said second key and a predetermined correlation between said second key and said first key, wherein said second key is not part of said first key; and producing a query plan which includes only those partitions of said plurality of partitions that belong to said subset of partitions. - View Dependent Claims (8, 9, 26)
-
-
10. A method for executing queries that specify data from a set of data that has been partitioned into a plurality of partitions based on a first key, the method comprising the computer implemented steps of:
-
receiving a query that includes a reference to a second key; accessing one or more predicates, wherein said one or more predicates represent a predetermined correlation between said first key and said second key; selecting a subset of said plurality of partitions to scan based on said reference to said second key and said one or more predicates; and executing said query by scanning only those partitions of said plurality of partitions that belong to said subset of partitions. - View Dependent Claims (11, 12, 13, 14)
-
-
15. A method for executing queries that request data from a set of data that has been partitioned into a plurality of partitions based on a first key, the method comprising the computer implemented steps of:
-
receiving a query that includes a reference to an attribute that is part of a second key, wherein said second key is not part of said first key but has a predetermined correlation with said first key; accessing a set of values associated with said attribute; selecting a subset of said plurality of partitions to scan based on said reference to said attribute and said set of values; and executing said query by scanning only those partitions of said plurality of partitions that belong to said subset of partitions. - View Dependent Claims (16, 17)
-
-
18. A method for executing queries on a set of data that has been partitioned into a plurality of partitions based on a first key, wherein said first key includes one or more attributes, the method comprising the computer implemented steps of:
-
receiving a query that includes a reference to a value from one of said one or more attributes, wherein a first set of data containing said value is stored in a first partition of said plurality of partitions and a second set of data contaning said value is stored in a second partition of said plurality of partitions; selecting a subset of said plurality of partitions to scan based on said reference, wherein said subset includes said first and second partitions; and executing said query by scanning only those partitions of said plurality of partitions that belong to said subset of partitions. - View Dependent Claims (19, 20, 21)
-
-
22. A method for executing queries on a set of data that has been partitioned into a plurality of partitions, the method comprising the computer implemented steps of:
-
receiving a query that includes a reference to a first key; accessing data that indicates upper and lower boundary values for said plurality of partitions, wherein said upper and lower boundary values of each of said plurality of partitions are independent of the upper and lower boundary values of the other of said plurality of partitions; and selecting a subset of said plurality of partitions upon which to execute said query based on said reference and said upper and lower boundary values of each of said plurality of partitions.
-
-
23. A computer system comprising:
-
a processor; and a memory coupled to said processor, said memory having stored therein a first set of data that has been partitioned into a plurality of partitions, a second set of data indicating upper boundary values for each of said plurality of partitions, a third set of data, separate from said second set of data, indicating lower boundary values for each of said plurality of partitions, and sequences of instructions which, when executed by said processor, cause said processor to select a subset of said plurality of partitions to scan based on a set of query predicates and said upper and lower boundary values of said plurality of partitions.
-
-
24. A machine-readable medium having stored thereon data representing sequences of instructions, said sequences of instructions including sequences of instructions which, when executed by a processor, cause said processor to perform the steps of:
-
receiving a query on a set of data, wherein said set of data has been partitioned into a plurality of partitions based on a first key, wherein said query includes a reference to a second key, and wherein said second key is not part of said first key but has a predetermined correlation with said first key; selecting a subset of said plurality of partitions to scan based on said reference to said second key and said predetermined correlation with said first key; and executing said query by scanning only those partitions of said plurality of partitions that belong to said subset of partitions. - View Dependent Claims (27, 28, 29, 30, 31, 32)
-
-
33. A machine-readable medium carrying one or more sequences of instructions for producing query plans for executing queries on a set of data that has been partitioned into a plurality of partitions based upon a first key, wherein execution of the one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:
-
receiving a query; selecting a subset of partitions to scan from said plurality of partitions, the selection being performed by if said query includes a reference to a second key and does not refer to said first key, then determining said subset of partitions to scan based on said reference to said second key and a predetermined correlation between said second key and said first key, wherein said second key is not part of said first key; and producing a query plan which includes only those partitions of said plurality of partitions that belong to said subset of partitions.
-
-
34. A machine-readable medium carrying one or more sequences of instructions for executing queries that specify data from a set of data that has been partitioned into a plurality of partitions based on a first key, wherein execution of the one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:
-
receiving a query that includes a reference to a second key; accessing one or more predicates, wherein said one or more predicates represent a predetermined correlation between said first key and said second key; selecting a subset of said plurality of partitions to scan based on said reference to said second key and said one or more predicates; and executing said query by scanning only those partitions of said plurality of partitions that belong to said subset of partitions.
-
-
35. A machine-readable medium carrying one or more sequences of instructions for executing queries that request data from a set of data that has been partitioned into a plurality of partitions based on a first key, wherein execution of the one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:
-
receiving a query that includes a reference to an attribute that is part of a second key, wherein said second key is not part of said first key but has a predetermined correlation with said first key; accessing a set of values associated with said attribute; selecting a subset of said plurality of partitions to scan based on said reference to said attribute and said set of values; and executing said query by scanning only those partitions of said plurality of partitions that belong to said subset of partitions.
-
-
36. A machine-readable medium carrying one or more sequences of instructions for executing queries on a set of data that has been partitioned into a plurality of partitions based on a first key, wherein said first key includes one or more attributes, wherein execution of the one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:
-
receiving a query that includes a reference to a value from one of said one or more attributes, wherein a first set of data containing said value is stored in a first partition of said plurality of partitions and a second set of data containing said value is stored in a second partition of said plurality of partitions; selecting a subset of said plurality of partitions to scan based on said reference, wherein said subset includes said first and second partitions; and executing said query by scanning only those partitions of said plurality of partitions that belong to said subset of partitions.
-
-
37. A machine-readable medium carrying one or more sequences of instructions for executing queries on a set of data that has been partitioned into a plurality of partitions, wherein execution of the one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:
-
receiving a query that includes a reference to a first key; accessing data that indicates upper and lower boundary values for said plurality of partitions, wherein said upper and lower boundary values of each of said plurality of partitions are independent of the upper and lower boundary values of the other of said plurality of partitions; and selecting a subset of said plurality of partitions upon which to execute said query based on said reference and said upper and lower boundary values of each of said plurality of partitions.
-
Specification