Partial pre-aggregation in relational database queries
First Claim
Patent Images
1. A method, implemented by a computing device, for processing a database query, comprising:
- partially pre-aggregating records in a database to provide a result that contains at least two records having like grouping column values;
aggregating records derived from the result that contains at least two records having like grouping column values to provide a result that contains records having unique grouping column values; and
partially pre-aggregating the records in the database only if an estimation, based on a calculation of a probability that a record will be absorbed by a group of records already in memory, indicates that a number of records in the result that contains at least two records having like grouping column values is significantly less than a number of records in the database, wherein the estimation is based on factors comprising;
a number of output records, T(N);
a number of input records, N; and
a relationship
T(N)=M+(N−
M)(1−
A(R(M)))=M+(N−
M)Σ
i=1D(1−
pi)R(M);
wherein M records can fit into memory.
3 Assignments
0 Petitions
Accused Products
Abstract
A partial pre-aggregation database operation improves processing efficiency of database queries by reducing the number of records input into a subsequent database operation, provided the query includes a final aggregation. A query optimizer is provided to determine when it is economical to partially pre-aggregate data records and when it is not. The partial pre-aggregation creates a record store in memory as input records are received. The record store is then used by another database operator, which saves the other database operator from having to re-create the record store.
20 Citations
26 Claims
-
1. A method, implemented by a computing device, for processing a database query, comprising:
-
partially pre-aggregating records in a database to provide a result that contains at least two records having like grouping column values; aggregating records derived from the result that contains at least two records having like grouping column values to provide a result that contains records having unique grouping column values; and partially pre-aggregating the records in the database only if an estimation, based on a calculation of a probability that a record will be absorbed by a group of records already in memory, indicates that a number of records in the result that contains at least two records having like grouping column values is significantly less than a number of records in the database, wherein the estimation is based on factors comprising; a number of output records, T(N); a number of input records, N; and a relationship
T(N)=M+(N−
M)(1−
A(R(M)))=M+(N−
M)Σ
i=1D(1−
pi)R(M);wherein M records can fit into memory. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 12, 13, 14, 15, 16)
-
-
10. A relational database computer program stored on a computer-readable medium, the relational database computer program comprising computer-executable instructions that, when executed on a computer, perform steps comprising:
-
receiving a stream of input records; partially pre-aggregating the input records according to a single grouping column to provide a result that contains at least two records having like grouping column values, wherein the partially pre-aggregating the input records is performed if an estimation, based on a calculation of a probability that a record will be absorbed by a group of records already in memory, indicates that a number of records in the result that contains at least two records having like grouping column values is significantly less than a number of records in the stream of input records, wherein the estimation is based on factors comprising; a number of output records, T(N); a number of input records, N; and a relationship;
T(N)=M+(N−
M)(1−
A(R(M)))=M+(N−
M)Σ
i=1D(1−
pi)R(M);wherein M records can fit into memory; joining the partially pre-aggregated records with other data to create a record store; and aggregating records within the record store to provide a result that contains records having unique grouping column values. - View Dependent Claims (11, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26)
-
Specification