Compressing database workloads
First Claim
1. A method to compress a workload including a plurality of statements comprising:
- providing a distance function in a database management system for a pair of statements within an initial workload W containing a plurality of statements;
- compressing the initial workload W to generate a compressed workload W′ based on the distance function, the compressed workload W′ including a subset of statements of the plurality of statements of the initial workload W;
- determining a generation time indicating a length of time to generate the compressed workload W′;
- establishing a compressed running time indicating a length of time for an application to run the compressed workload W′;
- establishing a total running time of the application indicating a length of time for the application to run the initial workload W; and
- estimating a total running time which includes a sum of the generation time and the compressed running time; and
- determining a limit on the total running time, the limit on the total running time being less than the initial running time; and
- outputting the generated compressed workload W′.
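For illustration only, the compression step of the claim (a distance function over pairs of statements, and a compressed workload W′ that is a subset of W) might be sketched as follows. This is not the patented algorithm: the token-based Jaccard distance, the greedy selection rule, the `threshold` parameter, and the function names `token_distance` and `compress` are all assumptions chosen to keep the example small.

```python
from typing import Callable, List


def token_distance(a: str, b: str) -> float:
    """Hypothetical distance function: Jaccard distance over lowercase tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    return 1.0 - len(tokens_a & tokens_b) / len(union)


def compress(
    workload: List[str],
    distance: Callable[[str, str], float],
    threshold: float,
) -> List[str]:
    """Greedy sketch: keep a statement only if it is farther than
    `threshold` from every statement already kept.

    The result is always a subset of the input workload.
    """
    kept: List[str] = []
    for stmt in workload:
        if all(distance(stmt, k) > threshold for k in kept):
            kept.append(stmt)
    return kept


W = [
    "SELECT a FROM t WHERE x = 1",
    "SELECT a FROM t WHERE x = 2",
    "SELECT b FROM u",
]
W_prime = compress(W, token_distance, threshold=0.3)
```

In this toy run the second statement differs from the first only in its constant, so it falls within the threshold and is dropped, while the third statement is kept. An application-specific distance function could be passed in place of `token_distance`.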
Abstract
Relational database applications such as index selection, histogram tuning, approximate query processing, and statistics selection have recognized the importance of leveraging workloads. Often these applications are presented with large workloads, i.e., a set of SQL DML statements, as input. A key factor affecting the scalability of such applications is the size of the workload. The invention concerns workload compression which helps improve the scalability of such applications. The exemplary embodiment is broadly applicable to a variety of workload-driven applications, while allowing for incorporation of application specific knowledge. The process is described in detail in the context of two workload-driven applications: index selection and approximate query processing.
14 Citations
43 Claims
1. A method to compress a workload including a plurality of statements comprising:
- providing a distance function in a database management system for a pair of statements within an initial workload W containing a plurality of statements;
- compressing the initial workload W to generate a compressed workload W′ based on the distance function, the compressed workload W′ including a subset of statements of the plurality of statements of the initial workload W;
- determining a generation time indicating a length of time to generate the compressed workload W′;
- establishing a compressed running time indicating a length of time for an application to run the compressed workload W′;
- establishing a total running time of the application indicating a length of time for the application to run the initial workload W; and
- estimating a total running time which includes a sum of the generation time and the compressed running time; and
- determining a limit on the total running time, the limit on the total running time being less than the initial running time; and
- outputting the generated compressed workload W′.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
14. A computer system to compress a workload comprising:
- a processor;
- a database management system that receives an initial workload W, the initial workload W including an initial plurality of statements;
- an evaluation component that determines a distance function for a pair of statements within the initial plurality of statements of the initial workload W; and
- a search component which evaluates the distance function and compresses the initial workload W to generate a compressed workload W′ including a compressed plurality of statements based on the distance function,
- wherein the database management estimating a total running time including a sum of a running time of an application on the compressed workload W′ plus a time taken to generate the compressed workload W′ further outputs the compressed workload W′.
View Dependent Claims (15, 16, 17)
18. A computer system to compress a workload comprising:
- a processor;
- a database management system that receives an initial workload W, the initial workload W including an initial plurality of statements;
- an evaluation component that determines a distance function for a pair of statements within the initial plurality of statements of the workload W;
- a search component which compresses the initial workload W to generate a compressed workload W′ including a compressed plurality of statements based on the distance function, the search component estimating:
  - an initial running time which is a running time of an application on the initial workload W;
  - a total running time which is a sum of a running time of the application on the compressed workload W′ plus a time to generate the compressed workload W′; and
  - a limit on the total running time, the limit on the total running time being less than the initial running time,
- wherein the database management system receives the compressed workload W′ and outputs the compressed workload W′.
View Dependent Claims (19, 20, 21, 22, 23, 24, 25, 26, 27)
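The three quantities estimated in claim 18 (initial running time, total running time, and a limit on the total running time) can be related with a small bookkeeping sketch. The class name `CompressionTiming`, its field names, and the `meets_limit` check are hypothetical and are only one reading of the claim language; the time values are made-up numbers, not measurements.

```python
from dataclasses import dataclass


@dataclass
class CompressionTiming:
    """Hypothetical record of the timings named in the claim (any one unit)."""

    initial_running_time: float      # application run on the initial workload W
    generation_time: float           # time to generate the compressed workload W'
    compressed_running_time: float   # application run on W'

    @property
    def total_running_time(self) -> float:
        # Total = time to generate W' plus time to run the application on W'.
        return self.generation_time + self.compressed_running_time

    def meets_limit(self, limit: float) -> bool:
        # The claim requires the limit itself to be below the initial running time.
        if limit >= self.initial_running_time:
            raise ValueError("limit must be less than the initial running time")
        return self.total_running_time <= limit


timing = CompressionTiming(
    initial_running_time=100.0,
    generation_time=5.0,
    compressed_running_time=40.0,
)
```

With these illustrative numbers the total running time is 45.0, so a limit of 60.0 is met and a limit of 30.0 is not. Read this way, compression pays off only when generating W′ and running the application on it together cost less than running the application on W directly.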
28. A computer readable storage medium having executable instructions stored on the medium, the execution of the instruction on a computer, perform the following steps:
- providing a distance function in a database management system for a pair of statements within an initial workload W;
- finding a compressed workload W′ based on the distance function;
- establishing an initial running time of an application on the initial workload W;
- establishing a total running time, the total running time being a sum of a running time of the application on the compressed workload W′ plus a time to generate the compressed workload W′; and
- establishing a limit on the total running time, the limit on the total running time being determined by an estimation of the total time running time; and
- outputting the generated compressed workload W′.
View Dependent Claims (29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43)
Specification