Materialized samples for a business warehouse query
Abstract
A system and method for improving a query on a relational database in a business intelligence system is provided. A multidimensional data table is provided in the database. A sampling dimension is appended to the data table. The sampling dimension includes a number of cells, each of which includes a sampling value corresponding to the data of at least one of the dimensions of the data table. The data table is then clustered in at least one of the dimensions based on the associated sampling value in the sampling dimension. A query for a subset of data can then be executed on the clustered data table based on the sampling values.
40 Citations
20 Claims
1. A computer-implemented method of improving a query of a relational database, the method comprising:
- providing a table T stored in the relational database, the table T having X number of columns and Y number of rows;
- generating a sampling column having Y cells, wherein each cell includes a sampling value for each of the Y rows in the table T, the sampling value, stored on a storage medium, representative of whether a query of the relational database will sample a corresponding one of the Y rows, the sample value of each of the Y rows comprising a first value when the row is not part of a sample, a second value when the row is part of a first sample, and a third value when the row is part of a second sample;
- appending the sampling column to the table T stored in the relational database, the table including the appended sampling column to enable the query; and
- providing, when a row is added to the table T, one of the first value, the second value, or the third value, to the row in the sampling column.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
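The steps of claim 1 can be illustrated with a minimal sketch, assuming an in-memory SQLite database accessed through Python's standard `sqlite3` module. The integer encoding (0, 1, 2), the column name `sampling`, the demo columns `region` and `amount`, the sampling probabilities, and all function names are assumptions made for illustration; the claim itself does not specify them.

```python
import random
import sqlite3

# Hypothetical encoding of the three claimed values.
NOT_SAMPLED, FIRST_SAMPLE, SECOND_SAMPLE = 0, 1, 2


def draw_sampling_value(rng, p_first=0.01, p_second=0.10):
    """Pick one of the three values; the probabilities are illustrative only."""
    u = rng.random()
    if u < p_first:
        return FIRST_SAMPLE
    if u < p_first + p_second:
        return SECOND_SAMPLE
    return NOT_SAMPLED


def append_sampling_column(conn, rng):
    """Append a sampling column to table T and fill one cell per existing row."""
    conn.execute("ALTER TABLE T ADD COLUMN sampling INTEGER")
    rowids = [r[0] for r in conn.execute("SELECT rowid FROM T")]
    conn.executemany(
        "UPDATE T SET sampling = ? WHERE rowid = ?",
        [(draw_sampling_value(rng), rid) for rid in rowids],
    )


def add_row(conn, rng, region, amount):
    """Insert a new row and give it a sampling value at insertion time."""
    conn.execute(
        "INSERT INTO T (region, amount, sampling) VALUES (?, ?, ?)",
        (region, amount, draw_sampling_value(rng)),
    )


def build_demo(n_rows=1000, seed=0):
    """Create a demo table T, append the sampling column, then add one row."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO T VALUES (?, ?)",
        [(f"r{i % 5}", float(i)) for i in range(n_rows)],
    )
    append_sampling_column(conn, rng)
    add_row(conn, rng, "r0", 42.0)
    return conn
```

With such a column in place, a query restricted to the first sample could be written as `SELECT AVG(amount) FROM T WHERE sampling = 1`, so only the rows flagged for that sample are considered.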
11. A system for optimizing a query on a relational database, the system comprising:
- a table T stored in the relational database on a storage medium, the table T having X number of columns and Y number of rows; and
- a sample generator configured to generate a sampling column having Y cells, wherein each cell includes a sampling value, stored in the storage medium, for each of the Y rows in the table T, the sampling value representative of whether a query of the relational database will sample a corresponding one of the Y rows, and further configured to append the sampling column to the table T stored in the relational database, the table including the appended sampling column to enable the query, the sample value of each of the Y rows comprising a first value when the row is not part of a sample, a second value when the row is part of a first sample, and a third value when the row is part of a second sample.
Dependent claims: 12, 13, 14, 15
16. A computer-implemented method of executing a query on a relational database, the method comprising:
- providing a multidimensional data table stored in the database;
- appending a sampling dimension to the data table, wherein the sampling dimension includes a number of cells, and wherein each cell includes a sampling value, stored in a storage medium corresponding to the data of at least one of the dimensions of the data table, the sample value of each of the cells comprises a first value when the row is not part of a sample, a second value when the cell is part of a first sample, and a third value when the cell is part of a second sample;
- clustering the data table in at least one of the dimensions based on the associated sampling value in the sampling dimension;
- receiving a query for a subset of data stored in the database; and
- executing the query on the clustered data table based on the sampling values corresponding to the subset of data stored in the database.
Dependent claims: 17, 18, 19, 20
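The clustering and query steps of claim 16 can be sketched as follows. This is a simplified in-memory illustration, not the patented implementation: clustering is modeled as sorting rows by their sampling value so that each sample occupies one contiguous block, whereas a real warehouse would more likely rely on a clustered index or partitioning. The tuple layout, the value encoding, the sampling weights, and all function names are assumptions.

```python
import random
from bisect import bisect_left, bisect_right

# Hypothetical encoding of the three claimed values.
NOT_SAMPLED, FIRST_SAMPLE, SECOND_SAMPLE = 0, 1, 2
ALL_VALUES = (NOT_SAMPLED, FIRST_SAMPLE, SECOND_SAMPLE)


def make_rows(n, seed=0):
    """Build (sampling, region, amount) tuples; the weights are illustrative."""
    rng = random.Random(seed)
    weights = [0.89, 0.01, 0.10]
    return [
        (rng.choices(ALL_VALUES, weights)[0], f"r{i % 5}", float(i))
        for i in range(n)
    ]


def cluster_by_sampling(rows):
    """Order rows so equal sampling values are contiguous; record block bounds."""
    clustered = sorted(rows, key=lambda row: row[0])
    keys = [row[0] for row in clustered]
    blocks = {v: (bisect_left(keys, v), bisect_right(keys, v)) for v in ALL_VALUES}
    return clustered, blocks


def mean_amount(clustered, blocks, sampling_value, region):
    """Answer a subset query by scanning only the block of the chosen sample."""
    start, end = blocks[sampling_value]
    amounts = [amt for _, reg, amt in clustered[start:end] if reg == region]
    return sum(amounts) / len(amounts) if amounts else None
```

Because the rows of a sample sit next to each other after clustering, a query such as `mean_amount(clustered, blocks, SECOND_SAMPLE, "r0")` touches only that block instead of scanning the whole table.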
Specification