Automatic vertical-database design
First Claim
1. A computer-implemented method comprising:
- for each of multiple candidate sort orders of a table having rows and columns, determining a cost based on candidate training queries that have execution costs that satisfy an improvement criterion;
based on the respective costs of the candidate sort orders, selecting sort orders at least one of which satisfies all of the candidate training queries;
for each of the selected sort orders, adding, to a design of a database, an included projection, wherein a projection is a subset of the columns of the table sorted according to a same order, the columns of the projection to be physically stored in accordance with the selected sort order; and
each of the multiple candidate sort orders from which the sort orders are selected defining an order with respect to a putative projection comprising a view of one or more columns that are not necessarily identical to the columns of the included projection.
9 Assignments
0 Petitions
Accused Products
Abstract
An automatic physical-layout designer for a database-management system determines the database'"'"'s physical layout from a set of training queries, the database'"'"'s logical design, and a parameter k that indicates how many storage nodes can be lost without losing access to any of the data. The designer lays the database out as a column store such that the stored columns constitute redundant projections on the system'"'"'s different storage nodes. It repeatedly identifies a projection, whose addition to the design will result in the greatest performance improvement for the training queries. In doing so, it takes into account the different compression formats to which the different projections lend themselves. When a projection has been identified as one to be added, it is added on one node, and k projections having the same columns are added to other nodes. The designer continues thus adding projections until a space budget has been reached.
36 Citations
30 Claims
-
1. A computer-implemented method comprising:
-
for each of multiple candidate sort orders of a table having rows and columns, determining a cost based on candidate training queries that have execution costs that satisfy an improvement criterion; based on the respective costs of the candidate sort orders, selecting sort orders at least one of which satisfies all of the candidate training queries; for each of the selected sort orders, adding, to a design of a database, an included projection, wherein a projection is a subset of the columns of the table sorted according to a same order, the columns of the projection to be physically stored in accordance with the selected sort order; and each of the multiple candidate sort orders from which the sort orders are selected defining an order with respect to a putative projection comprising a view of one or more columns that are not necessarily identical to the columns of the included projection. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
-
-
11. A non-transitory computer-readable medium containing instructions operable to cause a computer system to perform operations comprising:
-
for each of multiple candidate sort of a table having rows and columns, determining a cost based on candidate training queries that have execution costs that satisfy an improvement criterion; based on the respective costs of the candidate sort orders, selecting sort orders at least one of which satisfies all of the candidate training queries; and in addition to any indexes for the database, for each of the selected sort orders, adding, to a design of a database, an included projection, wherein a projection is a subset of the columns of the table sorted according to a same order, the columns of the projection to be physically stored in accordance with the selected sort order; and each of the multiple candidate sort orders from which the sort orders are selected defining an order with respect to a putative projection comprising a view of one or more columns that are not necessarily identical to the columns of the included projection. - View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)
-
-
21. A system comprising:
-
a computer-readable medium containing instructions; and one or more processors operable to execute the instructions to perform operations comprising; for each of multiple candidate sort orders of a table having rows and columns, determining a cost based on candidate training queries that have execution costs that satisfy an improvement criterion; based on the respective costs of the candidate sort orders, selecting sort orders at least one of which satisfies all of the candidate training queries; and for each of the selected sort orders, adding, to a design of a database, an included projection, wherein a projection is a subset of the columns of the table sorted according to a same order, the columns of the projection to be physically stored in accordance with the selected sort order; and each of the multiple candidate sort orders from which the sort orders are selected defining an order with respect to a putative projection comprising a view of one or more columns that are not necessarily identical to the columns of the included projection. - View Dependent Claims (22, 23, 24, 25, 26, 27, 28, 29, 30)
-
Specification