Systems and methods for optimizing database queries
First Claim
1. A method of executing a query on a relational database, the method comprising:
- receiving, by a computing device, a plurality of queries, the plurality of queries representing a request to segment data representing a general population into a plurality of segments, wherein the data representing the general population is stored in one or more consumer data records derived from one or more base tables of the relational database;
determining, by the computing device, that the plurality of segments are mutually exclusive;
generating, by the computing device, a consolidated query based on the plurality of queries, the consolidated query comprising instructions that segments the general population into the same plurality of segments as the plurality of queries using one read of the consumer data records; and
,executing, by the computing device, the consolidated query on the consumer data records;
generating, by the computing device, an output data table comprising a record identifier column and a segmentation identifier column;
generating, by the computing device, a record identifier value and a segmentation identifier value for each consumer data record;
storing the record identifier value in the record identifier column of the output data table; and
storing the segmentation identifier value in the segmentation identifier column of the output data table.
1 Assignment
0 Petitions
Accused Products
Abstract
A relational database warehouse system with query optimization capabilities is described that allows for speedy identification of sets of records of interest from amongst tens of millions of records. The records of interest may include complex derived attributes, generated, at least in part, by aggregating data from a plurality of records in base data tables. In various embodiments, the query optimization capabilities allow the database warehouse system to identify conditions under which normal query execution may be replaced by one or more optimized execution methods, including, for example, eliminating unnecessary inner join operations on base data tables specified by a query, re-ordering the execution of group-by operations and left-outer join operations to greatly reduce the size of join tables produced while processing a query, and/or consolidating a set of segmentation queries for execution in one pass over the records of the database.
128 Citations
8 Claims
-
1. A method of executing a query on a relational database, the method comprising:
-
receiving, by a computing device, a plurality of queries, the plurality of queries representing a request to segment data representing a general population into a plurality of segments, wherein the data representing the general population is stored in one or more consumer data records derived from one or more base tables of the relational database; determining, by the computing device, that the plurality of segments are mutually exclusive; generating, by the computing device, a consolidated query based on the plurality of queries, the consolidated query comprising instructions that segments the general population into the same plurality of segments as the plurality of queries using one read of the consumer data records; and
,executing, by the computing device, the consolidated query on the consumer data records; generating, by the computing device, an output data table comprising a record identifier column and a segmentation identifier column; generating, by the computing device, a record identifier value and a segmentation identifier value for each consumer data record; storing the record identifier value in the record identifier column of the output data table; and storing the segmentation identifier value in the segmentation identifier column of the output data table. - View Dependent Claims (2, 3, 4)
-
-
5. A database warehouse system with query optimization capabilities, the system comprising:
-
a computer-based storage system for general population data managed by a database management system that stores the general population data logically as a plurality of base data tables; a plurality of virtual tables comprising calculated data values based, at least in part, on the general population data stored in the base data tables, the plurality of virtual tables including a virtual attribute table, the virtual attribute table comprising virtual attribute data records, the virtual attribute data records comprising attribute data values that are based, at least in part on the calculated data values; a query management system configured to receive queries and, in conjunction with the database management system, to use the virtual attribute table in executing the queries, wherein the query management system and the database management system comprise code that collectively; receives a plurality of queries representing a request to segment the general population into a plurality of segments; determines that the plurality of segments represent mutually exclusive segments of the general population; generates a consolidated query based on the plurality of queries, the consolidated query comprising instructions that when executed segment the general population in the same plurality of segments as the plurality of queries using one read of the virtual attribute data records of the virtual attribute table; executes the consolidated query on the virtual attribute table; generates an output data table comprising a record identifier column and a segmentation identifier column; generates a record identifier value and a segmentation identifier value for each consumer data record; stores the record identifier value in the record identifier column of the output data table; and stores the segmentation identifier value in the segmentation identifier column of the output data table. - View Dependent Claims (6, 7, 8)
-
Specification