Query and exadata support for hybrid columnar compressed data
First Claim
1. A method comprising:
- a storage system comprising a processor and memory storing rows of database tables in a plurality of data blocks in non-volatile storage, each data block of said plurality of data blocks storing one or more columns of one or more rows of a database table of said database tables, said database tables being managed by a database management system (“DBMS”) configured to return data from said database tables in response to database queries issued to the DBMS by clients of the DBMS;
wherein said storage system is configured to return, in entirety, requested data blocks to said DBMS in response to a request made via a network by the DBMS, wherein the request identifies the requested data blocks;
said storage system receiving a particular request via said network for particular data blocks filtered according to particular one or more column criteria, said particular request identifying said particular data blocks and said particular one or more column criteria; and
in response to said storage system receiving said particular request for the particular data blocks;
said storage system scanning said particular data blocks for particular rows having column values that satisfy said particular one or more column criteria;
said storage system storing in a return buffer the particular rows having column values that satisfy said particular one or more column criteria; and
said storage system returning the particular rows stored in said return buffer to said DBMS;
wherein scanning said particular data blocks for particular rows having column values that satisfy said particular one or more column criteria includes, for a particular subset of said particular data blocks;
retrieving data values only for a first column identified in a first criterion of said particular one or more column criteria;
evaluating the first criterion based on data values in the first column;
storing, at a position within a first vector, a criteria satisfaction value corresponding to a row of the particular rows that indicates that a data value in the first column of the row that satisfies the first criterion;
determining to evaluate a second criterion;
storing, at a position within a second vector, a second criterion satisfaction value corresponding to a row of the particular rows that indicates that a data value in a second column of the row that satisfies the second criterion;
wherein the second criterion identifies the second column;
performing an operation on the first vector and the second vector and storing results of the operation in a cumulative vector.
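The scanning steps recited above (read one column, record a satisfaction value per row in a vector, then combine vectors into a cumulative vector) can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `evaluate_criterion` and `scan_columns`, the dict-of-lists column layout, the use of 0/1 integers as satisfaction values, and the early exit that stands in for "determining to evaluate a second criterion" are all assumptions made for illustration.

```python
from operator import and_, or_


def evaluate_criterion(column_values, predicate):
    """Build one vector for one column: 1 where the value satisfies the predicate."""
    return [1 if predicate(value) else 0 for value in column_values]


def scan_columns(columns, criteria, combine=and_):
    """Evaluate criteria column by column and fold the vectors together.

    columns  -- hypothetical columnar layout: column name -> list of values
    criteria -- non-empty list of (column name, predicate) pairs
    combine  -- operation applied to two vectors element-wise (AND by default)
    """
    cumulative = None
    for name, predicate in criteria:
        # Only the column named in the criterion is read.
        vector = evaluate_criterion(columns[name], predicate)
        if cumulative is None:
            cumulative = vector
        else:
            cumulative = [combine(a, b) for a, b in zip(cumulative, vector)]
        # Assumed shortcut: under AND, once no row can qualify there is
        # no reason to read the columns of the remaining criteria.
        if combine is and_ and not any(cumulative):
            break
    return cumulative


# Illustrative data only; "note" is never read because no criterion names it.
columns = {
    "region": ["EU", "US", "EU", "APAC"],
    "amount": [120, 80, 40, 300],
    "note": ["a", "b", "c", "d"],
}
criteria = [
    ("region", lambda v: v == "EU"),
    ("amount", lambda v: v > 100),
]

cumulative = scan_columns(columns, criteria)
matching_rows = [
    {name: values[i] for name, values in columns.items()}
    for i, bit in enumerate(cumulative)
    if bit
]
```

Only rows whose position in the cumulative vector holds 1 are materialized, which mirrors the idea of copying just the qualifying rows into a return buffer.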
Abstract
A method and apparatus are provided for optimizing queries received by a database system that relies on an intelligent data storage server to manage storage for the database system. Storing compression units in hybrid columnar format, the storage manager evaluates simple predicates and only returns data blocks containing rows that satisfy those predicates. The returned data blocks are not necessarily stored persistently on disk. That is, the storage manager is not limited to returning disk block images. The hybrid columnar format enables optimizations that provide better performance when processing typical database workloads including both fetching rows by identifier and performing table scans.
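As a rough illustration of the two request styles the abstract contrasts, the sketch below models a storage server that can either return whole blocks or evaluate simple predicates itself and return only qualifying rows. The class `StorageServer`, its method names, and the row-dict block layout are hypothetical simplifications, not the interface of any real product.

```python
class StorageServer:
    """Toy storage server holding blocks as lists of row dicts (an assumed layout)."""

    def __init__(self, blocks):
        self.blocks = blocks  # block id -> list of rows

    def read_blocks(self, block_ids):
        """Plain request: return each requested block in its entirety."""
        return [self.blocks[block_id] for block_id in block_ids]

    def filtered_read(self, block_ids, criteria):
        """Filtered request: scan the blocks and return only qualifying rows.

        criteria -- list of (column name, predicate) pairs, all of which must hold
        """
        return_buffer = []
        for block_id in block_ids:
            for row in self.blocks[block_id]:
                if all(predicate(row[column]) for column, predicate in criteria):
                    return_buffer.append(row)
        # The result is assembled in memory; it is not an image of any stored block.
        return return_buffer


# Illustrative data only.
server = StorageServer({
    1: [{"id": 1, "region": "EU", "amount": 120},
        {"id": 2, "region": "US", "amount": 80}],
    2: [{"id": 3, "region": "EU", "amount": 40},
        {"id": 4, "region": "EU", "amount": 300}],
})
criteria = [("region", lambda v: v == "EU"), ("amount", lambda v: v > 100)]

whole_blocks = server.read_blocks([1, 2])          # four rows cross the network
filtered = server.filtered_read([1, 2], criteria)  # two rows cross the network
```

In this toy case the filtered request ships half as many rows to the database server, which is the kind of saving the abstract attributes to evaluating predicates in the storage layer.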
24 Claims
1. A method comprising the elements recited in full under First Claim above.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 17, 19, 20, 21, 22, 23, 24)
9. A non-transitory computer-readable storage medium storing instructions which when executed by one or more processors cause:
a storage system comprising a processor and memory storing rows of database tables in a plurality of data blocks in non-volatile storage, each data block of said plurality of data blocks storing one or more columns of one or more rows of a database table of said database tables, said database tables being managed by a database management system (“DBMS”) configured to return data from said database tables in response to database queries issued to the DBMS by clients of the DBMS;
wherein said storage system is configured to return, in entirety, requested data blocks to said DBMS in response to a request made via a network by the DBMS, wherein the request identifies the requested data blocks;
said storage system receiving a particular request via said network for particular data blocks filtered according to particular one or more column criteria, said particular request identifying said particular data blocks and said particular one or more column criteria; and
in response to said storage system receiving said particular request for the particular data blocks;
said storage system scanning said particular data blocks for particular rows having column values that satisfy said particular one or more column criteria;
said storage system storing in a return buffer the particular rows having column values that satisfy said particular one or more column criteria; and
said storage system returning the particular rows stored in said return buffer to said DBMS;
wherein scanning said particular data blocks for particular rows having column values that satisfy said particular one or more column criteria includes, for a particular subset of said particular data blocks;
retrieving data values only for a first column identified in a first criterion of said particular one or more column criteria;
evaluating the first criterion based on data values in the first column;
storing, at a position within a first vector, a criteria satisfaction value corresponding to a row of the particular rows that indicates that a data value in the first column of the row that satisfies the first criterion;
determining to evaluate a second criterion;
storing, at a position within a second vector, a second criterion satisfaction value corresponding to a row of the particular rows that indicates that a data value in a second column of the row that satisfies the second criterion;
wherein the second criterion identifies the second column;
performing an operation on the first vector and the second vector and storing results of the operation in a cumulative vector.
- View Dependent Claims (10, 11, 12, 13, 14, 15, 16, 18)
Specification