Query and Exadata Support for Hybrid Columnar Compressed Data
Abstract
A method and apparatus are provided for optimizing queries received by a database system that relies on an intelligent data storage server to manage storage for the database system. Storing compression units in hybrid columnar format, the storage manager evaluates simple predicates and returns only data blocks containing rows that satisfy those predicates. The returned data blocks are not necessarily stored persistently on disk. That is, the storage manager is not limited to returning disk block images. The hybrid columnar format enables optimizations that provide better performance when processing typical database workloads, including both fetching rows by identifier and performing table scans.
15 Claims
1. A method comprising:
- a storage system storing rows of a database table in one or more persistently stored compression units, wherein each compression unit of the one or more persistently stored compression units stores, in hybrid-row-columnar format, a different subset of said rows;
- said storage system receiving a request to return compression units, wherein the request includes one or more criteria and the requested compression units contain rows with columns having values that satisfy said one or more criteria;
- said storage system scanning said persistently stored compression units for rows having column values that satisfy said criteria;
- said storage system returning in returned compression units' rows having column values that satisfy said criteria, wherein at least one returned compression unit of said returned compression units includes a first row and a second row, wherein the first row and the second row are stored in different persistently stored compression units.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
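The scan-and-repack behavior recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `CompressionUnit` class, the `scan_and_repack` function, and the `rows_per_unit` parameter are hypothetical names introduced here, compression itself is omitted, and a compression unit is modeled simply as a set of per-column value lists. The point it shows is that a returned unit is built from matching rows and may therefore combine rows that were stored in different persistent units.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CompressionUnit:
    """Hypothetical stand-in for a compression unit: column name -> values."""
    columns: Dict[str, List[object]]

    def num_rows(self) -> int:
        # All columns hold one value per row, so any column gives the count.
        return len(next(iter(self.columns.values()), []))

    def row(self, i: int) -> Dict[str, object]:
        # Reassemble row i from its per-column values.
        return {name: values[i] for name, values in self.columns.items()}


def scan_and_repack(
    stored: List[CompressionUnit],
    predicate: Callable[[Dict[str, object]], bool],
    rows_per_unit: int,
) -> List[CompressionUnit]:
    """Scan stored units, keep rows satisfying the predicate, and pack the
    survivors into new units that need not match any stored unit."""
    matching: List[Dict[str, object]] = []
    for unit in stored:
        for i in range(unit.num_rows()):
            candidate = unit.row(i)
            if predicate(candidate):
                matching.append(candidate)

    returned: List[CompressionUnit] = []
    for start in range(0, len(matching), rows_per_unit):
        chunk = matching[start:start + rows_per_unit]
        names = list(chunk[0].keys())
        # Store the chunk column by column again.
        returned.append(
            CompressionUnit({n: [r[n] for r in chunk] for n in names})
        )
    return returned


# Usage: two stored units, one matching row in each.
stored_units = [
    CompressionUnit({"id": [1, 2, 3], "amount": [10, 200, 30]}),
    CompressionUnit({"id": [4, 5], "amount": [500, 5]}),
]
result = scan_and_repack(stored_units, lambda r: r["amount"] > 100, 4)
```

In this usage the single returned unit holds the rows with `id` 2 and `id` 4, which came from different stored units, mirroring the "first row and second row" condition in the claim.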
12. A method comprising:
- in response to a request for one or more compression units stored in a data storage system, a database system receiving at least one compression unit containing a plurality of rows, wherein said plurality of rows are rows of a database table stored in hybrid-row-columnar format;
- wherein said at least one compression unit includes a first row and a second row, wherein a first persistently stored compression unit that stores the first row is different from a second persistently stored compression unit that stores second row;
- rewriting data stored in each compression unit of said at least one compression unit into row-major format, wherein rewriting further comprises:
  - determining a number of columns for said each compression unit;
  - determining a number of rows based on said number of columns and a processor cache size;
  - iteratively performing until all rows returned in said each compression unit are read:
    - reading said number of rows from said each compression unit stored in column-major order and storing said number of rows into a matrix;
    - reading said number of rows from said matrix and writing said number of rows into a return buffer in row-major order; and
- wherein the method is performed by one or more computing devices.
Dependent claims: 13, 14, 15
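The rewriting steps of claim 12 can be sketched as a blocked column-major to row-major conversion. The following Python is only an illustration under stated assumptions: the function names `rows_per_batch` and `rewrite_row_major` are hypothetical, every value is assumed to occupy a fixed `value_bytes` width, and the batch size is chosen so that one rows-by-columns matrix fits in the given cache size. The claim states only that the row count is based on the number of columns and the processor cache size; the exact formula here is an assumption.

```python
from typing import List, Sequence


def rows_per_batch(num_columns: int, cache_bytes: int, value_bytes: int = 8) -> int:
    """Assumed sizing rule: the largest row count whose matrix of
    fixed-width values fits in cache_bytes, and at least one row."""
    return max(1, cache_bytes // (num_columns * value_bytes))


def rewrite_row_major(
    columns: Sequence[Sequence[object]],
    cache_bytes: int,
    value_bytes: int = 8,
) -> List[object]:
    """Convert column-major data to a flat row-major return buffer,
    one cache-sized batch of rows at a time."""
    num_columns = len(columns)
    if num_columns == 0:
        return []
    total_rows = len(columns[0])
    batch = rows_per_batch(num_columns, cache_bytes, value_bytes)

    return_buffer: List[object] = []
    for start in range(0, total_rows, batch):
        stop = min(start + batch, total_rows)
        # Step 1: read the batch column by column into a small matrix.
        matrix = [[None] * num_columns for _ in range(stop - start)]
        for c, column in enumerate(columns):
            for r in range(start, stop):
                matrix[r - start][c] = column[r]
        # Step 2: read the matrix row by row into the return buffer.
        for row in matrix:
            return_buffer.extend(row)
    return return_buffer


# Usage: two columns, three rows, a cache budget that holds two rows per batch.
buffer = rewrite_row_major([[1, 2, 3], ["a", "b", "c"]], cache_bytes=32)
```

Bounding the batch by cache size keeps both the sequential column reads and the row-wise writes within a working set that stays cache-resident, which is the motivation the claim's sizing step suggests.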
Specification