Query and Exadata support for hybrid columnar compressed data
Abstract
A method and apparatus are provided for optimizing queries received by a database system that relies on an intelligent data storage server to manage storage for the database system. Storing compression units in hybrid columnar format, the storage manager evaluates simple predicates and returns only data blocks containing rows that satisfy those predicates. The returned data blocks are not necessarily stored persistently on disk; that is, the storage manager is not limited to returning disk block images. The hybrid columnar format enables optimizations that provide better performance when processing typical database workloads, including both fetching rows by identifier and performing table scans.
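As an illustration of the idea summarized in the abstract, the following is a minimal Python sketch, not the patented implementation. It groups rows into compression units that hold their values column by column, then filters on the storage side so that only units containing a matching value are returned. The names `CompressionUnit`, `build_units`, `fetch_by_row_id`, and `scan_units` are hypothetical, row identifiers are assumed to be sequential integers, and no actual compression is applied.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Sequence


@dataclass
class CompressionUnit:
    """Hypothetical stand-in for one compression unit.

    The unit holds a set of rows but stores them column-major:
    all values of one column for those rows sit together.
    """
    columns: Dict[str, List[Any]]  # column name -> values for this unit's rows

    def fetch_row(self, offset: int) -> Dict[str, Any]:
        # A row fetch reads one position from every column of a single unit.
        return {name: values[offset] for name, values in self.columns.items()}


def build_units(column_names: Sequence[str],
                rows: Sequence[Sequence[Any]],
                rows_per_unit: int) -> List[CompressionUnit]:
    """Split the table into sets of rows and store each set column-major."""
    units = []
    for start in range(0, len(rows), rows_per_unit):
        chunk = rows[start:start + rows_per_unit]
        units.append(CompressionUnit(
            {name: [row[i] for row in chunk]
             for i, name in enumerate(column_names)}
        ))
    return units


def fetch_by_row_id(units: List[CompressionUnit],
                    rows_per_unit: int,
                    row_id: int) -> Dict[str, Any]:
    """Assumed addressing scheme: row_id maps to (unit index, offset in unit)."""
    unit_index, offset = divmod(row_id, rows_per_unit)
    return units[unit_index].fetch_row(offset)


def scan_units(units: List[CompressionUnit],
               column: str,
               predicate: Callable[[Any], bool]) -> List[CompressionUnit]:
    """Storage-side filter: return only units where some value of
    `column` satisfies the predicate; other units are never shipped."""
    return [u for u in units if any(predicate(v) for v in u.columns[column])]


if __name__ == "__main__":
    table = [(1, 5), (2, 7), (3, 50), (4, 9), (5, 1)]
    units = build_units(["id", "price"], table, rows_per_unit=2)
    print(fetch_by_row_id(units, 2, 4))
    print(scan_units(units, "price", lambda v: v > 10))
```

Because a scan touches only the one column named in the predicate, while a fetch by identifier stays within a single unit, the sketch shows in miniature why the layout can serve both access patterns.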
83 Citations
51 Claims
1. A method comprising:
for each set of rows of multiple sets of rows in a database table, a storage system storing on a non-volatile storage, said each set of rows in a separate compression unit of a plurality of compression units, wherein said each set of rows comprises columns of the database table that are stored in column major format within said separate compression unit, wherein said plurality of compression units are structures stored on the non-volatile storage for storing rows in said database table;
the storage system receiving a request from a database server to return compression units that include a column having a value that satisfies one or more criteria specified in the request;
in response to said request:
the storage system scanning the plurality of compression units for rows having column values that satisfy the one or more criteria;
the storage system returning one or more compression units, wherein the one or more compression units have a column value that satisfies the one or more criteria;
wherein the method is performed by one or more computing devices.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 16, 33
12. A method comprising:
in response to a request for one or more compression units stored in a data storage system:
a database system receiving at least one compression unit containing a plurality of rows, wherein at least a portion of said plurality of rows comprise columns, wherein at least one column of the columns is stored in column-major format;
rewriting data stored in each compression unit of said at least one compression unit into row-major format, wherein rewriting further comprises:
for each compression unit of said at least one compression unit:
determining a number of columns for said each compression unit;
determining a number of rows based on said number of columns and a processor cache size;
iteratively performing until all rows returned in said each compression unit are read:
reading said number of rows from said each compression unit stored in column-major order and storing said number of rows into a matrix;
reading said number of rows from said matrix and writing said number of rows into a return buffer in row-major order; and
wherein the method is performed by one or more computing devices.
Dependent claims: 13, 14, 15
17. A computer-readable storage medium storing instructions which when executed by one or more processors cause performance of:
for each set of rows of multiple sets of rows in a database table, a storage system storing on a non-volatile storage, said each set of rows in a separate compression unit of a plurality of compression units, wherein said each set of rows comprises columns of the database table that are stored in column major format within said separate compression unit, wherein said plurality of compression units are structures stored on the non-volatile storage for storing rows in said database table;
the storage system receiving a request from a database server to return compression units that include a column having a value that satisfies one or more criteria specified in the request;
in response to said request:
the storage system scanning the plurality of compression units for rows having column values that satisfy the one or more criteria;
the storage system returning one or more compression units, wherein the one or more compression units have a column value that satisfies the one or more criteria.
Dependent claims: 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 34
29. A computer-readable storage medium storing instructions which, when executed, cause one or more processors to perform the steps of:
in response to a request for one or more compression units stored in a data storage system:
a database system receiving at least one compression unit containing a plurality of rows, wherein at least a portion of said plurality of rows comprise columns, wherein at least one column of the columns is stored in column-major format;
rewriting data stored in each compression unit of said at least one compression unit into row-major format, wherein rewriting further comprises:
for each compression unit of said at least one compression unit:
determining a number of columns for said each compression unit;
determining a number of rows based on said number of columns and a processor cache size;
iteratively performing until all rows returned in said each compression unit are read:
reading said number of rows from said each compression unit stored in column-major order and storing said number of rows into a matrix;
reading said number of rows from said matrix and writing said number of rows into a return buffer in row-major order.
Dependent claims: 30, 31, 32
35. A system comprising:
a database server; and
a storage system, comprising one or more non-volatile storage devices, configured to:
for each set of rows of multiple sets of rows in a database table, store said each set of rows in a separate compression unit of a plurality of compression units, wherein said each set of rows comprises columns of the database table that are stored in column major format within said separate compression unit, wherein said plurality of compression units are structures stored on the storage system for storing rows in said database table;
receive a request from the database server to return compression units that include a column having a value that satisfies one or more criteria specified in the request;
in response to said request:
scan the plurality of compression units for rows having column values that satisfy the one or more criteria;
return one or more compression units, wherein the one or more compression units have a column value that satisfies the one or more criteria.
Dependent claims: 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47
48. A system comprising:
a data storage system, comprising one or more non-volatile storage devices;
a database system configured to, in response to a request for one or more compression units stored in the data storage system:
receive at least one compression unit containing a plurality of rows, wherein at least a portion of said plurality of rows comprises columns, wherein at least one column of the columns is stored in column-major format;
rewrite data stored in each compression unit of said at least one compression unit into row-major format, wherein rewriting further comprises:
for each compression unit of said at least one compression unit:
determining a number of columns for said each compression unit;
determining a number of rows based on said number of columns and a processor cache size;
iteratively performing until all rows returned in said each compression unit are read:
reading said number of rows from said each compression unit stored in column-major order and storing said number of rows into a matrix;
reading said number of rows from said matrix and writing said number of rows into a return buffer in row-major order.
Dependent claims: 49, 50, 51
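The column-major to row-major rewrite recited in claims 12, 29, and 48 can be sketched in Python as below. This is an illustrative reading of the claimed steps, not the patented implementation: the function names are hypothetical, and the sizing rule (fixed-size values of `value_bytes` bytes, block chosen so that a rows-by-columns matrix fits within `cache_bytes`) is an assumption, since the claims only say the row count is based on the number of columns and a processor cache size.

```python
from typing import Any, List, Sequence


def rows_per_block(num_columns: int, cache_bytes: int,
                   value_bytes: int = 8) -> int:
    """Assumed sizing rule: how many rows of fixed-size values fit in cache."""
    return max(1, cache_bytes // (num_columns * value_bytes))


def rewrite_row_major(columns: Sequence[Sequence[Any]],
                      cache_bytes: int) -> List[Any]:
    """Rewrite one unit's column-major data into a flat row-major buffer."""
    num_columns = len(columns)          # determine the number of columns
    if num_columns == 0:
        return []
    total_rows = len(columns[0])
    block = rows_per_block(num_columns, cache_bytes)  # rows per iteration

    return_buffer: List[Any] = []
    # Iterate until every row in the unit has been read.
    for start in range(0, total_rows, block):
        stop = min(start + block, total_rows)
        # Step 1: read this block column by column into a small matrix.
        matrix = [[None] * num_columns for _ in range(stop - start)]
        for c, column in enumerate(columns):
            for r in range(start, stop):
                matrix[r - start][c] = column[r]
        # Step 2: read the matrix row by row and append to the return buffer.
        for row in matrix:
            return_buffer.extend(row)
    return return_buffer


if __name__ == "__main__":
    unit = [[1, 2, 3, 4, 5], ["a", "b", "c", "d", "e"]]
    print(rewrite_row_major(unit, cache_bytes=32))
```

Working in blocks sized to the cache keeps both the sequential column reads and the sequential row writes confined to a small matrix, rather than striding across the whole unit for every output row.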
Specification