Storing compression units in relational tables
First Claim
1. A method comprising:
- storing data for database tables in a storage device, the storage device comprising a memory organized into data blocks conforming to a data block format, the data block format including data block metadata and one or more data block rows;
storing uncompressed data for a first plurality of the database tables in a first set of the data blocks, each data block of the first set of the data blocks comprising a plurality of data block rows, each data block row of the plurality of data block rows storing data from only one table row from the first plurality of the database tables;
for a particular table that is not in the first plurality of the database tables, generating compressed data by compressing data from multiple table rows in the particular table, wherein said compressing comprises generating the compressed data and structuring the compressed data in a compression unit that does not conform to the data block format;
storing the compression unit in one or more data block rows of a second set of one or more of the data blocks, said storing comprising storing at least a portion of the compression unit in a particular data block row of a particular data block of the data blocks;
wherein the portion of the compression unit comprises compressed data from a plurality of the table rows from the particular table, wherein the particular data block row thus comprises compressed data from the plurality of the table rows;
wherein the method is performed by one or more computing devices.
1 Assignment
0 Petitions
Abstract
A database server stores compressed units in data blocks of a database. A table (or data from a plurality of rows thereof) is first compressed into a “compression unit” using any of a wide variety of compression techniques. The compression unit is then stored in one or more data block rows across one or more data blocks. As a result, a single data block row may comprise compressed data for a plurality of table rows, as encoded within the compression unit. Storage of compression units in data blocks maintains compatibility with existing data block-based databases, thus allowing the use of compression units in preexisting databases without modification to the underlying format of the database. The compression units may, for example, co-exist with uncompressed tables. Various techniques allow a database server to optimize access to data in the compression unit, so that the compression is virtually transparent to the user.
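The storage scheme in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `DataBlock` and `DataBlockRow` classes, the `is_compression_unit` flag, the column-major JSON layout, and the use of `zlib` stand in for whatever block format and compression technique an actual database server would use.

```python
import json
import zlib
from dataclasses import dataclass, field


@dataclass
class DataBlockRow:
    payload: bytes
    is_compression_unit: bool = False  # hypothetical flag, not from the patent


@dataclass
class DataBlock:
    metadata: dict = field(default_factory=dict)
    rows: list = field(default_factory=list)


def store_uncompressed(block, table_rows):
    """Conventional layout: one table row per data block row."""
    for table_row in table_rows:
        block.rows.append(DataBlockRow(json.dumps(table_row).encode()))


def build_compression_unit(table_rows):
    """Compress many table rows into one opaque unit (column-major here)."""
    columns = [list(col) for col in zip(*table_rows)]
    return zlib.compress(json.dumps(columns).encode())


def store_compression_unit(block, unit):
    """The whole unit occupies a single data block row in this sketch."""
    block.rows.append(DataBlockRow(unit, is_compression_unit=True))


plain = DataBlock()
store_uncompressed(plain, [(1, "ann"), (2, "bob")])

packed = DataBlock()
many_rows = [(i, "name%d" % i) for i in range(100)]
store_compression_unit(packed, build_compression_unit(many_rows))
```

In this sketch `plain` holds two data block rows for two table rows, while `packed` holds one data block row that carries compressed data for all 100 table rows. Both blocks share the same outer structure, which is the compatibility point the abstract makes.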
79 Citations
54 Claims
1. A method comprising:
storing data for database tables in a storage device, the storage device comprising a memory organized into data blocks conforming to a data block format, the data block format including data block metadata and one or more data block rows;
storing uncompressed data for a first plurality of the database tables in a first set of the data blocks, each data block of the first set of the data blocks comprising a plurality of data block rows, each data block row of the plurality of data block rows storing data from only one table row from the first plurality of the database tables;
for a particular table that is not in the first plurality of the database tables, generating compressed data by compressing data from multiple table rows in the particular table, wherein said compressing comprises generating the compressed data and structuring the compressed data in a compression unit that does not conform to the data block format;
storing the compression unit in one or more data block rows of a second set of one or more of the data blocks, said storing comprising storing at least a portion of the compression unit in a particular data block row of a particular data block of the data blocks;
wherein the portion of the compression unit comprises compressed data from a plurality of the table rows from the particular table, wherein the particular data block row thus comprises compressed data from the plurality of the table rows;
wherein the method is performed by one or more computing devices.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 27
16. A method comprising:
storing data for database tables in a storage device, the storage device comprising memory organized into data blocks conforming to a data block format, the data block format including data block metadata and one or more data block rows;
storing uncompressed data for a first plurality of the database tables in a first plurality of the data blocks, each data block of the first plurality of the data blocks comprising a plurality of data block rows, each data block row of the plurality of data block rows storing data from only one table row from the first plurality of the database tables;
storing, within a second plurality of data blocks of a database, compression units comprising compressed data from tables, the compression units comprising compressed data structured in a format that does not conform to the data block format;
determining that execution of a database request requires access to data from at least one or more table rows;
retrieving one or more data blocks to which the one or more table rows have been mapped;
determining whether the one or more retrieved data blocks store any of the table rows in one or more of the compression units; and
responsive to determining that the one or more retrieved data blocks store data for a particular table row of the one or more table rows in one or more particular compression units of the compression units;
based at least partially on information in the one or more retrieved data blocks, locating at least a portion of the compression unit in a data block row of the one or more retrieved data blocks, generating a decompressed portion of the compression unit, comprising data from a plurality of table rows, by decompressing the portion of the compression unit from the located data block row;
locating the data for the particular table row in the decompressed portion of the compression unit;
reading one or more items indicated by the database request from the data for the particular table row; and
executing the database request based at least on the one or more items that were read from the data for the particular table row;
wherein the method is performed by one or more computing devices.
Dependent claims: 17, 18, 19, 20, 21
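The read path recited in claim 16 can be sketched roughly as follows. This is an illustration only, not the claimed implementation: the dictionary-based block rows, the `is_unit` marker, the `row_map` lookup table, and the `fetch` function are all hypothetical names, and `zlib` with JSON is an arbitrary stand-in for the compression format.

```python
import json
import zlib


def make_plain_row(row_id, values):
    """Uncompressed data block row holding exactly one table row."""
    return {"is_unit": False, "payload": json.dumps({row_id: values}).encode()}


def make_unit_row(table_rows):
    """Data block row holding a compression unit with several table rows."""
    body = zlib.compress(json.dumps(table_rows).encode())
    return {"is_unit": True, "payload": body}


def fetch(blocks, row_map, row_id, wanted):
    """Return the requested items (column positions) of one table row."""
    block_no, slot = row_map[row_id]       # block the table row is mapped to
    block_row = blocks[block_no][slot]     # retrieve the block, locate the block row
    raw = block_row["payload"]
    if block_row["is_unit"]:               # is the table row stored in a unit?
        raw = zlib.decompress(raw)         # decompressed portion holds many table rows
    values = json.loads(raw)[str(row_id)]  # locate the particular table row
    return [values[i] for i in wanted]     # read only the requested items


blocks = {
    0: [make_plain_row(1, ["ann", 30])],
    1: [make_unit_row({2: ["bob", 41], 3: ["cy", 25]})],
}
row_map = {1: (0, 0), 2: (1, 0), 3: (1, 0)}

age_of_row_3 = fetch(blocks, row_map, 3, [1])
name_of_row_1 = fetch(blocks, row_map, 1, [0])
```

Table rows 2 and 3 map to the same data block row, so a request for either one decompresses the same portion and then picks out the row it needs. The caller of `fetch` does not have to know which storage form was used.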
22. A method comprising:
storing data for database tables in a storage device, the storage device comprising a memory organized into data blocks conforming to a data block format, the data block format including data block metadata and one or more data block rows;
compressing a particular database table within a particular compression unit, the compression unit comprising compressed data from the table structured in a format that is different than the data block format;
dividing the particular compression unit into portions;
based on the divided portions, storing the compression unit within a plurality of data blocks of a database, wherein the compression unit spans the plurality of data blocks, each data block of the plurality of data blocks comprising one or more different portions of the divided portions;
receiving a request whose execution requires access to first data from the particular database table;
determining that the database stores the first data in the particular compression unit;
determining that the first data is stored in a first set of one or more portions of the divided portions of the particular compression unit;
retrieving the first set of one or more portions from one or more data blocks of the plurality of data blocks;
decompressing the first set of one or more portions, thereby yielding one or more decompressed portions of the particular compression unit;
locating the first data in the decompressed portion of the particular compression unit;
executing the request based at least partially on one or more items from the first data;
wherein the method is performed by one or more computing devices.
Dependent claims: 23, 24, 25, 26
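A compression unit that spans several data blocks, as in claim 22, might look like the sketch below. It rests on one assumption that the claim does not state: the unit is built from independently compressed per-column chunks, and the division into portions falls on chunk boundaries, so a single portion can be decompressed without the others. The function names, the `directory` lookup, and the `portions_per_block` parameter are hypothetical.

```python
import json
import zlib


def compress_table(columns):
    """Build the unit as independently compressed per-column portions."""
    return {
        name: zlib.compress(json.dumps(values).encode())
        for name, values in columns.items()
    }


def store_spanning(unit, portions_per_block=1):
    """Spread the portions over several blocks; record where each one landed."""
    blocks, directory = [], {}
    for i, (name, portion) in enumerate(unit.items()):
        block_no = i // portions_per_block
        if block_no == len(blocks):
            blocks.append([])
        directory[name] = (block_no, len(blocks[block_no]))
        blocks[block_no].append(portion)
    return blocks, directory


def read_column(blocks, directory, name):
    """Retrieve and decompress only the portion that holds the requested data."""
    block_no, slot = directory[name]
    return json.loads(zlib.decompress(blocks[block_no][slot]))


table = {"id": [1, 2, 3], "city": ["a", "b", "c"], "qty": [5, 6, 7]}
blocks, directory = store_spanning(compress_table(table))
cities = read_column(blocks, directory, "city")
```

With the default of one portion per block, the unit spans three blocks, and a request that needs only `city` touches a single block and decompresses a single portion.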
28. One or more non-transitory computer-readable storage media storing instructions that, when executed by one or more computing devices, cause performance of:
storing data for database tables in a storage device, the storage device comprising a memory organized into data blocks conforming to a data block format, the data block format including data block metadata and one or more data block rows;
storing uncompressed data for a first plurality of the database tables in a first set of the data blocks, each data block of the first set of the data blocks comprising a plurality of data block rows, each data block row of the plurality of data block rows storing data from only one table row from the first plurality of the database tables;
for a particular table that is not in the first plurality of the database tables, generating compressed data by compressing data from multiple table rows in the particular table, wherein said compressing comprises generating the compressed data and structuring the compressed data in a compression unit that does not conform to the data block format;
storing the compression unit in one or more data block rows of a second set of one or more of the data blocks, said storing comprising storing at least a portion of the compression unit in a particular data block row of a particular data block of the data blocks;
wherein the portion of the compression unit comprises compressed data from a plurality of the table rows from the particular table, wherein the particular data block row thus comprises compressed data from the plurality of the table rows.
Dependent claims: 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 54
43. One or more non-transitory computer-readable storage media storing instructions that, when executed by one or more computing devices, cause performance of:
storing data for database tables in a storage device, the storage device comprising a memory organized into data blocks conforming to a data block format, the data block format including data block metadata and one or more data block rows;
storing uncompressed data for a first plurality of the database tables in a first plurality of the data blocks, each data block of the first plurality of the data blocks comprising a plurality of data block rows, each data block row of the plurality of data block rows storing data from only one table row from the first plurality of the database tables;
storing, within a second plurality of data blocks of a database, compression units comprising compressed data from tables, the compression units comprising compressed data structured in a format that does not conform to the data block format;
determining that execution of a database request requires access to data from at least one or more table rows;
retrieving one or more data blocks to which the one or more table rows have been mapped;
determining whether the one or more retrieved data blocks store any of the table rows in one or more of the compression units; and
responsive to determining that the one or more retrieved data blocks store data for a particular table row of the one or more table rows in one or more particular compression units of the compression units;
based at least partially on information in the one or more retrieved data blocks, locating at least a portion of the compression unit in a data block row of the one or more retrieved data blocks, generating a decompressed portion of the compression unit, comprising data from a plurality of table rows, by decompressing the portion of the compression unit from the located data block row;
locating the data for the particular table row in the decompressed portion of the compression unit;
reading one or more items indicated by the database request from the data for the particular table row; and
executing the database request based at least on the one or more items that were read from the data for the particular table row.
Dependent claims: 44, 45, 46, 47, 48
49. One or more non-transitory computer-readable storage media storing instructions that, when executed by one or more computing devices, cause performance of:
storing data for database tables in a storage device, the storage device comprising a memory organized into data blocks conforming to a data block format, the data block format including data block metadata and one or more data block rows;
compressing a particular database table within a particular compression unit, the compression unit comprising compressed data from the table structured in a format that is different than the data block format;
dividing the particular compression unit into portions;
based on the divided portions, storing the compression unit within a plurality of data blocks of a database, wherein the compression unit spans the plurality of data blocks, each data block of the plurality of data blocks comprising one or more different portions of the divided portions;
receiving a request whose execution requires access to first data from the particular database table;
determining that the database stores the first data in the particular compression unit;
determining that the first data is stored in a first set of one or more portions of the divided portions of the particular compression unit;
retrieving the first set of one or more portions from one or more data blocks of the plurality of data blocks;
decompressing the first set of one or more portions, thereby yielding one or more decompressed portions of the particular compression unit;
locating the first data in the decompressed portion of the particular compression unit;
executing the request based at least partially on one or more items from the first data.
Dependent claims: 50, 51, 52, 53
Specification