Method and mechanism for storing and accessing data
Abstract
A method and mechanism are disclosed for implementing storage and retrieval of data in a computing system. Data compression is performed on stored data by reducing or eliminating duplicate values in a database block. Duplicated values are eliminated within the set of data that is to be stored within a particular data storage unit. Rather than writing the duplicated data values to the data storage unit, the on-disk data is configured to reference a symbol table that holds a single copy of each duplicated data value. Column reordering may be performed in an embodiment to further improve compression efficiency. The column reordering may be performed to allow efficient removal of trailing NULL values from on-disk storage.
167 Citations
57 Claims
1. A method for storing data in a compressed format, comprising:
- receiving a request to store data onto one or more database blocks on disk;
- analyzing the data to determine the existence of a redundant data item;
- reordering the data to change the order of two or more columns within the data;
- creating a symbol structure to store the reordered redundant data item;
- formatting an on-disk data structure corresponding to the data; and
- associating the on-disk data structure with a reference to the redundant data item stored in the symbol structure, wherein the on-disk data structure does not explicitly include a copy of the redundant data item.
Dependent claims: 2-18

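The storing steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the `("ref", i)` / `("lit", v)` tagging, and the rule that a value is "redundant" when it occurs more than once in the block are all assumptions made for illustration.

```python
from collections import Counter

def compress_block(rows, column_order=None):
    """Compress one block of rows using a per-block symbol table.

    rows: list of equal-length tuples of hashable values (None stands for NULL).
    column_order: hypothetical list of column indexes giving the stored order.
    Returns (column_order, symbols, stored_rows).
    """
    ncols = len(rows[0]) if rows else 0
    if column_order is None:
        column_order = list(range(ncols))
    # Reorder the columns of every row.
    reordered = [tuple(row[i] for i in column_order) for row in rows]
    # Assumption: a value is redundant if it appears more than once in the block.
    counts = Counter(v for row in reordered for v in row if v is not None)
    symbols = [v for v, n in counts.items() if n > 1]
    index = {v: i for i, v in enumerate(symbols)}
    # Stored rows hold a reference instead of a copy of each redundant value.
    stored = [
        tuple(("ref", index[v]) if v in index else ("lit", v) for v in row)
        for row in reordered
    ]
    return column_order, symbols, stored

def decompress_block(column_order, symbols, stored):
    """Rebuild the original rows from the symbol table and stored rows."""
    rows = []
    for srow in stored:
        reordered = [symbols[p] if tag == "ref" else p for tag, p in srow]
        original = [None] * len(column_order)
        for pos, src in enumerate(column_order):
            original[src] = reordered[pos]
        rows.append(tuple(original))
    return rows
```

For example, compressing three rows that share the values "NY" and "pen" leaves one copy of each in `symbols`, and `decompress_block` returns the rows in their original column order.
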
19. A method for retrieving data stored in a compressed format, comprising:
- receiving a request to retrieve data from one or more database blocks on disk;
- analyzing the request to determine the one or more data blocks to access;
- analyzing the one or more database blocks to determine whether compression is being applied;
- retrieving the on-disk data structure corresponding to the data; and
- for a redundant data item referenced by the on-disk data structure, retrieving the redundant data item from a symbol structure, the symbol structure including one or more entries that reference another entry in the symbol structure.
Dependent claims: 20-26

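The retrieval steps of claim 19, including symbol entries that reference other symbol entries, might look like the sketch below. The block layout (a dict with `compressed`, `symbols`, and `rows` keys) and the `("sym", i)` / `("lit", v)` tags are hypothetical, and the sketch assumes the symbol structure contains no cyclic references.

```python
def resolve_symbol(symbols, idx):
    """Expand symbol entry idx into a flat list of values.

    Each entry is a list of ("lit", value) or ("sym", other_index) items,
    so an entry may reuse another entry instead of repeating its values.
    """
    values = []
    for tag, payload in symbols[idx]:
        if tag == "sym":
            values.extend(resolve_symbol(symbols, payload))
        else:
            values.append(payload)
    return values

def read_row(block, row_no):
    """Return the column values of one row from a (hypothetical) block dict."""
    stored = block["rows"][row_no]
    # Check whether compression is applied to this block.
    if not block["compressed"]:
        return list(stored)
    values = []
    for tag, payload in stored:
        if tag == "sym":
            # Follow the reference into the symbol structure.
            values.extend(resolve_symbol(block["symbols"], payload))
        else:
            values.append(payload)
    return values
```

Here symbol entry 1 could be defined as "entry 0 followed by the literal pen", so a row that references entry 1 expands to both values while the block stores "NY" only once.
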
27. A method for storing data, comprising:
- receiving a request to store data, the data comprising a plurality of columns;
- reordering the plurality of columns to increase the likelihood of trailing NULL values; and
- creating a stored version of the data, wherein the stored version of the data does not include the trailing NULL values.
Dependent claims: 28-30

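One way to picture the column reordering of claim 27 is sketched below. The heuristic used here, sorting columns so that those with the most NULLs come last, is an assumption chosen for illustration; the claim itself only requires a reordering that increases the likelihood of trailing NULL values. All function names are hypothetical, and `None` stands in for NULL.

```python
def order_by_null_count(rows):
    """Return column indexes sorted so the most-NULL columns come last."""
    ncols = len(rows[0]) if rows else 0
    nulls = [sum(1 for row in rows if row[c] is None) for c in range(ncols)]
    # sorted() is stable, so columns with equal NULL counts keep their order.
    return sorted(range(ncols), key=lambda c: nulls[c])

def store_rows(rows):
    """Reorder columns, then drop trailing NULLs from each stored row."""
    order = order_by_null_count(rows)
    stored = []
    for row in rows:
        values = [row[c] for c in order]
        while values and values[-1] is None:
            values.pop()
        stored.append(tuple(values))
    return order, stored

def load_rows(order, stored):
    """Pad the dropped NULLs back and restore the original column order."""
    rows = []
    for values in stored:
        padded = list(values) + [None] * (len(order) - len(values))
        original = [None] * len(order)
        for pos, src in enumerate(order):
            original[src] = padded[pos]
        rows.append(tuple(original))
    return rows
```

A NULL that is followed by a non-NULL value in the reordered row must still be stored; only the trailing run is omitted, which is why moving sparse columns to the end helps.
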
31. A structure for storing data on a database block, comprising:
- a symbol structure for storing a redundant data item on disk, the symbol structure including one or more entries that reference another entry in the symbol structure; and
- an on-disk data structure on disk comprising a reference to the redundant data item in the symbol structure and which does not store redundant data.
Dependent claims: 32-43

44. A computer program product comprising a computer usable medium having executable code to execute a method for storing data in a compressed format, the method comprising the steps of:
- receiving a request to store data onto one or more database blocks on disk;
- analyzing the data to determine the existence of a redundant data item;
- reordering the data to change the order of two or more columns within the data;
- creating a symbol structure to store the reordered redundant data item;
- formatting an on-disk data structure corresponding to the data; and
- associating the on-disk data structure with a reference to the redundant data item stored in the symbol structure, wherein the on-disk data structure does not explicitly include a copy of the redundant data item.
Dependent claims: 45-48

49. A system for storing data in a compressed format, comprising:
- means for receiving a request to store data onto one or more database blocks on disk;
- means for analyzing the data to determine the existence of a redundant data item;
- means for reordering the data to change the order of two or more columns within the data;
- means for creating a symbol structure to store the reordered redundant data item;
- means for formatting an on-disk data structure corresponding to the data; and
- means for associating the on-disk data structure with a reference to the redundant data item stored in the symbol structure, wherein the on-disk data structure does not explicitly include a copy of the redundant data item.

50. A computer program product comprising a computer usable medium having executable code to execute a method for retrieving data stored in a compressed format, the method comprising the steps of:
- receiving a request to retrieve data from one or more database blocks on disk;
- analyzing the request to determine the one or more data blocks to access;
- analyzing the one or more database blocks to determine whether compression is being applied;
- retrieving the on-disk data structure corresponding to the data; and
- for a redundant data item referenced by the on-disk data structure, retrieving the redundant data item from a symbol structure, the symbol structure including one or more entries that reference another entry in the symbol structure.
Dependent claims: 51-54

55. A system for retrieving data from a compressed format, comprising:
- means for receiving a request to retrieve data from one or more database blocks on disk;
- means for analyzing the request to determine the one or more data blocks to access;
- means for analyzing the one or more database blocks to determine whether compression is being applied;
- means for retrieving the on-disk data structure corresponding to the data; and
- means for, when a redundant data item is referenced by the on-disk data structure, retrieving the redundant data item from a symbol structure, the symbol structure including one or more entries that reference another entry in the symbol structure.

56. A computer program product comprising a computer usable medium having executable code to execute a method for storing data in a compressed format, the method comprising the steps of:
- receiving a request to store data, the data comprising a plurality of columns;
- reordering the plurality of columns to increase the likelihood of trailing NULL values; and
- creating a stored version of the data, wherein the stored version of the data does not include the trailing NULL values.
Dependent claims: 57

Specification