Floating point conversion for records of multidimensional database
Abstract
A method and system for compressing and decompressing read-only data in records that have a fixed size. A plurality of records is divided into segments having a predetermined size. For each segment, the records are arranged in a table with a row for each record and a column for each field in each record. The width of each column of repeated data is compressed to zero bits and the repeated data is referenced in a header of the segment. The width of each column of integer data is compressed to the minimum number of bits required to represent the largest integer value in the fields of the column. Floating point data in each column is converted to integer data and the width of each column with converted integer data is set to the minimum width necessary to represent the largest converted integer in that column. The conversion to integer data is calculated for floating point and real numbers with a minimum precision exponent that is stored in the header for the segment. Floating point data is cleaned when it is converted to integer data. The information in the header is employed to decompress the compressed records in the segment. The decompression lends itself well to fast random access of secondary storage devices.
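As an illustration of the float-to-integer conversion described in the abstract, the following Python sketch picks the smallest decimal exponent p for which every value in a column becomes a whole number when multiplied by 10^p. The base-10 scaling, the function names, the `max_exponent` cap, and the reliance on Python's shortest decimal representation of each float are assumptions made for this example; the abstract states only that a minimum precision exponent is stored in the segment header.

```python
from decimal import Decimal

# Assumption: precision is a base-10 exponent; the source does not specify the base.
def min_precision_exponent(values, max_exponent=9):
    """Return the smallest p such that value * 10**p is whole for every value."""
    for p in range(max_exponent + 1):
        scale = Decimal(10) ** p
        # repr() yields the shortest decimal string that round-trips the float.
        if all((Decimal(repr(v)) * scale) % 1 == 0 for v in values):
            return p
    raise ValueError("column needs more decimal digits than max_exponent allows")

def floats_to_ints(values):
    """Convert a column of floats to (precision exponent, list of integers)."""
    p = min_precision_exponent(values)
    scale = Decimal(10) ** p
    return p, [int(Decimal(repr(v)) * scale) for v in values]

def ints_to_floats(p, ints):
    """Undo floats_to_ints using the stored precision exponent."""
    return [float(Decimal(i).scaleb(-p)) for i in ints]
```

With a column such as `[19.99, 5.0, 0.5]`, the sketch stores the exponent 2 and the integers 1999, 500 and 50, which can then be bit-packed like any other integer column.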
20 Claims
1. A method for compressing data in a plurality of records in a data store, comprising the actions of:
(a) dividing the plurality of records into at least one segment, each segment including a predetermined number of records that are arranged in a table, each row of the table representing a separate record and each column representing a particular field in each record;
(b) for each column having floating point data associated with each field, converting the floating point data into integer data for each field in the column;
(c) for each column having the same data repeated in each field, setting the width for the column having repeated data equal to zero bits;
(d) for each column having integer data in each field, setting the width for each column equal to the minimum number of bits necessary to represent the largest integer value in the column; and
(e) including a header with the segment that indicates the predetermined number of records in the segment, the width of each column in the segment, the precision of the conversion from floating point data to integer data for each column having converted data, the repeated data for each column having a width set to zero bits, and the original width for each column such that the header of the segment can be employed to decompress the width of the columns and restore the original data in each field of a record that is accessed.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
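Steps (c) through (e) of claim 1 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the dictionary layout of the header, the field names, and the restriction to non-negative integers are all assumptions introduced for the example.

```python
def column_width(values):
    """Bits needed for one column of a segment.

    Zero bits when every field holds the same value (step c); otherwise the
    bit length of the largest integer in the column (step d).
    Assumption: values are non-negative integers and the column is non-empty.
    """
    if len(set(values)) == 1:
        return 0
    if min(values) < 0:
        raise ValueError("this sketch handles non-negative integers only")
    return max(values).bit_length()

def build_header(columns, original_widths, precisions):
    """Assemble a hypothetical segment header (step e).

    columns         -- list of columns, each a list of integers
    original_widths -- original width in bits of each column
    precisions      -- precision exponent per column, or None if not converted
    """
    header = {"record_count": len(columns[0]), "columns": []}
    for values, original, precision in zip(columns, original_widths, precisions):
        width = column_width(values)
        header["columns"].append({
            "width": width,
            "original_width": original,
            "precision": precision,
            # Repeated data lives in the header when the column takes zero bits.
            "repeated_value": values[0] if width == 0 else None,
        })
    return header
```

For the columns `[7, 7, 7]`, `[3, 200, 15]` and `[1999, 500, 50]`, the sketch assigns widths of 0, 8 and 11 bits, so each record occupies 19 bits instead of its original width.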
10. Apparatus for compressing a plurality of records in a data store, comprising:
(a) a load module for loading a plurality of records from the data store and dividing the loaded records into at least one segment, each segment including a predetermined number of records that are arranged in a table, each row representing a separate record and each column representing a particular field in each record;
(b) a compression module for compressing data in each column of the table of records, the compression module performing actions, including:
(i) for each column having floating point data associated with each field, converting the floating point data into integer data for each field in the column;
(ii) for each column having the same data repeated in each field, setting the width for the column having repeated data equal to zero bits;
(iii) for each column having integer data in each field, setting the width for each column equal to the minimum number of bits necessary to represent the largest integer value in the column; and
(iv) including a header with the segment that indicates the predetermined number of records in the segment, the width of each column in the segment, the precision of the conversion from floating point data to integer data for each column having converted data, the repeated data for each column having a width set to zero bits, and the original width for each column such that the header of the segment can be employed to decompress the width of the columns and restore the original data in each field of a record that is accessed.
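The load module of element (a) divides records into fixed-size segments and arranges each segment as a table. The Python sketch below shows one way to do this, returning each segment as a list of columns so that a compression step can examine one column at a time. The function name and the decision to let the final segment hold fewer records are assumptions; the claim only requires a predetermined number of records per segment.

```python
def split_into_segments(records, records_per_segment):
    """Yield each segment as a list of columns (one list per field).

    Assumption: every record has the same number of fields, and the last
    segment may be shorter when the record count is not an exact multiple.
    """
    for start in range(0, len(records), records_per_segment):
        rows = records[start:start + records_per_segment]
        # zip(*rows) transposes the rows of the table into columns.
        yield [list(column) for column in zip(*rows)]
```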
11. A computer-readable medium readable by a computing system and having instructions for executing a process for compressing a plurality of records, the process comprising the actions of:
(a) dividing a plurality of records into at least one segment, each segment including a predetermined number of records that are disposed in a table, each row representing a separate record and each column representing a particular field in each record;
(b) for each column having floating point data associated with each field, converting the floating point data into integer data for each field in the column;
(c) for each column having the same data repeated in each field, setting the width for the column having repeated data equal to zero bits;
(d) for each column having integer data in each field, setting the width for each column equal to the minimum number of bits necessary to represent the largest integer value in the column; and
(e) including a header with the segment that indicates the predetermined number of records in the segment, the width of each column in the segment, the precision of the conversion from floating point data to integer data for each column having converted data, the repeated data for each column having a width set to zero bits, and the original width for each column such that the header of the segment can be employed to decompress the width of the columns and restore the original data in each field of a record that is accessed.
Dependent claims: 12, 13, 14, 15
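Because every compressed record in a segment has the same bit length, a single record can be located and restored from the header information without decoding its neighbours. The Python sketch below illustrates this under stated assumptions: the segment is held in one arbitrary-precision integer, fields are packed least-significant-bit first, and the names `pack_segment` and `read_record` are hypothetical. The claims do not prescribe a bit layout.

```python
def pack_segment(rows, widths):
    """Pack rows of non-negative integers into one integer, row after row.

    Assumption: each value fits in the width given for its column; columns
    with width 0 are skipped because their value is kept in the header.
    """
    row_bits = sum(widths)
    packed = 0
    for r, row in enumerate(rows):
        offset = r * row_bits
        for value, width in zip(row, widths):
            if width:
                packed |= value << offset
                offset += width
    return packed

def read_record(packed, widths, repeated, index):
    """Restore record number `index` using only the header fields.

    widths   -- compressed width in bits of each column
    repeated -- repeated value for each zero-width column (None otherwise)
    """
    offset = index * sum(widths)  # fixed record size gives direct addressing
    record = []
    for width, constant in zip(widths, repeated):
        if width == 0:
            record.append(constant)
        else:
            record.append((packed >> offset) & ((1 << width) - 1))
            offset += width
    return record
```

A column converted from floating point would additionally be divided by its stored precision scale after extraction to recover the original value.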
16. A system comprising:
(a) a processor in communication with a device for a computer readable medium;
(b) an operating environment executing on the processor from the computer-readable medium;
(c) a data store; and
(d) an OLAP server executing under the control of the operating environment and performing actions, including:
(i) dividing a plurality of records into at least one segment, each segment including a predetermined number of records that are arranged in a table, each row representing a separate record and each column representing a particular field in each record;
(ii) for each column having floating point data associated with each field, converting the floating point data into integer data for each field in the column;
(iii) for each column having the same data repeated in each field, setting the width for the column having repeated data equal to zero bits;
(iv) for each column having integer data in each field, setting the width for each column equal to the minimum number of bits necessary to represent the largest integer value in the column; and
(v) including a header with the segment that indicates the predetermined number of records in the segment, the width of each column in the segment, the precision of the conversion from floating point data to integer data for each column having converted data, the repeated data for each column having a width set to zero bits, and the original width for each column such that the header of the segment can be employed to decompress the width of the columns and restore the original data in each field of a record that is accessed.
Dependent claims: 17, 18, 19, 20
Specification