Floating point conversion for records of multidimensional database

US 6,424,972 B1
Filed: 06/22/2000
Issued: 07/23/2002
Est. Priority Date: 06/22/1999
Status: Expired due to Fees

Abstract

A method and system for compressing and decompressing read-only data in records that have a fixed size. A plurality of records are divided into segments having a predetermined size. For each segment, the records are arranged in a table with rows for each record and a column for each field in each record. The width of each column of repeated data is compressed to zero bits and the repeated data is referenced in a header of the segment. The width of each column of integer data is compressed to the minimum number of bits required to represent the largest integer value in the fields of the column. Floating point data in each column is converted to integer data and the width of each column with converted integer data is set to the minimum width necessary to represent the largest converted integer in that column. The conversion to integer data is calculated for floating point and real numbers with a minimum precision exponent that is stored in the header for the segment. Floating point data is cleaned when it is converted to integer data. The information in the header is employed to decompress the compressed records in the segment. The decompression lends itself well to fast random access of secondary storage devices.
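
The width rules for repeated and integer columns can be illustrated with a short Python sketch. It is not taken from the patent: the function names `bits_needed` and `plan_integer_columns`, the dictionary layout of the header entries, and the restriction to non-negative integers are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def bits_needed(max_value: int) -> int:
    """Minimum number of bits that can hold a non-negative integer."""
    return max_value.bit_length()


def plan_integer_columns(rows: Sequence[Sequence[int]]) -> List[Dict[str, int]]:
    """Return one hypothetical header entry per column of a segment.

    Assumes every field is a non-negative integer (floating point
    columns would already have been converted to integers).
    """
    plans: List[Dict[str, int]] = []
    for column in zip(*rows):  # transpose rows into columns
        if len(set(column)) == 1:
            # Same value in every record: keep it once in the header
            # and spend zero bits per row on this column.
            plans.append({"width": 0, "repeated": column[0]})
        else:
            # Width is driven by the largest value in the column.
            plans.append({"width": bits_needed(max(column))})
    return plans
```

For the rows (7, 3, 100), (7, 0, 255) and (7, 1, 4), this sketch assigns widths of 0, 2 and 8 bits, so each record would occupy 10 bits instead of three full-width fields.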

20 Claims

1. A method for compressing data in a plurality of records in a data store, comprising the actions of:
- (a) dividing the plurality of records into at least one segment, each segment including a predetermined number of records that are arranged in a table, each row of the table representing a separate record and each column representing a particular field in each record;
  
  (b) for each column having floating point data associated with each field, converting the floating point data into integer data for each field in the column;
  
  (c) for each column having the same data repeated in each field, setting the width for the column having repeated data equal to zero bits;
  
  (d) for each column having integer data in each field, setting the width for each column equal to the minimum number of bits necessary to represent the largest integer value in the column; and
  
  (e) including a header with the segment that indicates the predetermined number of records in the segment, the width of each column in the segment, the precision of the conversion from floating point data to integer data for each column having converted data, the repeated data for each column having a width set to zero bits, and the original width for each column such that the header of the segment can be employed to decompress the width of the columns and restore the original data in each field of a record that is accessed.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9)
  - 2. The method of claim 1, wherein each record has a fixed size.
  - 3. The method of claim 1, wherein the data included in each record in the segment is read only data.
  - 4. The method of claim 1, further comprising determining a type of data associated with each column in the table, each column being associated with a field in each record.
  - 5. The method of claim 1, further comprising iteratively incrementing an exponent to determine the minimum precision necessary to convert the floating point data to integer data.
  - 6. The method of claim 5, wherein the iteration begins with the exponent representing a minimum value for data represented in a computer.
  - 7. The method of claim 6, wherein the exponent is incremented to a number no greater than the maximum value representable by the computer.
  - 8. The method of claim 7, wherein the minimum value and maximum value represent a floating point number.
  - 9. The method of claim 7, wherein the minimum value and maximum value represent a real number.
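
Claims 5 through 9 describe an iterative search for the minimum precision exponent but do not spell out the arithmetic. The following Python sketch shows one way such a search could work, assuming a decimal exponent that starts at 0 and is capped at a hypothetical `max_exponent` of 15, and assuming all values are finite. The function names and the round-trip test used to decide that an exponent is sufficient are illustrative assumptions, not the patented procedure.

```python
from typing import List, Optional, Sequence


def find_min_exponent(values: Sequence[float], max_exponent: int = 15) -> Optional[int]:
    """Return the smallest decimal exponent e such that every value
    survives the round trip value -> round(value * 10**e) -> value.

    Returns None if no exponent up to max_exponent is sufficient.
    """
    for exponent in range(max_exponent + 1):  # increment the exponent step by step
        scale = 10 ** exponent
        if all(round(v * scale) / scale == v for v in values):
            return exponent
    return None


def to_integers(values: Sequence[float], exponent: int) -> List[int]:
    """Convert a column to integers using a previously found exponent."""
    scale = 10 ** exponent
    # round() also removes tiny binary representation noise, e.g.
    # 0.07 * 100 evaluates to 7.000000000000001 but is stored as 7.
    return [round(v * scale) for v in values]
```

With the column [1.25, 3.5, 0.07], the search stops at exponent 2 and the converted integers are 125, 350 and 7. The exponent would then be recorded in the segment header so that dividing by 10 to that power restores the original values.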

10. Apparatus for compressing a plurality of records in a data store, comprising:
- (a) a load module for loading a plurality of records from the data store and dividing the loaded records into at least one segment, each segment including a predetermined number of records that are arranged in a table, each row representing a separate record and each column representing a particular field in each record;
  
  (b) a compression module for compressing data in each column of the table of records, the compression module performing actions, including:
  
  (i) for each column having floating point data associated with each field, converting the floating point data into integer data for each field in the column;
  
  (ii) for each column having the same data repeated in each field, setting the width for the column having repeated data equal to zero bits;
  
  (iii) for each column having integer data in each field, setting the width for each column equal to the minimum number of bits necessary to represent the largest integer value in the column; and
  
  (iv) including a header with the segment that indicates the predetermined number of records in the segment, the width of each column in the segment, the precision of the conversion from floating point data to integer data for each column having converted data, the repeated data for each column having a width set to zero bits, and the original width for each column such that the header of the segment can be employed to decompress the width of the columns and restore the original data in each field of a record that is accessed.
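
The load module's division of records into segments can be pictured with a minimal Python sketch. The name `split_into_segments` and the parameter `records_per_segment` are hypothetical, and allowing a shorter final segment when the record count is not an exact multiple is an assumption that the claim does not address.

```python
from typing import Iterator, List, Sequence, TypeVar

Record = TypeVar("Record")


def split_into_segments(
    records: Sequence[Record], records_per_segment: int
) -> Iterator[List[Record]]:
    """Yield consecutive groups of records, each holding the
    predetermined number of records (the last group may be shorter)."""
    if records_per_segment <= 0:
        raise ValueError("records_per_segment must be positive")
    for start in range(0, len(records), records_per_segment):
        yield list(records[start:start + records_per_segment])
```

Each yielded segment would then be handed to the compression step on its own, so column widths and repeated values are decided per segment rather than for the whole data store.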

11. A computer-readable medium readable by a computing system and having instructions for executing a process for compressing a plurality of records, the process comprising the actions of:
- (a) dividing a plurality of records into at least one segment, each segment including a predetermined number of records that are disposed in a table, each row representing a separate record and each column representing a particular field in each record;
  
  (b) for each column having floating point data associated with each field, converting the floating point data into integer data for each field in the column;
  
  (c) for each column having the same data repeated in each field, setting the width for the column having repeated data equal to zero bits;
  
  (d) for each column having integer data in each field, setting the width for each column equal to the minimum number of bits necessary to represent the largest integer value in the column; and
  
  (e) including a header with the segment that indicates the predetermined number of records in the segment, the width of each column in the segment, the precision of the conversion from floating point data to integer data for each column having converted data, the repeated data for each column having a width set to zero bits, and the original width for each column such that the header of the segment can be employed to decompress the width of the columns and restore the original data in each field of a record that is accessed.
- Dependent claims (12, 13, 14, 15)
  - 12. The computer readable medium of claim 11, wherein each record has a fixed size and the data included in each record in the segment is read only data.
  - 13. The computer readable medium of claim 11, further comprising iteratively incrementing an exponent to determine the minimum precision necessary to convert the floating point data to integer data.
  - 14. The computer readable medium of claim 13, wherein the iteration begins with the exponent representing a minimum value for data represented in a computer.
  - 15. The computer readable medium of claim 13, wherein the exponent is incremented to a number no greater than the maximum value representable by the computer.

16. A system comprising:
- (a) a processor in communication with a device for a computer readable medium;
  
  (b) an operating environment executing on the processor from the computer-readable medium;
  
  (c) a data store; and
  
  (d) an OLAP server executing under the control of the operating environment and performing actions, including:
  
  (i) dividing a plurality of records into at least one segment, each segment including a predetermined number of records that are arranged in a table, each row representing a separate record and each column representing a particular field in each record;
  
  (ii) for each column having floating point data associated with each field, converting the floating point data into integer data for each field in the column;
  
  (iii) for each column having the same data repeated in each field, setting the width for the column having repeated data equal to zero bits;
  
  (iv) for each column having integer data in each field, setting the width for each column equal to the minimum number of bits necessary to represent the largest integer value in the column; and
  
  (v) including a header with the segment that indicates the predetermined number of records in the segment, the width of each column in the segment, the precision of the conversion from floating point data to integer data for each column having converted data, the repeated data for each column having a width set to zero bits, and the original width for each column such that the header of the segment can be employed to decompress the width of the columns and restore the original data in each field of a record that is accessed.
- Dependent claims (17, 18, 19, 20)
  - 17. The system of claim 16, wherein each record has a fixed size and the data included in each record in the segment is read only data.
  - 18. The system of claim 16, further comprising iteratively incrementing an exponent to determine the minimum precision necessary to convert the floating point data to integer data.
  - 19. The system of claim 18, wherein the iteration begins with the exponent representing a minimum value for data represented in a computer.
  - 20. The system of claim 19, wherein the exponent is incremented to a number no greater than the maximum value representable by the computer.
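
Because every compressed record in a segment has the same bit length, the header is enough to locate and restore a single record without unpacking its neighbours. The Python sketch below illustrates that idea under several assumptions: a Python integer stands in for the packed bit buffer, fields are non-negative integers, and the names `pack_segment`, `read_record`, `widths` and `repeated` are hypothetical rather than drawn from the patent.

```python
from typing import List, Optional, Sequence


def pack_segment(rows: Sequence[Sequence[int]], widths: Sequence[int]) -> int:
    """Pack rows into one integer used as a bit buffer, row 0 in the lowest bits."""
    row_bits = sum(widths)
    packed = 0
    for index, row in enumerate(rows):
        offset = index * row_bits
        for value, width in zip(row, widths):
            if width == 0:
                continue  # repeated column: stored only in the header
            if value < 0 or value >> width:
                raise ValueError(f"{value} does not fit in {width} bits")
            packed |= value << offset
            offset += width
    return packed


def read_record(
    packed: int,
    index: int,
    widths: Sequence[int],
    repeated: Sequence[Optional[int]],
) -> List[int]:
    """Restore one record by jumping straight to its bit offset."""
    offset = index * sum(widths)  # fixed record size makes this a multiplication
    record: List[int] = []
    for width, constant in zip(widths, repeated):
        if width == 0:
            record.append(constant)  # value comes from the header
        else:
            record.append((packed >> offset) & ((1 << width) - 1))
            offset += width
    return record
```

With widths of 0, 2 and 8 bits and a repeated value of 7 for the first column, reading record 1 from the packed rows (7, 3, 100), (7, 0, 255), (7, 1, 4) returns [7, 0, 255]. Columns converted from floating point would additionally be divided by the stored precision factor after this step.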

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Petculescu, Cristian; Berger, Alexander; Netz, Amir
Primary Examiner(s): Rones, Charles
Application Number: US09/602,610
Time in Patent Office: 761 Days
Field of Search: 707/3, 707/101, 707/102, 707/103
US Class Current: 1/1
CPC Class Codes:
- G06F 16/258   Data format conversion from...
- G06F 16/283   Multi-dimensional databases...
- H03M 7/24   Conversion to or from float...
- Y10S 707/99933   Query processing, i.e. sear...
- Y10S 707/99942   Manipulating data structure...
- Y10S 707/99943   Generating database or data...
