System and method for compression in a distributed column chunk data store
Abstract
An improved system and method for compression in a distributed column chunk data store is provided. A distributed column chunk data store may be provided by multiple storage servers operably coupled to a network. A storage server provided may include a database engine for partitioning a data table into the column chunks for distributing across multiple storage servers, a storage shared memory for storing the column chunks during processing of semantic operations performed on the column chunks, and a storage services manager for striping column chunks of a partitioned data table across multiple storage servers. Any data table may be flexibly partitioned into column chunks using one or more columns with various partitioning methods. Domain specific compression may be applied to a column chunk to reduce storage requirements of column chunks and increase transmission speeds for sending column chunks between storage servers.
20 Claims
1. A computer-implemented method for compressing a partitioned data table in a computer system, comprising:
partitioning a data table into column chunks for storing on one or more storage servers;
applying data domain compression to one or more column chunks of the partitioned data table for compressing the one or more column chunks; and
storing the one or more compressed column chunks of the partitioned data table on the one or more storage servers.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
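The three steps of claim 1 (partition, compress, store) can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the patented implementation: the modulo partitioning on an integer key, the round-robin placement, the in-memory dictionaries standing in for storage servers, and every function name are hypothetical, and generic deflate (`zlib`) is used only as a stand-in for a domain-specific compressor.

```python
import json
import zlib
from collections import defaultdict


def partition_into_column_chunks(rows, key, num_partitions):
    """Group rows into partitions by key, then split each partition by column."""
    partitions = defaultdict(list)
    for row in rows:
        # Assumption: simple modulo partitioning on an integer key column.
        partitions[row[key] % num_partitions].append(row)
    chunks = {}
    for part_id, part_rows in partitions.items():
        for column in part_rows[0]:
            chunks[(part_id, column)] = [r[column] for r in part_rows]
    return chunks


def compress_chunk(values):
    """Stand-in compressor: serialize the column values, then deflate them."""
    return zlib.compress(json.dumps(values).encode("utf-8"))


def decompress_chunk(blob):
    """Inverse of compress_chunk."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def store_chunks(chunks, servers):
    """Place compressed chunks on servers round-robin (hypothetical policy)."""
    for i, (chunk_id, values) in enumerate(sorted(chunks.items())):
        servers[i % len(servers)][chunk_id] = compress_chunk(values)


# Small demonstration table with an integer key and a string column.
rows = [{"id": i, "city": "Oslo" if i % 2 else "Lima"} for i in range(6)]
servers = [{}, {}]  # each dict plays the role of one storage server
chunks = partition_into_column_chunks(rows, key="id", num_partitions=2)
store_chunks(chunks, servers)
```

Each stored entry is keyed by a (partition, column) pair, so a single column chunk can be fetched and decompressed without touching the other columns of the table.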
14. A computer-implemented method for compressing a partitioned data table in a computer system, comprising:
determining whether the data domain of values in a column chunk of a partitioned data table represents a range of numeric values;
determining the number of bits needed to represent the range of numeric values;
normalizing the numeric values of the column chunk to the bit representation of the range of numeric values;
packing each normalized numeric value into a bit vector to represent the column chunk;
compressing the bit vector to create a compressed column chunk; and
storing the compressed column chunk of the partitioned data table on one or more storage servers.
Dependent claims: 15, 16, 17, 19
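The numeric-range steps of claim 14 can be sketched as follows. This is one possible reading, assuming integer values, normalization by subtracting the chunk minimum, and `zlib` as the final compressor; the function names and the dictionary layout of the result are hypothetical and not taken from the patent.

```python
import zlib


def is_numeric_range(values):
    """Check that the chunk is non-empty and holds only integers."""
    return bool(values) and all(
        isinstance(v, int) and not isinstance(v, bool) for v in values
    )


def pack_numeric_chunk(values):
    """Normalize values to a minimal bit width, pack them, then compress."""
    if not is_numeric_range(values):
        raise ValueError("chunk does not represent a range of integers")
    lo, hi = min(values), max(values)
    # Number of bits needed to represent every offset in [0, hi - lo].
    bits = max(1, (hi - lo).bit_length())
    packed = 0
    for i, v in enumerate(values):
        # Normalized value (v - lo) occupies bits [i * bits, (i + 1) * bits).
        packed |= (v - lo) << (i * bits)
    nbytes = (len(values) * bits + 7) // 8
    bit_vector = packed.to_bytes(nbytes, "little")
    return {
        "lo": lo,
        "bits": bits,
        "count": len(values),
        "data": zlib.compress(bit_vector),
    }


def unpack_numeric_chunk(chunk):
    """Reverse pack_numeric_chunk and return the original values."""
    packed = int.from_bytes(zlib.decompress(chunk["data"]), "little")
    mask = (1 << chunk["bits"]) - 1
    return [
        ((packed >> (i * chunk["bits"])) & mask) + chunk["lo"]
        for i in range(chunk["count"])
    ]
```

For example, the values 1000, 1003, 1007 and 1001 span a range of 7, so each normalized value fits in 3 bits and the four values pack into 2 bytes before the final compression step.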
18. A computer-implemented method for compressing a partitioned data table in a computer system, comprising:
determining whether the data domain of values in a column chunk of a partitioned data table represents key-value pairs;
decomposing the key-value pairs in the column chunk into one or more arrays of values;
compressing the key-value pairs in the one or more arrays of values to create a compressed column chunk; and
storing the compressed column chunk of the partitioned data table on one or more storage servers.
Dependent claims: 20
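The key-value steps of claim 18 might look like the sketch below. The claim does not say how key-value pairs are encoded within a column chunk, so the `key=value` string format here is purely an assumption, as are the function names, the JSON serialization, and the use of `zlib` for the compression step.

```python
import json
import zlib


def is_key_value_chunk(values):
    """Assumed test: every entry is a string of the form 'key=value'."""
    return bool(values) and all(isinstance(v, str) and "=" in v for v in values)


def compress_key_value_chunk(values):
    """Decompose pairs into a key array and a value array, then compress each."""
    if not is_key_value_chunk(values):
        raise ValueError("chunk does not hold key=value pairs")
    keys, vals = [], []
    for item in values:
        k, v = item.split("=", 1)  # split on the first '=' only
        keys.append(k)
        vals.append(v)
    return {
        "keys": zlib.compress(json.dumps(keys).encode("utf-8")),
        "values": zlib.compress(json.dumps(vals).encode("utf-8")),
    }


def decompress_key_value_chunk(chunk):
    """Rebuild the original 'key=value' strings from the two arrays."""
    keys = json.loads(zlib.decompress(chunk["keys"]).decode("utf-8"))
    vals = json.loads(zlib.decompress(chunk["values"]).decode("utf-8"))
    return [f"{k}={v}" for k, v in zip(keys, vals)]
```

Separating keys from values places similar data side by side (keys often repeat heavily), which tends to help a general-purpose compressor find redundancy.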
Specification