Selecting a compression technique
First Claim
1. A method comprising:
determining demographics for data;
determining a compression ratio (“CR”) of each of a plurality of compression techniques, wherein CR is a size of the data before compression divided into a predicted size of the data after compression, wherein the predicted size of the data after compression is determined as a function of the determined demographics;
determining an access efficiency of each of the compression techniques as a function of the determined demographics;
ranking the compression techniques by CR and access efficiency;
selecting a compression technique based on the ranking;
compressing the data using the selected compression technique; and
storing the compressed data;
wherein each of the plurality of compression techniques stores data using:
  an information data structure that specifies information about the data, and
  a value data structure that stores the values of the data;
wherein access efficiency is defined to have four categories:
  a first category in which:
    the information data structure is accessed directly, and
    the value data structure is accessed directly,
  a second category in which:
    the information data structure is accessed directly, and
    the value data structure is accessed sequentially,
  a third category in which:
    the information data structure is accessed sequentially, and
    the value data structure is accessed directly, and
  a fourth category in which:
    the information data structure is accessed sequentially, and
    the value data structure is accessed sequentially.
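The four access-efficiency categories in the claim amount to a lookup over two access modes, one for the information data structure and one for the value data structure. The following is a minimal sketch of that lookup; the names `Access`, `AccessCategory`, and `access_category` are hypothetical and do not come from the patent.

```python
from enum import Enum, IntEnum


class Access(Enum):
    """How a data structure is reached (hypothetical names)."""
    DIRECT = "direct"
    SEQUENTIAL = "sequential"


class AccessCategory(IntEnum):
    """The four categories, numbered as in the claim."""
    FIRST = 1    # information: direct,     value: direct
    SECOND = 2   # information: direct,     value: sequential
    THIRD = 3    # information: sequential, value: direct
    FOURTH = 4   # information: sequential, value: sequential


# Lookup table from (information access, value access) to category.
_CATEGORY_TABLE = {
    (Access.DIRECT, Access.DIRECT): AccessCategory.FIRST,
    (Access.DIRECT, Access.SEQUENTIAL): AccessCategory.SECOND,
    (Access.SEQUENTIAL, Access.DIRECT): AccessCategory.THIRD,
    (Access.SEQUENTIAL, Access.SEQUENTIAL): AccessCategory.FOURTH,
}


def access_category(info: Access, value: Access) -> AccessCategory:
    """Return the claim's category for a pair of access modes."""
    return _CATEGORY_TABLE[(info, value)]
```

The claim only names the categories; whether a lower-numbered category should be treated as more efficient is left to the ranking step and is not assumed here.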
Abstract
Demographics for data are determined. A compression ratio (“CR”) is determined for each of a plurality of compression techniques. CR is a size of the data before compression divided into a predicted size of the data after compression. The predicted size of the data after compression is determined as a function of the determined demographics. An access efficiency of each of the compression techniques is determined as a function of the determined demographics. The compression techniques are ranked by CR and access efficiency. A compression technique is selected based on the ranking. The data is compressed using the selected compression technique. The compressed data is stored.
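The steps in the abstract can be sketched as a small scoring-and-sorting routine. Everything concrete here is an assumption made for illustration: the demographics keys (`rows`, `distinct`, `width`, `runs`), the two example techniques and their size formulas, the reading of "divided into" as predicted size after compression divided by size before (so a smaller CR is better), and the lexicographic ranking by CR and then by category number with category 1 treated as best. The abstract does not specify how CR and access efficiency are combined.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical demographics: simple counts describing the data.
Demographics = Dict[str, int]


@dataclass
class Technique:
    name: str
    predict_size: Callable[[Demographics], int]     # predicted bytes after compression
    access_category: Callable[[Demographics], int]  # 1..4, assumed 1 is best


def compression_ratio(size_before: int, predicted_after: int) -> float:
    # Assumed literal reading: size before "divided into" predicted size after,
    # i.e. after / before. Smaller values mean better compression.
    return predicted_after / size_before


def rank(techniques: List[Technique], demo: Demographics,
         size_before: int) -> List[Tuple[float, int, str]]:
    # Assumed ranking rule: sort by CR first, then by access category.
    scored = [
        (compression_ratio(size_before, t.predict_size(demo)),
         t.access_category(demo),
         t.name)
        for t in techniques
    ]
    return sorted(scored)


def select(techniques: List[Technique], demo: Demographics,
           size_before: int) -> str:
    # Pick the top-ranked technique by name.
    return rank(techniques, demo, size_before)[0][2]


# Illustrative inputs only; the size formulas are not from the patent.
demo = {"rows": 1000, "distinct": 10, "width": 8, "runs": 400}
size_before = demo["rows"] * demo["width"]  # 8000 bytes uncompressed

techniques = [
    Technique("dictionary",
              lambda d: d["distinct"] * d["width"] + d["rows"] * 1,
              lambda d: 1),
    Technique("run_length",
              lambda d: d["runs"] * (d["width"] + 4),
              lambda d: 4),
]

chosen = select(techniques, demo, size_before)
```

With these made-up numbers the dictionary technique is predicted to occupy 1080 bytes against 4800 for run-length encoding, so it ranks first. Compressing and storing the data with the chosen technique would follow as the final two steps.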
18 Claims
1. A method comprising the elements recited under First Claim above.
Dependent claims: 2, 3, 4, 5, 6.
7. A database system comprising:
one or more nodes;
a plurality of CPUs, each of the one or more nodes providing access to one or more CPUs;
a plurality of virtual processes, each of the one or more CPUs providing access to one or more virtual processes;
each virtual process configured to manage data, including rows from the set of database table rows, stored in one of a plurality of data-storage facilities;
a process to:
  determine demographics for data;
  determine a compression ratio (“CR”) of each of a plurality of compression techniques, wherein CR is a size of the data before compression divided into a predicted size of the data after compression, wherein the predicted size of the data after compression is determined as a function of the determined demographics;
  determine an access efficiency of each of the compression techniques as a function of the determined demographics;
  rank the compression techniques by CR and access efficiency;
  select a compression technique based on the ranking;
  compress the data using the selected compression technique; and
  store the compressed data;
wherein each of the plurality of compression techniques stores data using:
  an information data structure that specifies information about the data, and
  a value data structure that stores the values of the data;
wherein access efficiency is defined to have four categories:
  a first category in which:
    the information data structure is accessed directly, and
    the value data structure is accessed directly,
  a second category in which:
    the information data structure is accessed directly, and
    the value data structure is accessed sequentially,
  a third category in which:
    the information data structure is accessed sequentially, and
    the value data structure is accessed directly, and
  a fourth category in which:
    the information data structure is accessed sequentially, and
    the value data structure is accessed sequentially.
Dependent claims: 8, 9, 10, 11, 12.
13. A computer program stored in a non-transitory computer readable storage medium, the program comprising executable instructions that cause a computer to:
determine demographics for data;
determine a compression ratio (“CR”) of each of a plurality of compression techniques, wherein CR is a size of the data before compression divided into a predicted size of the data after compression, wherein the predicted size of the data after compression is determined as a function of the determined demographics;
determine an access efficiency of each of the compression techniques as a function of the determined demographics;
rank the compression techniques by CR and access efficiency;
select a compression technique based on the ranking;
compress the data using the selected compression technique; and
store the compressed data;
wherein each of the plurality of compression techniques stores data using:
  an information data structure that specifies information about the data, and
  a value data structure that stores the values of the data;
wherein access efficiency is defined to have four categories:
  a first category in which:
    the information data structure is accessed directly, and
    the value data structure is accessed directly,
  a second category in which:
    the information data structure is accessed directly, and
    the value data structure is accessed sequentially,
  a third category in which:
    the information data structure is accessed sequentially, and
    the value data structure is accessed directly, and
  a fourth category in which:
    the information data structure is accessed sequentially, and
    the value data structure is accessed sequentially.
Dependent claims: 14, 15, 16, 17, 18.
Specification