Adaptive cell-specific dictionaries for frequency-partitioned multi-dimensional data
First Claim
Patent Images
1. A computer-based method comprising the steps of:
- (a) maintaining a global dictionary for each column in a plurality of columns associated with a database table;
(b) partitioning each column in said plurality of columns into one or more partitions;
(c) forming a plurality of cells, each cell formed by grouping a plurality of partitions;
(d) maintaining at least one cell-specific dictionary for at least one cell among said plurality of cells;
(e) receiving a query to compile and execute against said database table;
(f) adaptively applying said at least one cell-specific dictionary for said at least one cell when said at least one cell crosses a boundary of column partitions or when compression can be improved;
(g) compiling and executing said received query based on said application of at least one cell-specific dictionary for said at least one cell of said database table and said global dictionary for remainder of cells or columns;
(h) outputting said results of said compilation and execution step, andwherein said partitioning comprising partitioning each column individually to form said plurality of cells as cross products, and super-cells are formed to avoid splitting cells that otherwise fall below a predetermined cell threshold.
1 Assignment
0 Petitions
Accused Products
Abstract
A cell-specific dictionary is applied adaptively to adequate cells, where the cell-specific dictionary subsequently optimizes the handling of frequency-partitioned multi-dimensional data. This includes improved data partitioning with super cells or adjusting resulting cells by sub-dividing very large cells and merging multiple small cells, both of which avoid the highly skewed data distribution in cells and improve the query processing. In addition, more efficient encoding is taught within a cell in case the distinct values that actually appear in that cell are much smaller than the size of the column dictionary.
-
Citations
23 Claims
-
1. A computer-based method comprising the steps of:
-
(a) maintaining a global dictionary for each column in a plurality of columns associated with a database table; (b) partitioning each column in said plurality of columns into one or more partitions; (c) forming a plurality of cells, each cell formed by grouping a plurality of partitions; (d) maintaining at least one cell-specific dictionary for at least one cell among said plurality of cells; (e) receiving a query to compile and execute against said database table; (f) adaptively applying said at least one cell-specific dictionary for said at least one cell when said at least one cell crosses a boundary of column partitions or when compression can be improved; (g) compiling and executing said received query based on said application of at least one cell-specific dictionary for said at least one cell of said database table and said global dictionary for remainder of cells or columns; (h) outputting said results of said compilation and execution step, and wherein said partitioning comprising partitioning each column individually to form said plurality of cells as cross products, and super-cells are formed to avoid splitting cells that otherwise fall below a predetermined cell threshold. - View Dependent Claims (2, 3, 4, 5, 6, 7)
-
-
8. A computer-based method comprising the steps of:
-
(a) maintaining a global dictionary for each column in a plurality of columns associated with a database table; (b) partitioning said each column in said plurality of columns into one or more partitions; (c) forming a plurality of cells, each cell formed by grouping a plurality of partitions; (d) maintaining at least one cell-specific dictionary for at least one cell among said plurality of cells; (e) processing any of the following post-processing cell adjustments;
split processing or merge processing,said split processing comprising;
splitting at least a single cell among said plurality of cells into two sub-cells and forming a cell-specific dictionary for each of said sub-cells;said merge processing comprising;
merging at least two cells among said plurality of cells into a larger cell and forming a cell-specific dictionary as a superset of said at least two cells'"'"' dictionaries;(f) receiving a query to compile and execute against said database table; (g) adaptively applying said cell-specific dictionary/dictionaries for sub-cells, merged sub-cells, or split cells when at least one cell crosses a boundary of column partitions or compression can be improved; (h) compiling and executing said received query based on said adaptive application of said cell-specific dictionary/dictionaries for sub-cells, merged sub-cells, or split cells and said global dictionary for remainder of cell(s) or columns; and (i) outputting said results of said compilation and execution step. - View Dependent Claims (9, 10, 11, 12, 13, 14)
-
-
15. An article of manufacture comprising computer storage medium storing computer readable program code which, when executed by a computer, implements a computer-based method comprising the steps of:
-
(a) maintaining a global dictionary for each column in a plurality of columns associated with a database table; (b) partitioning each column in said plurality of columns into one or more partitions; (c) forming a plurality of cells, each cell formed by grouping a plurality of partitions; (d) maintaining at least one cell-specific dictionary for at least one cell among said plurality of cells; (e) receiving a query to compile and execute against said database table; (f) adaptively applying said at least one cell-specific dictionary for said at least one cell when said at least one cell crosses a boundary of column partitions or when compression can be improved; (g) compiling and executing said received query based on said application of at least one cell-specific dictionary for said at least one cell of said database table and said global dictionary for remainder of cells or columns; (h) outputting said results of said compilation and execution step, and wherein said partitioning comprising partitioning each column individually to form said plurality of cells as cross products, and super-cells are formed to avoid splitting cells that otherwise fall below a predetermined cell threshold. - View Dependent Claims (16, 17, 18)
-
-
19. An article of manufacture comprising computer storage medium storing computer readable program code which, when executed by a computer, implements a computer-based method comprising the steps of:
-
(a) maintaining a global dictionary for each column in a plurality of columns associated with a database table; (b) partitioning said each column in said plurality of columns into one or more partitions; (c) forming a plurality of cells, each cell formed by grouping a plurality of partitions; (d) maintaining at least one cell-specific dictionary for at least one cell among said plurality of cells; (e) processing any of the following post-processing cell adjustments;
split processing or merge processing,said split processing comprising;
splitting at least a single cell among said plurality of cells into two sub-cells and forming a cell-specific dictionary for each of said sub-cells;said merge processing comprising;
merging at least two cells among said plurality of cells into a larger cell and forming a cell-specific dictionary as a superset of said at least two cells'"'"' dictionaries;(f) receiving a query to compile and execute against said database table; (g) adaptively applying said cell-specific dictionary/dictionaries for sub-cells, merged sub-cells, or split cells when at least one cell crosses a boundary of column partitions or compression can be improved; (h) compiling and executing said received query based on said adaptive application of said cell-specific dictionary/dictionaries for sub-cells, merged sub-cells, or split cells and said global dictionary for remainder of cell(s) or columns; and (i) outputting said results of said compilation and execution step. - View Dependent Claims (20, 21, 22, 23)
-
Specification