FUNCTIONALITY OF DECOMPOSITION DATA SKEW IN ASYMMETRIC MASSIVELY PARALLEL PROCESSING DATABASES
Abstract
Database queries are optimized through the functionality of decomposition data skew in an asymmetric massively parallel processing database system. A table having data skew is restructured by (1) storing original data values of a distribution key in a special switch column added to the table, (2) replacing the original data values of the distribution key with modified data values such as randomly generated data values, and (3) partitioning the rows across the nodes of the asymmetric massively parallel processing database system based on the distribution key. The original data values that are stored and replaced may comprise only a subset of the original data values, namely those that cause data skew in the table. Data skew is reduced, which improves performance, yet the original data values remain available, which reduces the impact on collocated joins.
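The three steps in the abstract can be illustrated with a small in-memory sketch. This is not the patented implementation: the column names `dist_key` and `switch_col`, the helper functions, the four-node cluster size, and the SHA-256 based hash partitioner are all assumptions chosen only to make the idea concrete. It covers the subset variant, in which only the key values known to cause skew are moved to the switch column and replaced with random values.

```python
import hashlib
import random
from collections import Counter

NUM_NODES = 4  # hypothetical cluster size, chosen for illustration


def node_for(key, num_nodes=NUM_NODES):
    """Assign a distribution-key value to a node via a deterministic hash
    (an assumed partitioner; a real system would use its own)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def rows_per_node(rows, num_nodes=NUM_NODES):
    """Count how many rows land on each node when partitioned by dist_key."""
    counts = Counter(node_for(r["dist_key"], num_nodes) for r in rows)
    return [counts.get(n, 0) for n in range(num_nodes)]


def restructure(rows, skewed_values, rng):
    """Return copies of the rows with skew-causing keys replaced.

    (a) the original key is saved in the added switch column,
    (b) the distribution key is overwritten with a random value.
    Rows whose key is not in skewed_values keep their key; their
    switch column stays None.
    """
    out = []
    for row in rows:
        new_row = dict(row)
        if row["dist_key"] in skewed_values:
            new_row["switch_col"] = row["dist_key"]
            new_row["dist_key"] = rng.randrange(1_000_000_000)
        else:
            new_row["switch_col"] = None
        out.append(new_row)
    return out


def original_key(row):
    """Recover the original key value, e.g. for use as a join predicate."""
    if row["switch_col"] is not None:
        return row["switch_col"]
    return row["dist_key"]


# Hypothetical table: 900 rows share one key value, 100 rows have distinct keys.
rows = [{"id": i, "dist_key": "UNKNOWN"} for i in range(900)]
rows += [{"id": 900 + k, "dist_key": f"cust-{k}"} for k in range(100)]

before = rows_per_node(rows)                       # (c) on the original keys
restructured = restructure(rows, {"UNKNOWN"}, random.Random(42))
after = rows_per_node(restructured)                # (c) on the modified keys

print("rows per node before:", before)
print("rows per node after: ", after)
```

In this sketch all 900 rows sharing the key `UNKNOWN` hash to a single node before restructuring, while afterwards they spread across the nodes. `original_key` shows how the saved value stays reachable for queries that still need it.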
21 Claims
1. A method of restructuring a table having data skew in a computer system, the computer system storing data from a database in partitions on one or more nodes of the computer system, the method comprising:
(a) storing original data values of a distribution key in a switch column added to the table,
(b) replacing the original data values of the distribution key with modified data values that reduce the data skew in the table, and
(c) partitioning the rows of the table across the nodes of the computer system based on the distribution key.
Dependent claims: 2, 3, 4, 5, 6, 7
8. An apparatus for restructuring a table having the data skew, comprising:
a computer system for storing data from a database in partitions on one or more nodes of the computer system; and
a process performed by the computer system, the process configured to:
(a) store original data values of a distribution key in a switch column added to the table,
(b) replace the original data values of the distribution key with modified data values that reduce the data skew in the table, and
(c) partition the rows of the table across the nodes of the computer system based on the distribution key.
Dependent claims: 9, 10, 11, 12, 13, 14
15. An article of manufacture comprising a computer readable storage medium encoded with computer program instructions which, when accessed by a computer system storing data from a database in partitions on one or more nodes of the computer system, cause the computer system to operate as a specially programmed computer system, executing a method for restructuring a table having the data skew, the method comprising:
(a) storing original data values of a distribution key in a switch column added to the table,
(b) replacing the original data values of the distribution key with modified data values that reduce the data skew in the table, and
(c) partitioning the rows of the table across the nodes of the computer system based on the distribution key.
Dependent claims: 16, 17, 18, 19, 20, 21
Specification