COMPOSITE HASH AND LIST PARTITIONING OF DATABASE TABLES
First Claim
1. A method for reorganizing a table of a database, comprising:
providing a data storage cluster including a first node and a second node;
storing the table in the data storage cluster with a first partition comprising a first set of rows in the first node and a second partition comprising a second set of rows in the second node;
modifying the data storage cluster to include a third node for storing data from the table;
with a storage engine managing the data storage cluster, adding a third partition to the table including using a partitioning mechanism to create a distribution mapping for data elements in the first, second, and third partitions;
copying a portion of the first and second rows of the table from both the first and second nodes to the third partition of the third node based on the distribution mapping; and
deleting the copied portion of the first and second rows of the table from the first and second nodes, wherein the distribution mapping calls for copying from the first and second nodes and not copying to the first and second nodes.
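The copy-then-delete sequence recited above can be sketched as follows. This is only an illustration under stated assumptions, not the claimed implementation: the in-memory dictionaries standing in for nodes, the function name `add_node_and_reorganize`, and the toy mapping `toy_mapping` are all hypothetical names introduced for the example.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a node: row key -> row contents.
Rows = Dict[int, dict]


def add_node_and_reorganize(
    nodes: Dict[int, Rows],
    new_id: int,
    partition_of: Callable[[int], int],
) -> int:
    """Add an empty node, copy rows mapped to it, then delete the originals.

    Returns the number of rows moved. Raises ValueError if the mapping would
    move a row between two of the original nodes.
    """
    nodes[new_id] = {}
    moved = []

    # Phase 1: copy. Original nodes are only read, never written.
    for node_id, rows in nodes.items():
        if node_id == new_id:
            continue
        for key, row in rows.items():
            target = partition_of(key)
            if target == new_id:
                nodes[new_id][key] = dict(row)
                moved.append((node_id, key))
            elif target != node_id:
                raise ValueError(f"row {key} would move between original nodes")

    # Phase 2: delete the copied rows from the nodes they came from.
    for node_id, key in moved:
        del nodes[node_id][key]
    return len(moved)


def toy_mapping(key: int) -> int:
    """Assumed mapping: residues 4 and 5 (mod 6) go to node 2, others stay."""
    return 2 if key % 6 in (4, 5) else key % 2


# Two original nodes holding even and odd keys respectively.
cluster = {
    0: {k: {"id": k} for k in range(0, 12, 2)},
    1: {k: {"id": k} for k in range(1, 12, 2)},
}
moved_count = add_node_and_reorganize(cluster, new_id=2, partition_of=toy_mapping)
```

Copying everything before deleting anything keeps each row available on at least one node throughout the sketch; whether a real storage engine orders the phases this way is not stated in the claim.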
Abstract
A method for partitioning during an online node add. The method includes providing a data storage cluster with first and second nodes, and storing a table of data in the data storage cluster with a first partition storing a set of rows or data elements in the first node and a second partition storing a set of rows or data elements in the second node. The method includes adding a third node to the cluster and adding a third partition to the table using a partitioning mechanism to create a distribution mapping for data elements in the first, second, and third partitions. The distribution mapping provides substantially uniform distribution of the data elements over the first, second, and third partitions by the partitioning mechanism using modulo hash partitioning as a function of data elements or by combining hash and list partitioning such that data is retained on the original partitions.
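One way to read "combining hash and list partitioning" is a two-step lookup: a hash sends each key to a fixed bucket, and a list assigns each bucket to a partition. The sketch below follows that reading as an assumption; the bucket count, the CRC32 hash, and the function names are hypothetical and are not taken from the specification.

```python
import zlib
from typing import List

NUM_BUCKETS = 12  # assumed; divisible by 2 and 3 so both layouts balance exactly


def bucket_of(key: str) -> int:
    """Hash step: the bucket depends only on the key, never on the node count."""
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS


def initial_mapping(num_partitions: int) -> List[int]:
    """List step: bucket b is listed under partition b % num_partitions."""
    return [b % num_partitions for b in range(NUM_BUCKETS)]


def add_partition(mapping: List[int]) -> List[int]:
    """Return a bucket-to-partition list with one extra partition.

    Buckets are only reassigned from an existing partition to the new one,
    so no row has to move between the original partitions.
    """
    old_count = max(mapping) + 1
    new_id = old_count
    target = len(mapping) // (old_count + 1)  # buckets the new partition takes
    new_mapping = list(mapping)
    for _ in range(target):
        # Donate from whichever original partition currently owns the most buckets.
        counts = [new_mapping.count(p) for p in range(old_count)]
        donor = counts.index(max(counts))
        last_owned = max(i for i, p in enumerate(new_mapping) if p == donor)
        new_mapping[last_owned] = new_id
    return new_mapping


def partition_of(key: str, mapping: List[int]) -> int:
    """Full lookup: hash the key to a bucket, then read the bucket's partition."""
    return mapping[bucket_of(key)]


old_map = initial_mapping(2)
new_map = add_partition(old_map)
```

With these assumed values, each of the three partitions ends up owning four of the twelve buckets, and every bucket either keeps its original partition or moves to the new one.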
19 Claims
1. A method for reorganizing a table of a database, comprising:
providing a data storage cluster including a first node and a second node;
storing the table in the data storage cluster with a first partition comprising a first set of rows in the first node and a second partition comprising a second set of rows in the second node;
modifying the data storage cluster to include a third node for storing data from the table;
with a storage engine managing the data storage cluster, adding a third partition to the table including using a partitioning mechanism to create a distribution mapping for data elements in the first, second, and third partitions;
copying a portion of the first and second rows of the table from both the first and second nodes to the third partition of the third node based on the distribution mapping; and
deleting the copied portion of the first and second rows of the table from the first and second nodes, wherein the distribution mapping calls for copying from the first and second nodes and not copying to the first and second nodes.
Dependent claims: 2, 3, 4, 5, 6
7. A data storage system for storing rows of data in a table, comprising:
a server running a storage engine including a partitioning module;
a set of data nodes managed by the storage engine; and
a table of data horizontally partitioned with a partition with a set of rows in each of the data nodes, wherein the storage engine operates the partitioning module to generate a distribution mapping defining a new partitioning of the data table when a new data node is added to the set of data nodes, the distribution mapping providing uniform distribution of rows of the data across partitions in the new partitioning and retaining a subset of the rows of the data in each of the original ones of the data nodes.
Dependent claims: 8, 9, 10, 11
12. A computer program product including a computer useable medium with computer readable code embodied on the computer useable medium, the computer readable code comprising:
computer readable program code devices configured to cause a computer to add a node to a data storage cluster, wherein the data storage cluster stores a table of data in a horizontally partitioned manner over a number of nodes according to a first distribution mapping;
computer readable program code devices configured to cause the computer to create a second distribution mapping defining partitioning of the table of data including an additional partition in the added node; and
computer readable program code devices configured to cause the computer to copy one or more rows associated with the table of data from the number of nodes to the added node, the copied rows being defined by the second distribution mapping and the copying excluding copying between the number of nodes, whereby data is retained on the number of nodes.
Dependent claims: 13, 14, 15, 16, 17, 18, 19
Specification