Method and system for partitioning database
Abstract
The present invention relates to a method and system for partitioning a database. The method for partitioning a database comprises: grouping a plurality of entries in the database into one or more entry groups, so that entries in the same entry group are always accessed together by one or more transactions; and dividing the one or more entry groups into a set number of partitions, so that a total number of transactions that access across more than one partition is minimized. By means of the present invention, it is possible to obtain an efficient, flexible and convenient method for partitioning a database, thereby greatly improving the system performance.
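The two steps named in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes that "always accessed together" means two entries are touched by exactly the same set of transactions, and the names `group_entries`, `cross_partition_transactions`, and `partition_of` are hypothetical.

```python
from collections import defaultdict


def group_entries(transactions):
    """Group entries that are always accessed together.

    transactions: dict mapping a transaction id to the set of entry ids it accesses.
    Assumption: two entries belong to the same group when exactly the same
    set of transactions touches both of them.
    """
    accessed_by = defaultdict(set)
    for txn_id, entries in transactions.items():
        for entry in entries:
            accessed_by[entry].add(txn_id)

    groups = defaultdict(list)
    for entry, txns in accessed_by.items():
        groups[frozenset(txns)].append(entry)
    return [sorted(members) for members in groups.values()]


def cross_partition_transactions(transactions, partition_of):
    """Count transactions whose entries span more than one partition.

    partition_of: dict mapping an entry id to its partition id (hypothetical).
    This is the quantity the abstract says should be minimized.
    """
    count = 0
    for entries in transactions.values():
        if len({partition_of[e] for e in entries}) > 1:
            count += 1
    return count
```

A partitioner would compare candidate assignments by this count and prefer the one with the fewest cross-partition transactions; how the candidates are generated is left open here.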
12 Claims
1. A computer-executable method of managing a database on a data storage system, wherein the database includes one or more entries wherein each of the one or more entries interacts with one or more transactions, the computer-executable method comprising:
- grouping the one or more entries of the database into one or more entry groups, wherein each of the entry groups are accessed together by a transaction of the one or more transactions;
- determining a partition solution such that the extent of data skew and the extent of workload skew of the system resulting from the partition solution is below a predetermined threshold;
- dividing, based on the partition solution, each of the one or more entry groups into partitions, minimizing an amount each of the one or more transactions accesses more than one partition;
- distributing each of the partitions among the one or more nodes of the data storage system; and
- determining the performance by measuring the extent of data skew and workload skew of the data storage system and comparing to a threshold;
- constructing a lookup table based on relationships between entries and nodes storing the one or more entries.
Dependent claims: 2, 3, 4
5. A system, comprising:
- a data storage system, including memory and one or more processors, utilizing one or more data storage arrays to store a database, wherein the database includes one or more entries, wherein each of the one or more entries interacts with one or more transactions; and
- computer-executable logic encoded in memory of one or more computers in communication with the data storage system to manage the database on the data storage system, wherein the computer-executable program logic is configured for the execution of:
  - grouping the one or more entries of the database into one or more entry groups, wherein each of the entry groups are accessed together by a transaction of the one or more transactions;
  - determining a partition solution such that the extent of data skew and the extent of workload skew of the system resulting from the partition solution is below a predetermined threshold;
  - dividing, based on the partition solution, each of the one or more entry groups into partitions, minimizing an amount each of the one or more transactions accesses more than one partition;
  - distributing each of the partitions among the one or more nodes of the data storage system; and
  - determining the performance by measuring the extent of data skew and workload skew of the data storage system and comparing to a threshold;
  - constructing a lookup table based on relationships between entries and nodes storing the one or more entries.
Dependent claims: 6, 7, 8
9. A computer program product for managing a database on a data storage system, wherein the database includes one or more entries wherein each of the one or more entries interacts with one or more transactions, the computer program product comprising:
- a non-transitory computer readable medium encoded with computer-executable program code for managing the database on the data storage system, the code configured to enable the execution of:
  - grouping the one or more entries of the database into one or more entry groups, wherein each of the entry groups are accessed together by a transaction of the one or more transactions;
  - determining a partition solution such that the extent of data skew and the extent of workload skew of the system resulting from the partition solution is below a predetermined threshold;
  - dividing, based on the partition solution, each of the one or more entry groups into partitions, minimizing an amount each of the one or more transactions accesses more than one partition;
  - distributing each of the partitions among the one or more nodes of the data storage system; and
  - determining the performance by measuring the extent of data skew and workload skew of the data storage system and comparing to a threshold;
  - constructing a lookup table based on relationships between entries and nodes storing the one or more entries.
Dependent claims: 10, 11, 12
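The skew measurement and lookup table recited in the independent claims can be sketched as follows. The claims do not define how skew is computed, so this sketch assumes a simple relative-imbalance measure (busiest node's load divided by the mean load, minus one); the names `skew`, `build_lookup_table`, and `evaluate` are hypothetical and only illustrate one possible reading.

```python
def skew(loads):
    """Relative imbalance of per-node loads; 0.0 means perfectly balanced.

    Assumption: skew = max(load) / mean(load) - 1. The claims do not fix a formula.
    """
    mean = sum(loads) / len(loads)
    return 0.0 if mean == 0 else max(loads) / mean - 1.0


def build_lookup_table(partitions, node_of_partition):
    """Map each entry to the node that stores it.

    partitions: partition id -> list of entry ids.
    node_of_partition: partition id -> node id.
    """
    return {
        entry: node_of_partition[pid]
        for pid, entries in partitions.items()
        for entry in entries
    }


def evaluate(lookup, access_counts, nodes, threshold):
    """Measure data skew and workload skew and compare both to a threshold."""
    data_load = {n: 0 for n in nodes}
    work_load = {n: 0 for n in nodes}
    for entry, node in lookup.items():
        data_load[node] += 1                            # one unit of data per entry
        work_load[node] += access_counts.get(entry, 0)  # accesses served by the node
    data_skew = skew(list(data_load.values()))
    workload_skew = skew(list(work_load.values()))
    acceptable = data_skew < threshold and workload_skew < threshold
    return data_skew, workload_skew, acceptable
```

In this sketch a partition solution is accepted only when both skew values fall strictly below the threshold, mirroring the "below a predetermined threshold" wording of the claims.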
Specification