SYSTEM AND METHOD FOR OPTIMIZING DATA MIGRATION IN A PARTITIONED DATABASE
First Claim
1. A system for optimizing data distribution, the system comprising:
at least one processor operatively connected to a memory for executing system components;
a database comprising a plurality of database partitions, wherein at least one of the plurality of database partitions includes a contiguous range of data from the database; and
a partition component configured to:
detect a partition size for the at least one of the plurality of database partitions that exceeds a size threshold;
split the at least one of the database partitions into at least a first and a second partition;
control a distribution of data within the first and the second partition based on a value for a database key associated with the data in the at least one of the plurality of database partitions, wherein controlling the distribution includes minimizing any data distributed to the second partition based on a maximum value for the database key associated with the data in the at least one of the plurality of database partitions.
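The detect, split, and distribute steps recited above can be illustrated with a minimal sketch. It is not the claimed implementation: the `Partition` class, the `split_at_max_key` function, integer keys, and measuring partition size as a record count are all assumptions made for illustration. The sketch places the split point at the maximum key present in the oversized partition, so the second partition receives only the records carrying that maximum key.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Partition:
    """Hypothetical partition: a contiguous key range [lower, upper)."""
    lower: float
    upper: float
    records: List[Tuple[int, str]] = field(default_factory=list)

    def size(self) -> int:
        # Assumption: size is a record count; a real system may use bytes.
        return len(self.records)


def split_at_max_key(
    part: Partition, size_threshold: int
) -> Optional[Tuple[Partition, Partition]]:
    """Split `part` at its maximum key if it exceeds `size_threshold`.

    Returns (first, second), or None when no split is needed.
    """
    if part.size() <= size_threshold:
        return None
    max_key = max(key for key, _ in part.records)
    # Everything below the maximum key stays in the first partition.
    first = Partition(part.lower, max_key,
                      [r for r in part.records if r[0] < max_key])
    # Only records at the maximum key land in the second partition.
    second = Partition(max_key, part.upper,
                       [r for r in part.records if r[0] >= max_key])
    return first, second


if __name__ == "__main__":
    part = Partition(0, float("inf"), [(k, f"doc{k}") for k in range(1, 11)])
    first, second = split_at_max_key(part, size_threshold=5)
    print(first.size(), second.size())  # prints "9 1"
```

With monotonically increasing keys, new records fall into the second partition's range, so that partition starts nearly empty and is cheap to move to another system.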
Abstract
According to one aspect, provided is a horizontally scaled database architecture. Partitioning a database enables efficient distribution of data across a number of systems, reducing the processing costs associated with multiple machines. According to some aspects, the partitioned database can be managed as a single source interface to handle client requests. Further, it is realized that by identifying and testing key properties, horizontal scaling architectures can be implemented and operated with minimal overhead. In one embodiment, databases can be partitioned in an order-preserving manner such that the overhead associated with moving the data for a given partition can be minimized during management of the data and/or database. In one embodiment, split and migration operations prioritize zero-cost partitions, thereby reducing the computational burden associated with managing a partitioned database.
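The prioritization of zero-cost partitions during migration can be sketched as follows. This is an illustrative assumption, not the method described in the specification: the function name `migration_order` is hypothetical, and "zero cost" is interpreted here as a partition that currently holds no data, with cost approximated by record count.

```python
from typing import Dict, List


def migration_order(partition_sizes: Dict[str, int]) -> List[str]:
    """Order partition names so the cheapest to move come first.

    Assumption: migration cost is proportional to the record count,
    so empty partitions (cost 0) are always tried before others.
    Ties are broken by name to keep the order deterministic.
    """
    return sorted(partition_sizes,
                  key=lambda name: (partition_sizes[name], name))


if __name__ == "__main__":
    sizes = {"a": 40, "b": 0, "c": 12}
    print(migration_order(sizes))  # prints "['b', 'c', 'a']"
```

Moving an empty partition transfers only ownership of a key range and no stored records, which is why such candidates are ranked first in this sketch.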
25 Claims
13. A computer implemented method for optimizing data distribution, the method comprising acts of:
monitoring, by a computer system, a distributed database including a plurality of database partitions, wherein at least one of the plurality of database partitions includes a contiguous range of data from the database;
detecting, by the computer system, a partition size of the at least one of the plurality of database partitions exceeds a size threshold;
splitting, by the computer system, the at least one of the plurality of database partitions into at least a first and a second partition;
controlling, by the computer system, a distribution of data within the first and the second partition based on a value for a database key associated with the data in the at least one of the plurality of database partitions, wherein controlling the distribution includes minimizing any data distributed to the second partition based on a maximum value for the database key associated with the data in the at least one of the plurality of database partitions.
25. A computer-readable storage medium having computer-readable instructions that, as a result of being executed by a computer, instruct the computer to perform a method for optimizing data distribution, the method comprising acts of:
monitoring a distributed database including a plurality of database partitions, wherein at least one of the plurality of database partitions includes a contiguous range of data from the database;
detecting a partition size of the at least one of the plurality of database partitions exceeds a size threshold;
splitting the at least one of the plurality of database partitions into at least a first and a second partition;
controlling a distribution of data within the first and the second partition based on a value for a database key associated with the data in the at least one of the plurality of database partitions, wherein controlling the distribution includes minimizing any data distributed to the second partition based on a maximum value for the database key associated with the data in the at least one of the plurality of database partitions.
Specification