Method and apparatus for adding data storage bins to a stored computer database while minimizing movement of data and balancing data distribution
First Claim
1. In a database system having a stored database partitioned into a plurality of buckets initially distributed into storage in round robin fashion in n0 bins and subsequently in nj bins after a jth time that new bins have been added to the stored database, each bucket being associated with a bucket identification "x", a database storage manager for generating a bin identification "y" for the buckets after the mth addition of new bins, comprising:
- a mapping module that includes;
first means for decrementing an index counter variable j=m to 0 by -1 ;
second means for defining a variable y=x mod nj ;
third means for determining whether y is equal to or greater than nj-1 ; and
fourth means responsive to the third means for establishing y as the bin identification when y is equal to or greater than nj-1 and;
means for assigning the buckets to the nm bins; and
means for obtaining from storage records in the stored database using the bin identification.
1 Assignment
0 Petitions
Accused Products
Abstract
In a database system that stores database objects in partitioned mode using bins to represent storage locations at which individual records of an object are stored, after they have been partitioned into logical buckets, a cascaded round-robin mapping method assigns buckets to bins evenly, while minimizing the movement of buckets when new bins are added and while minimizing memory overhead requirements. The method includes entering a do loop for an index counter variable j=m to 0 by -1, wherein "m" is the number of times new bins have been added since the last database reorganization. A variable y is set equal to x modulo nj, wherein nj is the number of bins after the jth database expansion. If y≧nj-1, y is established to be the bin identification. Buckets are moved to populate new bins based on modulo n+k, wherein k is the number of bins added in the current expansion.
73 Citations
23 Claims
-
1. In a database system having a stored database partitioned into a plurality of buckets initially distributed into storage in round robin fashion in n0 bins and subsequently in nj bins after a jth time that new bins have been added to the stored database, each bucket being associated with a bucket identification "x", a database storage manager for generating a bin identification "y" for the buckets after the mth addition of new bins, comprising:
-
a mapping module that includes; first means for decrementing an index counter variable j=m to 0 by -1 ; second means for defining a variable y=x mod nj ; third means for determining whether y is equal to or greater than nj-1 ; and fourth means responsive to the third means for establishing y as the bin identification when y is equal to or greater than nj-1 and; means for assigning the buckets to the nm bins; and means for obtaining from storage records in the stored database using the bin identification. - View Dependent Claims (2, 3, 4, 5)
-
-
6. A computer-implemented method for managing storage of a database, the storage being divided into a plurality of bins, the method comprising populating "k" newly added bins in the storage with records of the database partitioned among "n" preexisting bins, each record having a bucket identification signifying a bin of the plurality of bins, and moving a record to a newly added bin if the record'"'"'s bucket identification modulo the sum of "n" and "k" signifies the newly added bin, wherein the records are clustered in the database storage and accessible by respective logical bucket identifications, and the method further comprises:
-
for each of "n" existing bins, wherein "n" is a natural number, entering a do loop for an index counter variable "i"=0 to (B-1), wherein the database storage includes "B" buckets; determining whether the ith bucket is in the bin, and if so, setting an index counter variable "j" to be the index counter variable "i" modulo the sum of "n" and "k"; and when the index counter variable "j" is at least as great as "n", moving the ith bucket to the jth bin, and otherwise not moving the ith bucket.
-
-
7. A computer-implemented method for managing storage of a database the storage being divided into a plurality of bins, the method comprising populating "k" newly added bins in the storage with records of the database partitioned among "n" preexisting bins, each record having a bucket identification signifying a bin of the plurality of bins, and moving a record to a newly added bin if the record'"'"'s bucket identification modulo the sum of "n" and "k" signifies the newly added bin, wherein the records are not clustered or are not accessible by bucket identification, and the method further comprises:
-
for each of "n" existing bins, wherein "n" is a natural number, and for each record in the bin, defining an index counter variable "j" to be the bucket identification "x" associated with the record modulo the sum of "n" and "k"; determining whether the index counter variable "j" is at least as great as "n", and if so, moving the record to the jth bin, and otherwise not moving the record to the jth bin.
-
-
8. A computer-implemented method for managing storage of a database, the storage being divided into a plurality of bins, the method comprising populating "k" newly added bins in the storage with records of the database partitioned among "n" preexisting bins, each record having a bucket identification signifying a bin of the plurality of bins, and moving a record to a newly added bin if the record'"'"'s bucket identification modulo the sum of "n" and "k" signifies the newly added bin, wherein the database storage initially includes n0 bins, further comprising the steps of:
-
(a) adding new bins "m" times to the database storage, wherein "m" is a natural number, and then initially defining an index counter variable "j" to be equal to "m"; (b) defining a variable "y" to be the bucket identification "x" modulo nj ; (c) determining whether the variable "y" is at least equal to nj-1, and if so, defining the variable "y" to be the bin identification of the bucket having the bucket identification "x" and otherwise decrementing the index counter variable "j" by an integer number and repeating steps (b) and (c); (d) subsequently accessing buckets using the variable "y" as the relevant bin identification; (e) moving buckets to bins and mapping the buckets to the bins using steps (a)-(c). - View Dependent Claims (9, 10)
-
-
11. A computer program device comprising:
-
a computer program storage device readable by a digital computer; and a program means on the program storage device and including instructions executable by the digital computer for performing method steps for mapping to a bin in a database storage a data record bucket stored therein, the data record bucket containing data records of a database and having a bucket identification "x", while minimizing the movement of data when new bins are added to the database storage and while minimizing memory overhead, the method steps comprising; (a) adding new bins "m" times to a database storage initially containing n0 bins, wherein "m" is a natural number, and then initially defining an index counter variable "j" to be equal to "m"; (b) defining a variable "y" to be the bucket identification "x" modulo nj ; (c) determining whether the variable "y" is at least equal to nj-1, and if so, defining the variable "y" to be the bin identification of the bucket having the bucket identification "x" and otherwise decrementing the index counter variable "j" by an integer number and repeating steps (b) and (c); and (d) subsequently accessing the bucket using the variable "y" as the relevant bin identification. - View Dependent Claims (12, 13, 14, 15, 16, 17)
-
-
18. In a computer database storage divided into a plurality of bins, in which buckets map data to bins, a computer-implemented bucket-to-bin mapping method characterized by a sequence of numbers, wherein the sequence is smaller in size than a stored directory having a bin identification or location for every bucket in the database, the method including:
-
adding new bins to the plurality of bins; maintaining an even distribution of buckets to bins after each new bin addition, without requiring movement of data between bins other than populating new bins and without storing the bin identification or location for every bucket; and
,populating "k" newly added bins in the database storage with records of the database partitioned among "n" preexisting bins, each record having a bucket identification signifying a bin, and moving a record to a newly added bin if the record'"'"'s bucket identification modulo the sum of "n" and "k" signifies the newly added bin; wherein the records are clustered in the database storage and accessible by respective logical bucket identifications, and the method further comprises; for each of "n" existing bins, wherein "n" is a natural number, entering a do loop for an index counter variable "i"=0 to (B-1), wherein the database storage includes "B" buckets; determining whether the ith bucket is in the bin, and if so, setting an index counter variable "j" to be the index counter variable "i" modulo the sum of "n" and "k"; and when the index counter variable "j" is at least as great as "n", moving the ith bucket to the jth bin, and otherwise not moving the ith bucket.
-
-
19. In a computer database storage divided into a plurality of bins, in which buckets map data to bins, a computer-implemented bucket-to-bin mapping method characterized by a sequence of numbers, wherein the sequence is smaller in size than a stored directory having a bin identification or location for every bucket in the database, the method including:
-
adding new bins to the plurality of bins; maintaining an even distribution of buckets to bins after each new bin addition, without requiring movement of data between bins other than populating new bins and without storing the bin identification or location for every bucket; and
,populating "k" newly added bins in the database storage with records of the database partitioned among "n" preexisting bins, each record having a bucket identification signifying a bin, and moving a record to a newly added bin if the record'"'"'s bucket identification modulo the sum of "n" and "k" signifies the newly added bin; wherein the records are not clustered or are not accessible by bucket identification, and the method further comprises; for each of "n" existing bins, wherein "n" is a natural number, and for each record in the bin, defining an index counter variable "j" to be the bucket identification "x" associated with the record modulo the sum of "n" and "k"; determining whether the index counter variable "j" is at least as great as "n", and if so, moving the record to the jth bin, and otherwise not moving the record to the jth bin.
-
-
20. In a computer database storage divided into a plurality of bins, in which buckets map data to bins, a computer-implemented bucket-to-bin mapping method characterized by a sequence of numbers, wherein the sequence is smaller in size than a stored directory having a bin identification or location for every bucket in the database, the method including:
-
adding new bins to the plurality of bins; maintaining an even distribution of buckets to bins after each new bin addition, without requiring movement of data between bins other than populating new bins and without storing the bin identification or location for every bucket; and
,populating "k" newly added bins in the database storage with records of the database partitioned among "n" preexisting bins, each record having a bucket identification signifying a bin, and moving a record to a newly added bin if the record'"'"'s bucket identification modulo the sum of "n" and "k" signifies the newly added bin; wherein the database storage initially includes n0 bins, and the method further comprises; (a) adding new bins "m" times to the database storage, wherein "m" is a natural number, and then initially defining an index counter variable "j" to be equal to "m"; (b) defining a variable "y" to be the bucket identification "x" modulo nj ; (c) determining whether the variable "y" is at least equal to nj-1, and if so, defining the variable "y" to be the bin identification of the bucket having the bucket identification "x" and otherwise decrementing the index counter variable "j" by an integer number and repeating steps (b) and (c); (d) subsequently accessing buckets using the variable "y" as the relevant bin identification; (e) moving buckets to bins and mapping the buckets to the bins using steps (a)-(c). - View Dependent Claims (21, 22, 23)
-
Specification