Interval-partitioning method for multidimensional data
First Claim
1. A method of partitioning data regardless of the sparsity of said data on to computer memory or computer storage media with one or more keys or groups of keys from said data such that the selectivity produced by any combination of said keys or said groups of one or more keys can be precisely controlled comprising the steps of:
- a) determining the number of records from said data to be packed into each unit partitionb) determining selectivity of each said key or said group of one or more keys chosen from said composite key so that the combined selectivity of all said keys or said groups of one or more keys is equal to the total number of unit partitions in said datac) arbitrarily choosing a first component key or group of one or more component keys from said chosen component keys in said composite keyd) partitioning said data by said first key or group of one or more keys of said chosen component keys possibly followed by one or more of the other component keys in said composite keye) creating a non-dense index formed from said first key or group of one or more keys and possibly followed by one or more of the other component keys in said composite key aligned with the boundaries of said partitions created in step d)f) choosing a new first key or group of one or more keys from said chosen component keys in said composite keyg) repeating steps d), e), and f) within each said partition created from the previous iteration of step d) as many times as required to produce said total number of unit partitions contained within said data thereby producing an organization of said data whose said selectivity of combinations of said keys can be precisely controlled by said selectivity for each said key or group of one or more said keys as determined in step b).
0 Assignments
0 Petitions
Accused Products
Abstract
A data structure that uses one or more keys which are members of a multidimensional composite key to partition sparse data (FIG. 2A, FIG. 2B) into a hypercube like structure (FIG. 1). Only the data structure produced by the method in this invention is actually better than a hypercube since all involved keys can have true symmetric access as opposed to just the keys that have the highest sort order significance. Equivalent query efficiency could only be otherwise obtained through storing multiple copies of the sparse data sorted on different permutations of the member keys of the multidimensional composite key (FIG. 2C). The data structure has two major components: the partitioned data with or without the boundary keys and the non-dense index necessarily containing the boundary keys (FIG. 3). This data structure and partitioning method can be used to partition multidimensional data across physical hardware storage devices and parallel computer architectures (FIG. 4) to produce variable grained data partitions that can be queried, loaded, or updated in parallel and simultaneous multidimensional index selectivity. And finally, as noted above, this same partitioning method can be used in a nested fashion or in combination with other data structures to provide good performance for the batch maintenance of volatile data and special performance advantages for true time-series data.
329 Citations
6 Claims
-
1. A method of partitioning data regardless of the sparsity of said data on to computer memory or computer storage media with one or more keys or groups of keys from said data such that the selectivity produced by any combination of said keys or said groups of one or more keys can be precisely controlled comprising the steps of:
-
a) determining the number of records from said data to be packed into each unit partition b) determining selectivity of each said key or said group of one or more keys chosen from said composite key so that the combined selectivity of all said keys or said groups of one or more keys is equal to the total number of unit partitions in said data c) arbitrarily choosing a first component key or group of one or more component keys from said chosen component keys in said composite key d) partitioning said data by said first key or group of one or more keys of said chosen component keys possibly followed by one or more of the other component keys in said composite key e) creating a non-dense index formed from said first key or group of one or more keys and possibly followed by one or more of the other component keys in said composite key aligned with the boundaries of said partitions created in step d) f) choosing a new first key or group of one or more keys from said chosen component keys in said composite key g) repeating steps d), e), and f) within each said partition created from the previous iteration of step d) as many times as required to produce said total number of unit partitions contained within said data thereby producing an organization of said data whose said selectivity of combinations of said keys can be precisely controlled by said selectivity for each said key or group of one or more said keys as determined in step b).
-
-
2. A method of partitioning data regardless of the sparsity of said data on to computer memory or computer storage media with one or more keys or groups of keys from said data such that the selectivity of any combination of involved said keys or said groups of one or more keys is symmetric yet the degree of packing of records from said data into unit partitions can be precisely controlled comprising the steps of:
-
a) determining the number of records from said data to be packed into each unit partition b) determining selectivity of each said key or said group of one or more keys chosen from said composite key so that the combined selectivity of all said keys or said groups of one or more keys is equal to the total number of unit partitions in said data c) arbitrarily choosing a first component key or group of one or more component keys from said chosen component keys in said composite key d) partitioning said data by said first key or group of one or more keys of said chosen component keys possibly followed by one or more of the other component keys in said composite key e) creating a non-dense index formed from said first key or group of one or more keys and possibly followed by one or more of the other component keys in said composite key aligned with the boundaries of said partitions created in step d) f) choosing a new first key or group of one or more keys from said chosen component keys in said composite key g) repeating steps d), e), and f) within each said partition created from the previous iteration of step d) as many times as required to produce said total number of unit partitions contained within said data thereby producing an organization of said data whose said selectivity for any combination of said keys or said groups of one or more keys can be made to be symmetric by step b) and whose packing of said records into said unit partitions can be precisely controlled by step a). - View Dependent Claims (3, 4, 5)
-
-
6. A method of partitioning data regardless of the sparsity of said data on to computer memory or computer storage media with one or more keys or groups of keys from said data such that the selectivity of any combination of involved said keys or said groups of one or more keys is symmetric yet a non-dense index which requires no addition computer memory or computer storage media space can be built comprising the steps of:
-
a) determining the number of records from said data to be packed into each unit partition taking into account the extra available space in said unit partitions due to the removal of one or more said component records from one said record in said unit partitions b) determining selectivity of each said key or said group of one or more records chosen from said composite key so that the combined selectivity of all said keys or said groups of one or more keys is equal to the total number of unit partitions in said data c) arbitrarily choosing a first component key or group of one or more component keys from said chosen component keys in said composite key d) partitioning said data by said first key or group of one or more keys of said chosen component keys possibly followed by one or more of the other component keys in said composite key e) creating a non-dense index formed from said first key or group of one or more keys and possibly followed by one or more of the other component keys in said composite key aligned with the boundaries of said partitions created in step d) and removing one or more said component keys from one said record out of each said unit partition corresponding to each entry in said non-dense index f) choosing a new first key or group of one or more keys from said chosen component keys in said composite key g) repeating steps d), e), and f) within each said partition created from the previous iteration of step d) as many times as required to produce said total number of unit partitions contained within said data thereby producing an organization of said data whose said selectivity for any combination of said keys or said groups of one or more keys can be made to be symmetric by st b) and on which a corresponding non-dense index requiring no additional computer memory or computer storage media space can be built.
-
Specification