Method of storing and retrieving multi-dimensional data using the hilbert curve
First Claim
1. A method of storing multi-dimensional data, or multi-dimensional data points, in a multi-dimensional database and retrieving subsets of said multi-dimensional data according to the specification of a query comprising:
- providing a computer memory which is able to store said multi-dimensional data at a series of addresses in said memory, providing a means of transforming by calculation of multi-dimensional points including said multi-dimensional data into one-dimensional values by mapping said multi-dimensional data to one-dimensional distance values along an approximation of a Hilbert space-filling curve that passes through all multi-dimensional points corresponding to possible data values in the domain of said multi-dimensional database, providing a means of partitioning said multi-dimensional data into an ordered sequence of approximately equal sized subsets of said multi-dimensional data so that each said subset of said multi-dimensional data comprises multi-dimensional data that maps to lower said one-dimensional distance values than all succeeding said subsets of said multi-dimensional data, providing a means of identifying each said subset of said multi-dimensional data together with the location in said computer memory in which each said subset of multi-dimensional data is stored, providing a means of indexing locations in said computer memory of said subsets of said multi-dimensional data, providing a means of specifying a query as a multi-dimensional rectangular region within said domain of said database, providing a means of identifying for retrieval and searching which said subsets of said multi-dimensional data intersect with any said query, whereby said database and its index adapt efficiently to insertion and deletion of said multi-dimensional data, and whereby subsets of multi-dimensional data that intersect with a query region can be identified efficiently for searching and retrieval of multi-dimensional data that lies within the query region.
0 Assignments
0 Petitions
Accused Products
Abstract
An improved method of partitioning and indexing multi-dimensional data that maps the data to one-dimensional values according to the sequence in which an approximation of a Hilbert space-filling curve passes through all of the points corresponding to potential multi-dimensional data in a data space. Data is partitioned into pages, each corresponding to a length of Hilbert curve. A page identifier is the sequence of the first point on its corresponding Hilbert curve section. The mapping orders data and also orders the data pages that contain data within a database.
Mapping multi-dimensional data to one-dimensional values enables the data to be indexed using any one-dimensional index structure.
The practical application of the indexing method is made viable and useful by the provision of a querying algorithm enabling data to be selectively retrieved in response to queries wherein all or some of the data that lies within a rectangular space within multi-dimensional space is required to be retrieved.
The querying algorithm identifies pages whose corresponding curve sections intersect with a query region. The first intersecting page is found by calculating the lowest one-dimensional value corresponding to a possible multi-dimensional data value or point within the query region, and looking up in the index to find which page may contain this point. The next intersecting page, if it exists, is found by calculating the lowest one-dimensional value equal to or greater than the identifier of the next page to the one just identified. This new lowest one-dimensional, if found, is used to look up in the index and find the next page intersecting with the query region. Subsequent pages to be found, if any, are determined in a similar manner until no more are found. Pages found to intersect the query region can be searched for data lying within the query region.
163 Citations
4 Claims
-
1. A method of storing multi-dimensional data, or multi-dimensional data points, in a multi-dimensional database and retrieving subsets of said multi-dimensional data according to the specification of a query comprising:
-
providing a computer memory which is able to store said multi-dimensional data at a series of addresses in said memory, providing a means of transforming by calculation of multi-dimensional points including said multi-dimensional data into one-dimensional values by mapping said multi-dimensional data to one-dimensional distance values along an approximation of a Hilbert space-filling curve that passes through all multi-dimensional points corresponding to possible data values in the domain of said multi-dimensional database, providing a means of partitioning said multi-dimensional data into an ordered sequence of approximately equal sized subsets of said multi-dimensional data so that each said subset of said multi-dimensional data comprises multi-dimensional data that maps to lower said one-dimensional distance values than all succeeding said subsets of said multi-dimensional data, providing a means of identifying each said subset of said multi-dimensional data together with the location in said computer memory in which each said subset of multi-dimensional data is stored, providing a means of indexing locations in said computer memory of said subsets of said multi-dimensional data, providing a means of specifying a query as a multi-dimensional rectangular region within said domain of said database, providing a means of identifying for retrieval and searching which said subsets of said multi-dimensional data intersect with any said query, whereby said database and its index adapt efficiently to insertion and deletion of said multi-dimensional data, and whereby subsets of multi-dimensional data that intersect with a query region can be identified efficiently for searching and retrieval of multi-dimensional data that lies within the query region. - View Dependent Claims (2, 3)
-
-
4. A method of storing and querying multi-dimensional data comprising the steps of,
A method for transforming multi-dimensional data points into one-dimensional values, using a bijective mapping between multi-dimensional points and one-dimensional values determined by the sequence in which an approximation of a Hilbert space-filling curve passes through the multi-dimensional points in the domain of a multi-dimensional database, which are stored in a one-dimensional index structure, comprising the steps of, splitting data into a plurality of subsets, each corresponding to a length of the Hilbert space-filling curve, computing an identifier for each subset selected from the group consisting of the one-dimensional distance value of the start of a section of Hilbert space-filling curve corresponding to the subset of data from the beginning of the Hilbert space-filling curve, the one-dimensional distance value of the end of a section of Hilbert space-filling curve corresponding to the subset of data from the beginning of the Hilbert space-filling curve and a pair of distance values on the section of Hilbert space-filling curve corresponding to the subset of data, wherein one is the lowest and the other is the highest one-dimensional value corresponding actual data points stored in the database and lying on the section of Hilbert space-filling curve, storing each said subset in said index using the subset identifier as an index key. using the index in calculating which subsets of said multi-dimensional intersect with the query, using the index to retrieve candidate subsets of data, searching candidate subsets of data and outputting data points which lie within the query region.
Specification