Querying spatial data in column stores using grid-order scans
Abstract
A query of spatial data is received by a database comprising a columnar data store storing data in a column-oriented structure. Thereafter, a minimal bounding rectangle associated with the query is identified using a grid order scanning technique. The spatial data set corresponding to the received query is then mapped to physical storage in the database using the identified minimal bounding rectangle so that the spatial data set can be retrieved. Related apparatus, systems, techniques and articles are also described.
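The flow in the abstract can be illustrated with a minimal sketch for point data. It is not the patented implementation: the function names (`cell_id`, `build_columns`, `query`), the row-major cell numbering, and the use of plain Python lists as stand-ins for columns are all assumptions made for illustration. Points are sorted by grid cell so that spatially close points land in nearby rows, and a query rectangle is answered by scanning only the row ranges of the cells it covers.

```python
from bisect import bisect_left, bisect_right

def cell_id(x, y, bounds, nx, ny):
    """Row-major id of the grid cell containing (x, y); out-of-range input is clamped."""
    xmin, ymin, xmax, ymax = bounds
    cx = max(0, min(int((x - xmin) / (xmax - xmin) * nx), nx - 1))
    cy = max(0, min(int((y - ymin) / (ymax - ymin) * ny), ny - 1))
    return cy * nx + cx

def build_columns(points, bounds, nx, ny):
    """Store points as parallel columns, physically ordered by cell id (grid order)."""
    ordered = sorted(points, key=lambda p: cell_id(p[0], p[1], bounds, nx, ny))
    xs = [p[0] for p in ordered]
    ys = [p[1] for p in ordered]
    cells = [cell_id(x, y, bounds, nx, ny) for x, y in zip(xs, ys)]
    return xs, ys, cells

def query(xs, ys, cells, bounds, nx, ny, rect):
    """Return row positions of points inside rect = (x0, y0, x1, y1)."""
    qx0, qy0, qx1, qy1 = rect
    first = cell_id(qx0, qy0, bounds, nx, ny)
    last = cell_id(qx1, qy1, bounds, nx, ny)
    cx0, cy0 = first % nx, first // nx
    cx1, cy1 = last % nx, last // nx
    rows = []
    for cy in range(cy0, cy1 + 1):
        # Cells of one grid row are contiguous in the sorted cell column.
        lo = bisect_left(cells, cy * nx + cx0)
        hi = bisect_right(cells, cy * nx + cx1)
        for i in range(lo, hi):
            # Exact check: a covered cell may extend beyond the rectangle.
            if qx0 <= xs[i] <= qx1 and qy0 <= ys[i] <= qy1:
                rows.append(i)
    return rows

bounds = (0.0, 0.0, 10.0, 10.0)
points = [(1, 1), (9, 9), (5, 5), (2, 8), (6, 4)]
xs, ys, cells = build_columns(points, bounds, 4, 4)
rows = query(xs, ys, cells, bounds, 4, 4, (4, 3, 7, 6))
result = [(xs[i], ys[i]) for i in rows]
```

In this sketch the returned row positions play the role of the physical storage location: once they are known, the matching values can be read from any other column at the same positions.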
18 Claims
1. A method implemented on one or more systems comprising one or more programmable processors, the method comprising:
- mapping spatial data stored in a database comprising a columnar data store storing the spatial data in a column-oriented structure, the mapping comprising preserving spatial proximity of a plurality of spatial point objects when the spatial data are physically stored using a grid ordering comprising;
- dividing a bounded space containing the plurality of spatial point objects into a grid having fixed boundaries and comprising rectangular cells, and
- indexing the cells of the grid, wherein the indexing comprises assigning each spatial point object of the plurality of spatial point objects to a particular cell of the grid, assigning one or more bounding boxes to one or more of the cells, maintaining a record of the assignment of the one or more bounding boxes to the one or more of the cells, and creating index vectors to represent the cells of the grid,
- wherein assigning one or more bounding boxes to one or more of the cells comprises assigning a bounding box to a cell if the bounding box intersects a predefined number of adjacent cells with respect to a single characteristic point of the bounding box, and assigning the bounding box to an overflow cell if the intersection of the bounding box exceeds the predefined cellsize;
- receiving a query of the spatial data and checking overflow cell for every received query;
- identifying a target bounding box based on the received query of the spatial data, the identifying comprising scanning the cells of the grid using the index vectors and return bit vector to check valid entries;
- determining, based on the mapping and the identified target bounding box, a spatial data set corresponding to the received query and a physical storage location in the database from which to retrieve the spatial data set;
- retrieving the spatial data set from the physical storage location based on the determining and in response to the received query of the spatial data; and
- providing the retrieved spatial data set in response to the received query of the spatial data.
Dependent claims: 2, 3, 4, 5, 6
7. A non-transitory computer program product storing instructions which, when executed by at least one data processor forming part of at least one computing system, result in operations comprising:
- mapping spatial data stored in a database comprising a columnar data store storing the spatial data in a column-oriented structure, the mapping comprising preserving spatial proximity of a plurality of spatial point objects when the spatial data are physically stored using a grid ordering comprising;
- dividing a bounded space containing the plurality of spatial point objects into a grid having fixed boundaries and comprising rectangular cells, and
- indexing the cells of the grid, wherein the indexing comprises assigning each spatial point object of the plurality of spatial point objects to a particular cell of the grid, assigning one or more bounding boxes to one or more of the cells, maintaining a record of the assignment of the one or more bounding boxes to the one or more of the cells, and creating index vectors to represent the cells of the grid,
- wherein assigning one or more bounding boxes to one or more of the cells comprises assigning a bounding box to a cell if the bounding box intersects a predefined number of adjacent cells with respect to a single characteristic point of the bounding box, and assigning the bounding box to an overflow cell if the intersection of the bounding box exceeds the predefined cellsize;
- receiving a query of the spatial data and checking overflow cell for every received query;
- identifying a target bounding box based on the received query of the spatial data, the identifying comprising scanning the cells of the grid using the index vectors and return bit vector to check valid entries;
- determining, based on the mapping and the identified target bounding box, a spatial data set corresponding to the received query and a physical storage location in the database from which to retrieve the spatial data set;
- retrieving the spatial data set from the physical storage location based on the determining and in response to the received query of the spatial data; and
- providing the retrieved spatial data set in response to the received query of the spatial data.
Dependent claims: 8, 9, 10, 11, 12
13. A system comprising:
- a database comprising a columnar data store storing data in a column-oriented structure;
- at least one data processor; and
- memory storing instructions which, when executed by the at least one data processor, result in operations comprising;
- mapping spatial data stored in a database comprising a columnar data store storing the spatial data in a column-oriented structure, the mapping comprising preserving spatial proximity of a plurality of spatial point objects when the spatial data are physically stored using a grid ordering comprising;
- dividing a bounded space containing the plurality of spatial point objects into a grid having fixed boundaries and comprising rectangular cells, and
- indexing the cells of the grid, wherein the indexing comprises assigning each spatial point object of the plurality of spatial point objects to a particular cell of the grid, assigning one or more bounding boxes to one or more of the cells, maintaining a record of the assignment of the one or more bounding boxes to the one or more of the cells, and creating index vectors to represent the cells of the grid,
- wherein assigning one or more bounding boxes to one or more of the cells comprises assigning a bounding box to a cell if the bounding box intersects a predefined number of adjacent cells with respect to a single characteristic point of the bounding box, and assigning the bounding box to an overflow cell if the intersection of the bounding box exceeds the predefined cellsize;
- receiving a query of the spatial data and checking overflow cell for every received query;
- identifying a target bounding box based on the received query of the spatial data, the identifying comprising scanning the cells of the grid using the index vectors and return bit vector to check valid entries;
- determining, based on the mapping and the identified target bounding box, a spatial data set corresponding to the received query and a physical storage location in the database from which to retrieve the spatial data set;
- retrieving the spatial data set from the physical storage location based on the determining and in response to the received query of the spatial data; and
- providing the retrieved spatial data set in response to the received query of the spatial data.
Dependent claims: 14, 15, 16, 17, 18
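The bounding-box assignment and overflow handling recited in the independent claims can be sketched roughly as follows. This is one possible reading, not the claimed implementation. It assumes the "single characteristic point" is the lower-left corner of the box, that "a predefined number of adjacent cells" is a per-axis limit called `max_span` here, and that the overflow cell can be represented by a sentinel id `OVERFLOW`. All of these names are hypothetical, and the index vector is modeled as a plain list holding one cell id per stored row.

```python
OVERFLOW = -1  # hypothetical sentinel id for the overflow cell

def cell_of(x, y, bounds, nx, ny):
    """Grid coordinates (cx, cy) of the cell containing (x, y), clamped to the grid."""
    xmin, ymin, xmax, ymax = bounds
    cx = max(0, min(int((x - xmin) / (xmax - xmin) * nx), nx - 1))
    cy = max(0, min(int((y - ymin) / (ymax - ymin) * ny), ny - 1))
    return cx, cy

def assign(box, bounds, nx, ny, max_span):
    """Cell id for a box, keyed by its lower-left corner (assumed characteristic point)."""
    x0, y0, x1, y1 = box
    cx0, cy0 = cell_of(x0, y0, bounds, nx, ny)
    cx1, cy1 = cell_of(x1, y1, bounds, nx, ny)
    # Boxes covering more than max_span cells along an axis go to the overflow cell.
    if cx1 - cx0 >= max_span or cy1 - cy0 >= max_span:
        return OVERFLOW
    return cy0 * nx + cx0

def intersects(a, b):
    """True if axis-aligned boxes a and b overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def scan(index_vector, boxes, bounds, nx, ny, max_span, rect):
    """Scan the index vector; return (bit vector of candidate rows, exact hits)."""
    qcx0, qcy0 = cell_of(rect[0], rect[1], bounds, nx, ny)
    qcx1, qcy1 = cell_of(rect[2], rect[3], bounds, nx, ny)
    # A box anchored up to max_span - 1 cells left of or below the query can still reach it.
    candidates = {
        cy * nx + cx
        for cy in range(max(0, qcy0 - (max_span - 1)), qcy1 + 1)
        for cx in range(max(0, qcx0 - (max_span - 1)), qcx1 + 1)
    }
    candidates.add(OVERFLOW)  # the overflow cell is checked for every query
    bits = [c in candidates for c in index_vector]
    hits = [i for i, b in enumerate(bits) if b and intersects(boxes[i], rect)]
    return bits, hits

bounds = (0.0, 0.0, 10.0, 10.0)
boxes = [(0.5, 0.5, 1.5, 1.5), (3.0, 3.0, 4.5, 4.5),
         (1.0, 1.0, 9.0, 9.0), (7.0, 7.0, 8.0, 8.0)]
index_vector = [assign(b, bounds, 5, 5, 2) for b in boxes]
bits, hits = scan(index_vector, boxes, bounds, 5, 5, 2, (4.2, 4.2, 5.0, 5.0))
```

Keying each box by a single point keeps one index entry per box. The cost is that the scan must widen its candidate cells by `max_span - 1` and always include the overflow cell, with an exact intersection test applied only to rows whose bit is set.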
Specification