High-performance gridded data storage, arrangement and extraction
Abstract
A high-performance gridded database protocol for storing, arranging, and extracting gridded data includes associating values for a single grid cell and storing them together to extract as many useful values as possible from a single read operation. Gridded data is stored in a geographically-indexed cylindrical grid that permits efficient data extraction for a particular location while maximizing efficiency of read operations. Cylinders of values are built by grouping grids that are related to each other so that when data for a location is to be extracted, a minimal number of read operations is needed to retrieve an entire stack of data relevant to the location.
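The grouping described in the abstract can be illustrated with a minimal sketch. It assumes a stack is stored as one flat list of X times Y times Z values and that grid cells are addressed by integer row and column; the names `StackShape`, `locate`, and `cylinder` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class StackShape:
    x: int  # footprint length, in grid cells (assumed)
    y: int  # footprint width, in grid cells (assumed)
    z: int  # cylinder height: number of values kept per grid cell


def locate(row: int, col: int, shape: StackShape) -> Tuple[Tuple[int, int], int]:
    """Return (stack_key, offset) for the grid cell at (row, col).

    stack_key identifies the database cell holding the stack; offset is the
    position of the first value of this grid cell's cylinder inside it.
    """
    key = (row // shape.x, col // shape.y)
    local_cell = (row % shape.x) * shape.y + (col % shape.y)
    return key, local_cell * shape.z


def cylinder(stack: List[float], offset: int, shape: StackShape) -> List[float]:
    """Slice the Z values that belong to one grid cell out of a fetched stack."""
    return stack[offset:offset + shape.z]


shape = StackShape(x=4, y=4, z=3)
stack = [float(i) for i in range(shape.x * shape.y * shape.z)]  # 48 values
key, offset = locate(5, 10, shape)
values = cylinder(stack, offset, shape)
```

Under these assumptions, one fetch of the database cell identified by `key` yields every value for the location, plus the cylinders of its neighbours in the same footprint.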
12 Claims
1. A method of storing and arranging gridded data for high-performance extraction of information from a database, comprising:
ingesting meteorological data from one or more sources of weather information, and identifying grids relative to a specific geographical location to populate individual grid cells with the meteorological data in a database;
separating the grids into stacks, each stack comprising grid cells corresponding to the specific geographic location and having a coverage area of a length X and a width Y, and each grid cell comprised of a cylinder of a height Z, so that each stack includes X times Y times Z values;
indexing the populated grid cells so that a plurality of values are grouped together in stacks relative to the specific geographic location, in which a payload of the plurality of values that are stored in the cylinders includes the height representing a number of different values in multiple variables, and a spread covering several of the grid cells represented by a footprint having a width and a length covering more than a single grid cell, to permit a read of the plurality of values to extract meteorological data for the specific geographical location;
storing one stack in a database cell with the cylinder height and coverage area dynamically adjusted to optimize storage, so that fetching a single database cell returns a stack of meteorological data that is spatially and temporally related and represents meteorological characteristics stored on a common grid projection for the specific geographical location;
dynamically expiring grids by setting a timer within the database for a stack of grid cells, wherein the timer is set when the values in the stack of grid cells are ingested into the database; and
enforcing constraints to maintain a timeliness of values stored in the database, wherein the constraints include at least one of a latest limit constraint to impose a limit on a time in between receiving grids, a mb-limit constraint to impose a limit on a stack size, a global-mb-limit constraint to impose a limit on total memory available across all stacks processed, a maximum number of grids in a stack, and a maximum number of grids in all stacks processed at any time, and
wherein a selection of the height and the spread includes a consideration of physical memory, a consideration of cache advantages of having adjacent grid cells stored in the same database object, and a size of the database object being stored, relative to an extraction of the payload of stored data in the specific geographical location.
(Dependent claims: 2, 3, 4)
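The expiry timer and the per-stack constraints recited in claim 1 can be sketched as follows. This is one possible reading, not the patented implementation: the class and field names are hypothetical, times are plain seconds supplied by the caller, and the two global limits (total memory and total grid count across all stacks) are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Limits:
    latest_limit_s: float  # assumed: max seconds allowed between received grids
    mb_limit: float        # max size of one stack, in megabytes
    max_grids: int         # max number of grids in one stack


@dataclass
class Stack:
    ttl_s: float                                   # assumed lifetime once ingested
    grid_sizes_mb: List[float] = field(default_factory=list)
    last_received: Optional[float] = None
    expires_at: Optional[float] = None

    def add_grid(self, size_mb: float, now: float) -> None:
        self.grid_sizes_mb.append(size_mb)
        self.last_received = now

    def ingest(self, now: float) -> None:
        # The timer is set at the moment the stack's values enter the database.
        self.expires_at = now + self.ttl_s

    def is_expired(self, now: float) -> bool:
        return self.expires_at is not None and now >= self.expires_at


def violated(stack: Stack, limits: Limits, now: float) -> List[str]:
    """Return the names of the per-stack constraints the stack currently breaks."""
    hits = []
    if stack.last_received is not None and now - stack.last_received > limits.latest_limit_s:
        hits.append("latest-limit")
    if sum(stack.grid_sizes_mb) > limits.mb_limit:
        hits.append("mb-limit")
    if len(stack.grid_sizes_mb) > limits.max_grids:
        hits.append("max-grids")
    return hits


limits = Limits(latest_limit_s=60.0, mb_limit=4.0, max_grids=5)
stack = Stack(ttl_s=3600.0)
stack.add_grid(2.0, now=0.0)
stack.add_grid(3.0, now=10.0)
```

What the system does when a constraint is hit (for example, closing the stack and writing it out) is not stated in the claim, so the sketch only reports the violations.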
5. A method comprising:
populating individual grid cells within a database by collecting and writing meteorological data from one or more sources of weather information in a manner that groups values for a specific geographic location together in related grid cells within a stack by separating the grids into stacks, each stack comprising grid cells corresponding to the specific geographic location and having a coverage area of a length X and a width Y, and each grid cell comprised of a cylinder of a height Z, so that each stack includes X times Y times Z values;
arranging a payload of the values into cylinders so that the cylinders have both the height representing a number of different values in multiple variables, and a spread covering several of the grid cells represented by a footprint having a width and a length covering more than a single grid cell, by storing one stack in a database cell with the cylinder height and coverage area dynamically adjusted to optimize storage, so that fetching a single database cell returns a stack of meteorological data that is spatially and temporally related and represents meteorological characteristics stored on a common grid projection for the specific geographical location;
extracting gridded data by retrieving a whole stack of grid cells so that only a single read operation is needed to obtain values for the specific geographic location; and
dynamically expiring grids by setting a timer within the database for a stack of grid cells, wherein the timer is set when the values in the stack of grid cells are ingested into the database; and
enforcing constraints to maintain a timeliness of values stored in the database, wherein the constraints include at least one of a latest limit constraint to impose a limit on a time in between receiving grids, a mb-limit constraint to impose a limit on a stack size, a global-mb-limit constraint to impose a limit on total memory available across all stacks processed, a maximum number of grids in a stack, and a maximum number of grids in all stacks processed at any time, and
wherein a selection of the height and the spread includes a consideration of physical memory, a consideration of cache advantages of having adjacent grid cells stored in the same database object, and a size of the database object being stored, relative to an extraction of the payload of stored data in the specific geographical location.
(Dependent claims: 6, 7, 8)
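The single-read extraction step of claim 5 can be contrasted with a layout that stores each variable's grid separately. The sketch below uses a hypothetical in-memory key-value store that counts reads; the store, the variable names, and the key formats are illustrative assumptions rather than anything specified in the claim.

```python
class CountingStore:
    """A toy key-value store that counts how many reads it serves."""

    def __init__(self):
        self._cells = {}
        self.reads = 0

    def put(self, key, value):
        self._cells[key] = value

    def get(self, key):
        self.reads += 1
        return self._cells[key]


VARIABLES = ["temperature", "wind_speed", "humidity"]  # assumed variables

per_grid = CountingStore()  # one database cell per (variable, row, col)
stacked = CountingStore()   # one database cell per stack

stack = {}
for r in range(2):
    for c in range(2):
        values = [float(10 * r + c + i) for i in range(len(VARIABLES))]
        stack[(r, c)] = values  # the cylinder for grid cell (r, c)
        for name, v in zip(VARIABLES, values):
            per_grid.put((name, r, c), v)
stacked.put("stack-0", stack)


def extract_per_grid(store, r, c):
    # One read per variable.
    return [store.get((name, r, c)) for name in VARIABLES]


def extract_stacked(store, r, c):
    # One read returns the whole stack; the cylinder is picked out in memory.
    return store.get("stack-0")[(r, c)]


a = extract_per_grid(per_grid, 1, 0)
b = extract_stacked(stacked, 1, 0)
```

Both layouts return the same values for the location, but the per-grid layout needs one read per variable while the stacked layout needs one read in total.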
9. A high-performance gridded database protocol, comprising:
in a computing environment that includes a plurality of software and hardware components coupled to at least one processor, the at least one processor configured to carry out one or more program instructions to execute a database protocol having a plurality of operations;
a write operation in the plurality of operations, the write operation configured to populate individual grid cells within a database by collecting and writing meteorological data from one or more sources of weather information in a manner that groups values for a specific geographic location together in related grid cells within a stack by separating the grids into stacks, each stack comprising grid cells corresponding to the specific geographic location and having a coverage area of a length X and a width Y, and each grid cell comprised of a cylinder of a height Z, so that each stack includes X times Y times Z values;
a store operation in the plurality of operations, the store operation configured to arrange a payload of the values into cylinders so that the cylinders have both the height representing a number of different values in multiple variables, and a spread covering several of the grid cells represented by a footprint having a width and a length covering more than a single grid cell, by storing one stack in a database cell with the cylinder height and coverage area dynamically adjusted to optimize storage, so that fetching a single database cell returns a stack of meteorological data that is spatially and temporally related and represents meteorological characteristics stored on a common grid projection for the specific geographical location,
wherein the store operation dynamically expires grids by setting a timer within the database for a stack of grid cells, wherein the timer is set when the values in the stack of grid cells are ingested into the database,
and wherein the store operation enforces constraints to maintain a timeliness of values stored in the database, the constraints including at least one of a latest limit constraint to impose a limit on a time in between receiving grids, a mb-limit constraint to impose a limit on a stack size, a global-mb-limit constraint to impose a limit on total memory available across all stacks processed, a maximum number of grids in a stack, and a maximum number of grids in all stacks processed at any time; and
a read operation in the plurality of operations, the read operation configured to extract gridded data by retrieving a whole stack of grid cells so that only a single read operation is needed to obtain values for the specific geographic location, and
wherein a selection of the height and the spread includes a consideration of physical memory, a consideration of cache advantages of having adjacent grid cells stored in the same database object, and a size of the database object being stored, relative to an extraction of the payload of stored data in the specific geographical location.
(Dependent claims: 10, 11, 12)
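The final wherein clause ties the choice of height and spread to memory and database object size but gives no formula. As a purely illustrative heuristic, and not the method of the patent, the sketch below assumes a fixed cylinder height, a fixed number of bytes per value, a square footprint, and a hypothetical upper bound on the size of one database object; it then picks the widest footprint that still fits.

```python
import math


def choose_spread(height_z: int, bytes_per_value: int, max_object_bytes: int) -> int:
    """Return the largest side n of a square footprint such that
    n * n * height_z values fit within max_object_bytes.

    All three parameters are assumptions made for illustration.
    """
    bytes_per_cylinder = height_z * bytes_per_value
    if bytes_per_cylinder > max_object_bytes:
        raise ValueError("a single cylinder exceeds the object size budget")
    cylinders_that_fit = max_object_bytes // bytes_per_cylinder
    return math.isqrt(cylinders_that_fit)


# Example: 100 values per cylinder, 4-byte values, 64 KiB per database object.
side = choose_spread(height_z=100, bytes_per_value=4, max_object_bytes=64 * 1024)
object_bytes = side * side * 100 * 4
```

A wider footprint puts more adjacent grid cells in the same database object, which is the cache advantage the clause mentions, at the cost of a larger object per read. A real selection would also have to weigh available physical memory, which this sketch ignores.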
Specification