SYSTEM, METHOD AND DATA STRUCTURE FOR FAST LOADING, STORING AND ACCESS TO HUGE DATA SETS IN REAL TIME
First Claim
2. The computerized system of claim 1, wherein the data structure is created from set of integers using a Super Hierarchiel Bitmaps (SHB) Create function.
2 Assignments
0 Petitions
Accused Products
Abstract
A computerized system including a processor and a computer-readable non-transient memory in communication with the processor, the memory storing instructions that when executed manage a novel data structure and related group of algorithms that can be used as a method for representing a set and as a base for very efficient indexing, hash and compression. SHB is an improvement of hierarchical bitmap. An improved database system that can utilize the innovative data structure which includes a raw data stream provided to the system via a data processing module, data blocks, fields indexes tables and a keys table. There is provided an index creating process and a columns creating process, for transforming the data blocks and tables into index blocks and data columns.
-
Citations
17 Claims
-
2. The computerized system of claim 1, wherein the data structure is created from set of integers using a Super Hierarchiel Bitmaps (SHB) Create function.
-
4. A machine-readable storage device having machine readable instructions tangibly stored thereon which, when executed by the machine, cause the machine to perform a method of database management, the method comprising the steps of:
-
(a) receiving, at a data loader process, a row of values of a raw data record; (b) decomposing, at said loader process, said row of values to separate field values, said separation being dictated by a data scheme; (c) assigning a numerical value key id to each said field value of said row of values; (d) storing in a keys table one unique instance for each said key id; (e) storing sequentially each said key id in a pre-allocated computer readable data memory block, said data block including rows divided into fields according to said data scheme separation, each said row having a record id indicating a position of said row in said data block within said plurality of data blocks; (f) storing field indexes in a field indexes file, each said field index being equivalent to a list of unique instances of each said key id found in given said separate field in a given said data block; (g) converting said field index to a Super Hierarchiel Bitmap (SHB) format; (h) generating inverted indexes, including said field indexes and, for each unique instance of each said key id for each said separate field, a vector of said record ids of said rows in which said key ids equivalent to said field index are stored; and (i) allocating a new said data memory block when a current row id of a said row of said key ids exceeds a preallocated number of rows in said data block. - View Dependent Claims (1, 5, 6, 7, 8, 9, 10, 11)
-
-
11-1. The machine-readable storage device of claim 4, wherein said inverted indexes are created by:
-
(i) reading, sequentially, for each said separate field, said rows of said key ids stored in said data memory block; (ii) providing a vector for each unique instance of a said key id in a said separate field in a said data block, said vector including a list of record ids of each of said rows which hold said key id in said separate field of a said data block.
-
- 12. The machine-readable storage device of claim 11, wherein said reversible function includes adding a predefined constant to said integer raw data value.
-
12-2. A computerized system comprising:
-
a processor; and a computer-readable non-transient memory in communication with the processor, the memory storing instructions that when executed maintain a database management system that includes; (a) a data-processing module configured to receive a plurality of rows of raw data values and decompose said rows of raw data values into separate fields of raw data values, wherein said data-processing module is further configured to assign a numerical value key id to each said raw data value; (b) a plurality of Data Blocks each including a plurality of said key ids sequentially stored in rows according to said separate fields; (c) a plurality of logical record ids, each said record id being for each of said key id rows, said record id having a value equal a sequential position of said row of key ids in said plurality of Data Blocks; (d) a plurality of Data Columns, each Data Column holding a plurality of said key ids of a given said separate field of a said Data Block; (e) a plurality of Inverted Indexes Blocks, each said Inverted Indexes Block having stored a plurality of ordered lists of said plurality of record ids and each said ordered list relating to each unique key id in a said separate field; (f) a keys table, including a list of one unique instance of each said raw data value, wherein a sequential position of said raw data value is equivalent to a said assigned key id of said raw data value; and (g) a plurality of field indexes, each said field index includes a list of unique instances of each said key ids of a said separate field of a said data block.
-
-
13-3. The computerized system of claim 12, wherein each said Data Block is a pre allocated memory block containing a predefined number of said rows of key ids, and each said row of key ids has a predefined size.
Specification