System and method for efficiently indexing and storing a large database with high data insertion frequency
First Claim
1. A method of storing and maintaining a database of entries in a computer system having primary random access memory and secondary memory, the steps of the method performed by said computer system comprising:
- upon request, storing new records in a database file;
storing indexed pointers to said new records in a first tree data structure which is stored in primary memory;
storing a second tree data structure in secondary memory, said second tree data structure containing indexed pointers to all records in said database file other than records for which indexed pointers are stored in said first tree data structure;
wherein said indexed pointers in said second tree data structure are stored in sequential order in said secondary memory;
periodically merging a portion of said first tree data structure into said second tree data structure by (1A) selecting a portion of the indexed pointers in said first tree data structure, (1B) retrieving from said secondary memory a corresponding portion of said indexed pointers in said second tree data structure, (1C) merging said selected portion of the indexed pointers in said first tree data structure into said indexed pointers retrieved from said second tree data structure to produce sequentially ordered merged indexed pointers, (1D) storing the merged index pointers in said secondary memory so as to maintain the stored index pointers in sequential order, and (1E) removing from said first tree data structure those indexed pointers which were merged with indexed pointers from said second tree data structure.
4 Assignments
0 Petitions
Accused Products
Abstract
A database index file is maintained by a computer system having primary random access memory and secondary memory. A record for each item added to the database is stored in a sequential file in secondary memory (disk storage) and an indexed pointer to the new record is stored in a small B-tree stored in primary random access memory. The full index file for the database is a second, large B-tree stored in secondary memory. Leaf-nodes of the full index file are stored in indexed order. Periodically, a portion of the memory resident small B-tree is merged with a corresponding portion of the large B-tree by selecting a range of index values and retrieving from secondary memory all indexed pointers in the selected range of index values. The indexed pointers in the first B-tree in the selected range of index values are merged into the retrieved records, the resulting merged set of indexed pointers are stored in secondary memory in indexed order in a contiguous area of secondary memory. As a result, the indexed pointers for newly added database records are written to secondary memory in batches, thereby accessing secondary memory very efficiently.
-
Citations
22 Claims
-
1. A method of storing and maintaining a database of entries in a computer system having primary random access memory and secondary memory, the steps of the method performed by said computer system comprising:
-
upon request, storing new records in a database file; storing indexed pointers to said new records in a first tree data structure which is stored in primary memory; storing a second tree data structure in secondary memory, said second tree data structure containing indexed pointers to all records in said database file other than records for which indexed pointers are stored in said first tree data structure;
wherein said indexed pointers in said second tree data structure are stored in sequential order in said secondary memory;periodically merging a portion of said first tree data structure into said second tree data structure by (1A) selecting a portion of the indexed pointers in said first tree data structure, (1B) retrieving from said secondary memory a corresponding portion of said indexed pointers in said second tree data structure, (1C) merging said selected portion of the indexed pointers in said first tree data structure into said indexed pointers retrieved from said second tree data structure to produce sequentially ordered merged indexed pointers, (1D) storing the merged index pointers in said secondary memory so as to maintain the stored index pointers in sequential order, and (1E) removing from said first tree data structure those indexed pointers which were merged with indexed pointers from said second tree data structure. - View Dependent Claims (2, 3, 4)
-
-
5. A method of storing and maintaining a database of entries in a computer system having primary random access memory and secondary memory,the steps of the method performed by said computer system comprising:
-
upon request, storing new records in a database file; storing indexed pointers to said new records in a first tree data structure which is stored in primary memory; storing a second tree data structure in secondary memory, said second tree data structure containing indexed pointers to all records in said database file other than records for which indexed pointers are stored in said first tree data structure;
wherein said indexed pointers in said second tree data structure are stored in indexed order in said secondary memory;periodically merging a portion of said first tree data structure into said second tree data structure by (1A) selecting a range of index values, (1B) retrieving from said secondary memory all indexed pointers in said second tree data structure in said selected range of index values, (1C) merging into said retrieved indexed pointers those indexed pointers in said first tree data structure matching said selected range of index values to produce merged indexed pointers in indexed order, (1D) storing said merged index pointers in said secondary memory so as to maintain the stored index pointers in indexed order, and (1E) removing from said first tree data structure those indexed pointers which were merged with indexed pointers from said second tree data structure. - View Dependent Claims (6, 7, 8)
-
-
9. A method of storing and maintaining a database of entries in a computer system having primary random access memory and secondary memory, the steps of the method performed by said computer system comprising:
-
storing a set of records in a database file in said secondary memory, each record in said file including a set of fields, wherein a specified set of said fields is designated to be said record'"'"'s index value; storing in primary random access memory a first tree data structure representing indexed pointers to a first subset of said records in said file;
said first tree data structure including a multiplicity of leaf-nodes, each containing a multiplicity of indexed pointers to records in said file;storing in secondary memory a second tree data structure representing indexed pointers a second subset of said records, said second subset of said records comprising those records in said file not included in said first subset of said records;
said second tree data structure containing a multiplicity of leaf-nodes, said leaf-nodes in said second tree data structure containing indexed pointers with non-overlapping sets of index values, wherein said leaf-nodes are stored in said secondary memory in indexed order;responding to requests to store new records by storing each new record in said file and storing an indexed pointer to said new record in said first tree data structure in primary random access memory in accordance with said record'"'"'s index value; and periodically merging a portion of said first tree data structure into said second tree data structure by (1A) selecting a range of index values, (1B) retrieving from said second tree data structure in said secondary memory all indexed pointers in said selected range of index values, (1C) merging into said retrieved indexed pointers those indexed pointers in said first tree data structure matching said selected range of index values to produce merged indexed pointers in indexed order, (1D) storing said merged index pointers in said secondary memory in leaf-nodes in indexed order so as to maintain the index pointers stored in said secondary memory in indexed order, and (1E) removing from said first tree data structure those indexed pointers which were merged with indexed pointers from said second tree data structure. - View Dependent Claims (10, 11, 12)
-
-
13. A database management system, having primary random access memory, secondary memory, and a central processing unit coupled to said primary memory and second secondary memory;
- the database management system comprising;
a database file, containing numerous records, which is at least partially stored in secondary memory; an index for said records in said database file, said index comprising; a first tree data structure which is stored in primary memory, said first tree data structure containing indexed pointers to a subset of the records in said database file; and a second tree data structure which is stored in secondary memory, said second tree data structure containing indexed pointers to all records in said database file other than records for which indexed pointers are stored in said first tree data structure;
wherein said indexed pointers in said second tree data structure are stored in sequential order in secondary memory; anda database management program which is executed by said CPU;
said database management program including;new record storing means for storing new records in said database file and storing indexed pointers to said new records in said first tree data structure; and tree merge means, for periodically merging a portion of said first tree data structure into said second tree data structure by (1A) selecting a range of index values, (1B) retrieving from said secondary memory a corresponding portion of said indexed pointers in said second tree data structure, (1C) merging said selected portion of the indexed pointers in said first tree data structure into said indexed pointers retrieved from said second tree data structure to produce sequentially ordered merged indexed pointers, (1D) storing said merged index pointers in said secondary memory so as to maintain the stored index pointers in sequential order, and (1E) removing from said first tree data structure those indexed pointers which were merged with indexed pointers from said second tree data structure. - View Dependent Claims (14, 15, 16)
- the database management system comprising;
-
17. A database management system, having primary random access memory, secondary memory, and a central processing unit coupled to said primary memory and second secondary memory;
- the database management system comprising;
a database file, containing numerous records, which is at least partially stored in secondary memory; an index for said records in said database file, said index comprising; a first tree data structure which is stored in primary memory, said first tree data structure containing indexed pointers to a subset of the records in said database file; and a second tree data structure which is stored in secondary memory, said second tree data structure containing indexed pointers to all records in said database file other than records for which indexed pointers are stored in said first tree data structure;
wherein said indexed pointers in said second tree data structure are stored in indexed order in secondary memory; anda database management program which is executed by said CPU;
said database management program including;new record storing means for storing new records in said database file and storing indexed pointers to said new records in said first tree data structure; and tree merge means, for periodically merging a portion of said first tree data structure into said second tree data structure by (1A) selecting a range of index values, (1B) retrieving from secondary memory all indexed pointers in said second tree data structure in said selected range of index values, (1C) merging into said retrieved indexed pointers those indexed pointers in said first tree data structure matching said selected range of index values to produce merged indexed pointers in indexed order, (1D) storing said merged index pointers in said secondary memory in indexed order so as to maintain the index pointers stored in said secondary memory in indexed order, and (1E) removing from said first tree data structure those indexed pointers which were merged with indexed pointers from said second tree data structure. - View Dependent Claims (18, 19, 20)
- the database management system comprising;
-
21. A method of storing and maintaining a database of entries in a computer system having primary random access memory and second memory, the steps of the method performed by said computer system comprising:
-
upon request, storing new records in a database file; storing indexed pointers to said new records in a first data structure which is stored in primary memory; storing a second data structure in secondary memory, said second data structure containing indexed pointers to all records in said database file other than records for which indexed pointers are stored in said first data structure;
wherein said indexed pointers in said second data structure are stored in sequential order in said secondary memory;periodically merging a portion of said first data structure into said second data structure by (1A) selecting a portion of the indexed pointers in said first data structure, (1B) retrieving from said secondary memory a corresponding portion of said indexed pointers in said second data structure, (1C) merging said selected portion of the indexed pointers in said first data structure into said indexed pointers retrieved from said second data structure to produce sequentially ordered merged indexed pointers. (1D) storing the merged index pointers in said secondary memory so as to maintain the stored index pointers in sequential order, and (1E) removing from said first data structure those indexed pointers which were merged with indexed pointers from said second data structure.
-
-
22. A database management system, having primary random access memory, secondary memory, and a central processing unit coupled to said primary memory and second secondary memory;
- the database management system comprising;
a database file, containing numerous records, which is at least partially stored in secondary memory; an index for said records in said database file, said index comprising; a first data structure which is stored in primary memory, said first data structure containing indexed pointers to a subset of the records in said database file; and a second data structure which is stored in secondary memory, said second data structure containing indexed pointers to all records in said database file other than records for which indexed pointers are stored in said first data structure;
wherein said indexed pointers in said second data structure are stored in sequential order in secondary memory; anda database management program which is executed by said CPU;
said database management program including;new record storing means for storing new records in said database file and storing indexed pointers to said new records in said first data structure; and merge means, for periodically merging a portion of said first data structure into said second data structure by (1A) selecting a range of index values, (1B) retrieving from said secondary memory a corresponding portion of said indexed pointers in said second data structure, (1C) merging said selected portion of the indexed pointers in said first data structure into said indexed pointers retrieved from said second data structure to produce sequentially ordered merged indexed pointers, (1D) storing said merged index pointers in said secondary memory so as to maintain the stored index pointers in sequential order, and (1E) removing from said first data structure those indexed pointers which were merged with indexed pointers from said second data structure.
- the database management system comprising;
Specification