Prioritized merging for full-text index on relational store
First Claim
1. A method of full-text searching of a database, the method comprising:
providing data to be indexed for the full-text searching;
creating inverted lists in memory from a keyword memory by database management threads, the keyword memory accessed by a word breaker external to a database management system and by a database management system thread, wherein each inverted list comprises at least one of a plurality of keywords in the data, an identifier associated with the data and at least one occurrence of the at least one keyword in the data;
generating instances of an index based on the data, the index comprising part of a database indexing system of the database management system, the instances of the index generated from the inverted lists;
storing the instances of the index in a priority queue, wherein each instance is assigned a priority based on a number of keywords of the instance and a size of the instance;
scheduling a merge to be run on the instances of the index based on query load and merge load;
selecting instances to be merged based on the assigned priorities of the instances; and
merging the selected instances based on a selected type of merge to generate an instance of the index.
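The first steps of the claim (breaking data into keywords, building in-memory inverted lists, and generating an index instance from them) can be illustrated with a minimal sketch. The claim does not give an implementation, so everything here is assumed for illustration: `break_words` is a hypothetical stand-in for the external word breaker, and the row layout `(keyword, doc_id, positions)` is one possible reading of "keyword, identifier, occurrences".

```python
import re
from collections import defaultdict

def break_words(text):
    """Hypothetical word breaker: lowercase alphanumeric tokens (an assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_lists(documents):
    """Build keyword -> {doc_id: [positions]} for a batch of documents."""
    inverted = defaultdict(dict)
    for doc_id, text in documents.items():
        for position, keyword in enumerate(break_words(text)):
            inverted[keyword].setdefault(doc_id, []).append(position)
    return dict(inverted)

def make_index_instance(inverted):
    """Flatten the inverted lists into sorted rows: one index instance."""
    return sorted(
        (keyword, doc_id, tuple(positions))
        for keyword, postings in inverted.items()
        for doc_id, positions in postings.items()
    )

docs = {1: "the quick brown fox", 2: "the lazy dog and the fox"}
instance = make_index_instance(build_inverted_lists(docs))
```

Each row pairs a keyword with the identifier of the data it occurs in and the word positions of its occurrences; sorting the rows keeps later merges of several instances simple.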
Abstract
In a full-text search system and method, an index is generated by creating instances of a database index from in-memory inverted lists, each of which associates a keyword with a text identifier and the occurrences of the keyword in the text. The instances of the index are placed in a priority queue. A merge scheduling process determines when a merge should be initiated, selects the instances of the index to be merged, and selects the type of merge to perform.
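The priority queue and merge selection described above might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the priority function (smaller instances first, by keyword count and then row count) is hypothetical, as are the names `push_instance`, `select_for_merge`, and `merge_instances`. Rows use an assumed `(keyword, doc_id, positions)` layout and each instance is assumed to be sorted.

```python
import heapq
import itertools

_counter = itertools.count()  # tie-breaker so the heap never compares row lists

def priority_of(rows):
    """Hypothetical priority: fewer keywords and fewer rows sort first."""
    keywords = len({row[0] for row in rows})
    return (keywords, len(rows))

def push_instance(queue, rows):
    """Store an index instance in the priority queue."""
    heapq.heappush(queue, (priority_of(rows), next(_counter), rows))

def select_for_merge(queue, how_many=2):
    """Pop up to `how_many` instances in priority order (smallest first)."""
    return [heapq.heappop(queue)[2] for _ in range(min(how_many, len(queue)))]

def merge_instances(instances):
    """Combine the sorted rows of several instances into one sorted instance."""
    return list(heapq.merge(*instances))

queue = []
push_instance(queue, [("fox", 1, (3,)), ("the", 1, (0,))])
push_instance(queue, [("dog", 2, (2,))])
push_instance(queue, [("and", 3, (0,)), ("cat", 3, (1,)), ("sat", 3, (2,))])

picked = select_for_merge(queue)   # the two smallest instances
merged = merge_instances(picked)   # one larger instance
push_instance(queue, merged)       # the result re-enters the queue
```

Merging the smallest instances first is only one plausible policy; the source states that priority depends on keyword (or row) count and size, but not in which direction.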
13 Claims
11. A system for generating a full text search index integrated within a database management system comprising:
a processing thread pool comprising at least one of a plurality of processing threads of the database management system, the processing threads creating instances of the full text search index from inverted lists;
a memory for storing the inverted lists, the memory accessed by a word breaker external to the database management system and the database management system processing threads;
each inverted list comprising at least one of a plurality of keywords contained in data, an identifier associated with the data and at least one occurrence of the at least one keyword in the data;
a priority queue for storing and prioritizing the instances of the full text search index, wherein each instance is assigned a priority based on a number of rows of the instance and a size of the instance;
a merging thread pool comprising at least one of a plurality of merge threads, the at least one merge thread determining when a merge is scheduled, selecting, based on the assigned priorities of the instances, a plurality of instances from the priority queue, determining a type of merge to perform on the selected instances, and merging the selected instances based on the determined type of merge to generate an instance of the index.
13. A computer-readable medium for full-text searching of a database comprising computer-executable instructions for:
providing data to be indexed for the full-text searching;
creating inverted lists in memory from a keyword memory, the keyword memory accessed by a word breaker external to a database management system and by a database management system thread, each inverted list comprising at least one of a plurality of keywords contained in the data, an identifier associated with the data and at least one occurrence of the at least one keyword in the data;
creating instances of an index based on the inverted lists, the index comprising part of a database indexing system of the database management system;
placing the instances of the index in a priority queue for processing, wherein each instance is assigned a priority based on a number of rows of the instance and a size of the instance;
selecting for merge a plurality of instances of the index from the priority queue based on the assigned priorities of the instances;
scheduling a merge to be run on the selected instances based on consideration of query load and merge load; and
merging the selected instances based on a selected type of merge to generate an instance of the index.
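The scheduling step, which weighs query load against merge load before a merge is run, could look like the sketch below. The claims do not specify thresholds, load metrics, or the available merge types, so `LoadSnapshot`, the numeric limits, and the "full" and "partial" labels are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class LoadSnapshot:
    active_queries: int     # queries currently running (query load)
    running_merges: int     # merges currently in progress (merge load)
    queued_instances: int   # index instances waiting in the priority queue

def should_merge(load, max_queries=8, max_merges=2, min_instances=2):
    """Schedule a merge only if there is work to do and both loads are low."""
    if load.queued_instances < min_instances:
        return False
    if load.running_merges >= max_merges:
        return False
    return load.active_queries < max_queries

def choose_merge_type(load, full_merge_threshold=16):
    """Hypothetical merge types: 'full' merges every instance, 'partial' a few."""
    if load.queued_instances >= full_merge_threshold:
        return "full"
    return "partial"

quiet = LoadSnapshot(active_queries=1, running_merges=0, queued_instances=3)
busy = LoadSnapshot(active_queries=20, running_merges=0, queued_instances=3)
```

Under these assumed limits, `should_merge(quiet)` allows a merge while `should_merge(busy)` defers it so that merging does not compete with heavy query traffic.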
Specification