Method of using signature subsets for indexing a textual database
First Claim
1. A method of operating a computer system to store and retrieve information in a database, said computer system having data processing means, memory means and data storage means capable of containing data records, said method comprising the steps of:
(a) storing the database on the data storage means;
(b) during loading, said system automatically creating for the database a signature file which is divided into subsets, mapping a word signature to a particular subset during creation of the file and storing said signature file subsets on said data storage means;
(c) during retrieving, after the signature file is created, entering at least one query word into the system;
(d) said system automatically creating a signature for each query word entered into the system;
(e) scanning for a word signature and retrieving the corresponding data from said database in response to a query word by using the same mapping information that was used to store the word signature in a particular subset, said system matching the signature of a query word with at least one signature in one subset of said signature file if such an appropriate word signature exists in this subset.
Abstract
A method of operating a computer system to store and retrieve information in a database uses a signature file of the database that is divided into subsets. A word signature is mapped to a particular subset during creation of the file and the same mapping information is used to retrieve the information in response to a query word. Each word signature is a logical word signature and has two components: a physical word signature and a subset designation field. In this way, when information is retrieved from the database, only that subset containing the relevant word signature is scanned. The signature file is automatically created by the system as the database is stored on the data storage modules. During retrieval, the control reviews information received from the data storage means and if a match occurs between a physical word signature for a query word and a particular physical word signature arriving from the input section, the control sends the physical word signature to the FIFO buffer in memory together with the document identifier located subsequent to the matched physical word signature. The control then moves on to process the next physical word signature received from the data storage means. If there is no match, the control ignores the physical word signature and moves on to process the next physical word signature received from the data storage means. The control is effectively capable of processing several query words in parallel.
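The subset mapping described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the hash function (SHA-256), the field widths `SUBSET_BITS` and `PHYSICAL_BITS`, and all function names are assumptions chosen only for illustration. A logical word signature is split into a subset designation field and a physical word signature; the same split is applied when the file is built and when a query word is looked up, so only one subset is scanned.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple

SUBSET_BITS = 4      # assumed width of the subset designation field
PHYSICAL_BITS = 12   # assumed width of the physical word signature


def logical_signature(word: str) -> int:
    """Hash a word to a logical signature (SHA-256 is an assumption)."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % (1 << (SUBSET_BITS + PHYSICAL_BITS))


def split_signature(logical: int) -> Tuple[int, int]:
    """Split a logical signature into (subset designation, physical signature)."""
    subset = logical >> PHYSICAL_BITS
    physical = logical & ((1 << PHYSICAL_BITS) - 1)
    return subset, physical


def build_signature_file(documents: Dict[int, str]) -> Dict[int, List[Tuple[int, int]]]:
    """Loading: store each (physical signature, document id) in its subset."""
    subsets: Dict[int, List[Tuple[int, int]]] = defaultdict(list)
    for doc_id, text in documents.items():
        for word in set(text.split()):
            subset, physical = split_signature(logical_signature(word))
            subsets[subset].append((physical, doc_id))
    return subsets


def query(subsets: Dict[int, List[Tuple[int, int]]], word: str) -> List[int]:
    """Retrieval: apply the same mapping, then scan only the selected subset."""
    subset, physical = split_signature(logical_signature(word))
    hits = {doc_id for sig, doc_id in subsets.get(subset, []) if sig == physical}
    return sorted(hits)
```

Because distinct words can hash to the same signature, the result of `query` is a candidate list that may contain false matches; a real system would verify candidates against the stored documents.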
196 Citations
14 Claims
14. A method of operating a computer system for storing and retrieving information in a database, said computer system having a processor, a memory with address lines and a FIFO buffer, said processor having an input section, an output section and a control, said method comprising the steps of:
(i) storing the database in the data storage means;
(ii) during loading of the database, said system automatically creating for the database a signature file which is divided into subsets, mapping a word signature to a particular subset during creation of the file and storing said signature file subsets on said data storage means;
(iii) during retrieving, after the signature file is created, entering at least one query word into the system;
(iv) said system automatically creating a signature for each query word entered into the system;
(v) in response to a query word, said input section receiving information from said data storage means, said control reviewing said information and sending all word signatures to said address lines of said memory, said memory providing information to the control to enable it to determine whether a particular word signature in the input section matches a signature for said query word;
if a match occurs, (a) the control sending the word signature to said FIFO buffer, the control remembering that a match has occurred and sending the document identifier located subsequent to the matched word signature to the FIFO buffer, the control then moving on to process the next word signature received from the data storage means and so on;
if there is no match, (b) the control ignoring the word signature and moving on to process the next signature word received from the data storage means and so on;
the control thereby effectively being capable of processing several query words in parallel.
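The matching loop of step (v) can be modelled in software as a rough sketch. The record layout is an assumption: the stream from storage is taken to be tagged items, with one or more word signatures followed by the identifier of the document they belong to. The names `load_lookup_memory` and `scan` and the tags `SIG` and `DOC` are hypothetical. Each incoming word signature is used as an address into a lookup memory that holds one bit per query word, so a single read tests the signature against every query word at once.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple

SIG, DOC = "sig", "doc"  # hypothetical tags for items in the storage stream


def load_lookup_memory(query_signatures: List[int], address_bits: int) -> List[int]:
    """Set bit i at address s when query word i has signature s."""
    memory = [0] * (1 << address_bits)
    for i, sig in enumerate(query_signatures):
        memory[sig] |= 1 << i
    return memory


def scan(stream: Iterable[Tuple[str, int]], memory: List[int]) -> Deque[Tuple[str, int]]:
    """Pass matched signatures and their document identifiers to a FIFO."""
    fifo: Deque[Tuple[str, int]] = deque()
    match_pending = False  # the control "remembering that a match has occurred"
    for kind, value in stream:
        if kind == SIG:
            if memory[value]:          # one read covers all query words
                fifo.append((SIG, value))
                match_pending = True
            # no match: the signature is ignored
        elif match_pending:            # document identifier after a match
            fifo.append((DOC, value))
            match_pending = False
    return fifo
```

For example, with query signatures 5 and 9 loaded into a 4-bit lookup memory, a stream in which only signature 5 appears before document identifier 100 places signature 5 and then identifier 100 in the FIFO, and every non-matching signature is skipped.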
Specification