Method for searching a database system including parallel processors
Abstract
A method to operate on a single instruction multiple data (SIMD) computer for searching for relevant documents in a database which makes it possible to perform thousands of operations in parallel. The words of each document are stored by surrogate coding in tables in one or more of the processors of the SIMD computer. To determine which documents of the database contain a word that is the subject of a query, a query is broadcast from a central computer to all the processors and the query operations are simultaneously performed on the documents stored in each processor. The results of the query are then returned to the central computer. After all the search words have been broadcast to the processors and point values accumulated as appropriate, the point values associated with each document are reported to the central computer. The documents with the largest point values are then ascertained and their identification is provided to the user.
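The broadcast-and-accumulate flow described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Processor` class and `search` function are hypothetical names, a plain `frozenset` stands in for the surrogate-coded table, and the inner loop runs sequentially where a SIMD machine would execute it as one parallel step.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    """One simulated processing element holding a single document (hypothetical)."""
    doc_id: str
    words: frozenset   # stand-in for the surrogate-coded table of the document
    total: int = 0     # running point total for this document

    def handle(self, word, points):
        # Every processor runs the same test on its own local data.
        if word in self.words:
            self.total += points


def search(processors, query, top_n=3):
    """Broadcast each (word, points) pair, then report the highest totals."""
    for p in processors:
        p.total = 0
    for word, points in query:      # central computer broadcasts one word at a time
        for p in processors:        # sequential here; one parallel step on SIMD hardware
            p.handle(word, points)
    # Central computer collects the totals and keeps the largest ones.
    ranked = sorted(processors, key=lambda p: p.total, reverse=True)
    return [(p.doc_id, p.total) for p in ranked[:top_n]]


if __name__ == "__main__":
    procs = [
        Processor("d1", frozenset({"parallel", "search"})),
        Processor("d2", frozenset({"search"})),
        Processor("d3", frozenset({"hash"})),
    ]
    print(search(procs, [("parallel", 5), ("search", 2)], top_n=2))
```

The sketch keeps the point total inside each processor, so only one number per document has to travel back to the central computer after all search words have been broadcast.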
15 Claims
1. A process for searching for relevant documents in a database comprising the steps of:
(a) forming a database by storing for each of a plurality of documents at least one table of hash codes representing words in the document, the table(s) that represent the words in each different document being stored in a different digital data processor, each hash code comprising information at a plurality of bit locations;
(b) forming a query having at least one word and a point value of relevance assigned to each word;
(c) testing if the word in the query is in the database by;
(1) determining the bit locations in the table at which the hash code corresponding to the queried word is stored; and
(2) simultaneously testing in each of the processors the bit locations corresponding to the queried word;
(d) adding at each digital data processor the point value associated with the queried word to a total point value for the document if the hash code is found at all the bit locations corresponding to the queried word that are tested in that processor; and
(e) providing identification of those documents in the database with high total point values.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
10. A process for searching in a database comprising the steps of:
(a) forming a database by storing in each of a plurality of digital data processors at least one table of hash codes, the hash codes in each table representing a group of related words;
(b) forming a query having at least one word and a point value of relevance assigned to each word;
(c) testing if the word in the query is in the database by;
(1) determining the bit locations in the table at which the hash code corresponding to the queried word is stored; and
(2) simultaneously testing in each of the processors the bit locations corresponding to the queried word;
(d) at each digital data processor, adding the point value associated with the queried word to a total point value for that group of related words if the hash code is found at all the bit locations tested in the table; and
(e) providing identification of those groups of related words in the database with high total point values.
Dependent claims: 11, 12, 13, 14.
15. A process for searching in a database comprising the steps of:
(a) forming a database by storing in each of a plurality of digital data processors at least one table of hash codes, the hash codes in each table representing a group of related words;
(b) forming a query having at least one word;
(c) testing for the presence of the queried word in the database by;
(1) determining the bit locations in the table at which the hash code corresponding to the queried word is stored; and
(2) simultaneously testing in each of the processors the bit locations corresponding to the queried word; and
(d) scoring each group of related words, a score for a group of related words being increased if the hash code is found at all the bit locations tested in the table.
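The claimed steps of building a table of hash codes, determining bit locations for a queried word, and adding points only when every tested bit is set can be sketched as below. This is an illustration under stated assumptions, not the method of the specification: the table size `TABLE_BITS`, the number of bit locations `NUM_HASHES`, the use of SHA-256 to derive locations, and all function names are hypothetical choices made for the example. The structure resembles a Bloom-filter style bit table, so a word that was never stored can occasionally test as present.

```python
import hashlib

TABLE_BITS = 512   # assumed table size in bits (not from the source)
NUM_HASHES = 4     # assumed number of bit locations per word (not from the source)


def bit_locations(word):
    """Step (c)(1): derive the bit locations for a word's hash code."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return [
        int.from_bytes(digest[4 * i:4 * i + 4], "big") % TABLE_BITS
        for i in range(NUM_HASHES)
    ]


def build_table(words):
    """Step (a): set the bits for every word; a Python int serves as the bit table."""
    table = 0
    for word in words:
        for loc in bit_locations(word):
            table |= 1 << loc
    return table


def contains(table, locations):
    """Step (c)(2): the word counts as present only if all tested bits are set."""
    return all((table >> loc) & 1 for loc in locations)


def rank(documents, query, top_n=2):
    """Steps (d) and (e): accumulate points per table, return the highest totals."""
    tables = {doc_id: build_table(text.split()) for doc_id, text in documents.items()}
    totals = dict.fromkeys(tables, 0)
    for word, points in query:
        locs = bit_locations(word)          # computed once, tested in every table
        for doc_id, table in tables.items():
            if contains(table, locs):
                totals[doc_id] += points
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


if __name__ == "__main__":
    docs = {
        "d1": "parallel search on a simd computer",
        "d2": "sequential search of a card index",
    }
    print(rank(docs, [("parallel", 5), ("search", 2)]))
```

Each table here holds the words of one document, as in claim 1. The same functions would apply to any group of related words, as in claims 10 and 15, and giving every query word a point value of 1 yields the unweighted scoring of claim 15.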
Specification