Distributed computer database system and method employing hypertext linkage analysis
First Claim
1. A method for information retrieval in a distributed computer database system having a plurality of home nodes and a plurality of index nodes connected by a network, said method comprising the steps of:
- A) selecting a first one of said plurality of home nodes;
B) parsing, by said selected home node, a query conforming to the said query language, from a user, to obtain a plurality of elementary queries;
C) each of said elementary queries comprising one of an index query or a link query;
D) extracting, by said selected home node, a plurality of features from each elementary query of the said plurality of elementary queries;
E) fragmenting each of said extracted elementary query features into elementary query feature fragments;
F) hashing, by said selected home node, each said elementary query feature fragment of said plurality of elementary query feature fragments, said hashed elementary query feature fragment having a first portion and a second portion;
G) transmitting, by said selected home node, each said hashed elementary query feature fragment of said plurality of elementary query feature fragments to a respective one of said plurality of index nodes indicated by said first portion of each said hashed elementary query feature fragment;
H) using by said index node, said second portion of said respective hashed elementary query feature fragment to access data according to a local hash table located on said index node;
I) returning, by each said index node accessing data according to said respective hashed index query feature fragment a plurality of object identifiers corresponding to said accessed data to said selected home node; and
J) returning, by each said index node accessing data according to said respective hashed link query feature a plurality of pairs of object identifiers corresponding to said accessed data to said selected home node.
1 Assignment
0 Petitions
Accused Products
Abstract
A distributed computer database system includes one or more front end computers, one or more home nodes, one or more index nodes and one or more object nodes interconnected by a network into a search engine for retrieval of hypertext documents. A query from a user is transmitted to one of the front end computers, which forwards the query to one of the home nodes, of the search engine. The home node parses the query into one or more elementary queries and schedules the elementary queries for processing. Each elementary query can be one of a number of types, including an index query, a link query or an object query. To process an index query or link query, the home node extracts features from the index query or link query, fragments the extracted features into feature fragments, and hashes these features. Each hashed feature fragment is transmitted to one index node on the network. Each index node on the network that receives a hashed feature fragment uses the hashed feature fragment of the index query or link query to perform a search on its respective partition of the database. The results of the searches of the local databases are gathered by the home node. To process an object query, the home node transmits the object identifier contained in the object query to the object node on the network containing the information associated with the object. The object node that receives the object query uses the object identifier to perform a search on its respective partition of the database. The results of the search of the local database are transmitted to the home node. The home node processes the results for each elementary query according to the specifications in the query. When all processing is completed by the home node, the results are returned to the front end node, which formats the results for presentation to the user.
151 Citations
30 Claims
-
1. A method for information retrieval in a distributed computer database system having a plurality of home nodes and a plurality of index nodes connected by a network, said method comprising the steps of:
-
A) selecting a first one of said plurality of home nodes;
B) parsing, by said selected home node, a query conforming to the said query language, from a user, to obtain a plurality of elementary queries;
C) each of said elementary queries comprising one of an index query or a link query;
D) extracting, by said selected home node, a plurality of features from each elementary query of the said plurality of elementary queries;
E) fragmenting each of said extracted elementary query features into elementary query feature fragments;
F) hashing, by said selected home node, each said elementary query feature fragment of said plurality of elementary query feature fragments, said hashed elementary query feature fragment having a first portion and a second portion;
G) transmitting, by said selected home node, each said hashed elementary query feature fragment of said plurality of elementary query feature fragments to a respective one of said plurality of index nodes indicated by said first portion of each said hashed elementary query feature fragment;
H) using by said index node, said second portion of said respective hashed elementary query feature fragment to access data according to a local hash table located on said index node;
I) returning, by each said index node accessing data according to said respective hashed index query feature fragment a plurality of object identifiers corresponding to said accessed data to said selected home node; and
J) returning, by each said index node accessing data according to said respective hashed link query feature a plurality of pairs of object identifiers corresponding to said accessed data to said selected home node. - View Dependent Claims (2, 3, 4)
A) determining, by said home node, a measure of similarity between said accessed data and each said elementary query; and
B) using for subsequent processing, by said home node, accessed data having a degree of similarity determined by the said elementary query, subsequent to the step of returning said plurality of object identifiers or said plurality of pairs of object identifiers, according to said respective hashed index feature fragment or hashed link feature fragment.
-
-
4. The method of claim 3 wherein said measure of similarity is determined by a similarity function based on:
- features possessed by both the said accessed data and the said elementary query; and
features possessed only by the said elementary query.
- features possessed by both the said accessed data and the said elementary query; and
-
5. A method of storing objects or locations of objects in a manner which is conducive to information retrieval using a query language in a distributed computer database system having a plurality of home nodes and a plurality of index nodes connected by a network, said method comprising the steps of:
-
A) selecting a first one of said plurality of home nodes;
B) extracting, by said selected home node, a plurality of features from an object submitted by a user;
each of said plurality of features is either an index feature or a link feature;
C) fragmenting each said extracted feature into a plurality of object feature fragments;
D) hashing, by said selected home node, each said object feature fragment of said plurality of object feature fragments, said hashed object feature fragment having a first portion and a second portion;
E) transmitting, by said selected home node, each said hashed object feature fragment of said plurality of feature fragments to a respective one of said plurality of index nodes indicated by said first portion of each said hashed object feature fragment; and
F) using, by said index node, said second portion of said respective hashed object feature fragment to store data according to a local hash table located on said index node, said data including the object identifier of the object containing the feature and, in addition, if the feature is a link feature, the object identifier of the object referenced by the link contained in the link feature. - View Dependent Claims (6)
-
-
7. A distributed computer database system having an information retrieval tool for handling queries from a user comprising:
-
A) a plurality of home nodes and a plurality of index nodes, said plurality of home nodes and said plurality of index nodes connected by a network;
B) wherein each said home node, upon receiving a query from a user, parses the said query to obtain a plurality of elementary queries, each of which is either an index query or a link query, extracts a plurality of features from each said elementary query, fragments each said feature into a plurality of elementary query feature fragments;
hashes each said elementary query feature fragment of said plurality of query feature fragments into a hashed elementary query feature fragment having a first portion and a second portion, and transmits each said hashed query feature fragment to a respective one of said plurality of index nodes indicated by said first portion of said hashed elementary query feature fragment, andC) further wherein each said index node uses said second portion of said hashed query feature fragment to access data according to a local hash table located on said index node and returns a plurality of object identifiers or a plurality of pairs of object identifiers, corresponding to said accessed data to said home node. - View Dependent Claims (8, 9)
-
-
10. A distributed computer database system for storage and retrieval of information objects or locations of information objects, comprising
A) a plurality of home nodes and a plurality of index nodes, said plurality of home nodes and said plurality of index nodes connected by a network; -
B) wherein each said home node, upon receiving an object from a user, extracts a plurality of features from said object, each feature of said plurality of object features being either an index feature or a link feature, fragments each said object feature into a plurality of object feature fragments;
hashes each said object feature fragment of said plurality of object feature fragments into a hashed object feature fragment having a first portion and a second portion, and transmits each said hashed object feature fragment to a respective one of said plurality of index nodes indicated by said first portion of said hashed object feature fragment, andC) further wherein each said index node uses said second portion of said hashed object feature fragment to store object identifiers according to a local hash table located on said index node.
-
-
11. A distributed computer database system having an information retrieval tool for handling queries from a user, comprising:
-
A) a plurality of home nodes, and a plurality of index nodes, said plurality of home nodes and said plurality of index nodes connected by a network;
B) each said home node, upon receiving a command from a user, enqueuing a predetermined task in response to said command, C) a query task enqueued being resultant in, in response to a query command from said user, parsing the said query into a plurality of elementary queries, each of said elementary queries being either an index query or a link query, extracting a plurality of features from each said elementary query of said plurality of elementary queries parsed from the said query contained in said query command, fragmenting each said elementary query feature into a plurality of elementary query feature fragments;
hashing each said elementary query feature fragment of said plurality of elementary query feature fragments into a hashed elementary query feature fragment having a first portion and a second portion, and transmitting an elementary query message containing each said hashed elementary query feature fragment to a respective one of said plurality of index nodes indicated by said first portion of said hashed elementary query feature fragment, andD) said index node, upon receipt of said elementary query message, using said second portion of said hashed elementary query feature to access data according to a local hash table located on said index node and transmitting a message returning a plurality of object identifiers, if the said elementary query feature fragment is an index feature fragment, or returning a plurality of pairs of object identifiers, if the said elementary query feature fragment is a link feature fragment, corresponding to said accessed data to said home node.
-
-
12. A distributed computer database system for storage and retrieval of information, comprising:
-
A) a plurality of home nodes and a plurality of index nodes, said plurality of home nodes and said plurality of index nodes connected by a network;
B) each said home node, upon receiving a command from a user, enqueuing a predetermined task in response to said command, C) an insert task enqueued, in response to an insert command from said user, extracting a plurality of features from an object contained in said insert command, each said feature of said plurality of features being either an index feature or a link feature, fragmenting each said object feature into a plurality of object feature fragments;
hashing each said object feature fragment of said plurality of object feature fragments into a hashed object feature fragment having a first portion and a second portion, and transmitting an insert message containing each said hashed object feature fragment to a respective one of said plurality of index nodes indicated by said first portion of said hashed object feature fragment, andD) said index node, upon receipt of said insert message, using said second portion of said hashed object feature fragment to store data according to a local hash table located on said index node, said data consisting of the object identifier of the object containing the feature and, in addition, if the feature fragment is a link feature fragment, the object identifier of the object referenced by the link contained in the link feature fragment.
-
-
13. A computer program product for processing a query for word based and non-word based retrieval of information from a database, the computer program product comprising a computer-executable program embodied on a computer-readable medium, the computer-executable program comprising:
-
A) a first code portion for extracting a number of features from each of a number of elementary queries comprising a link query;
B) a second code portion for fragmenting each of the features into feature fragments;
C) a third code portion for hashing each of the feature fragments into hashed feature fragments; and
D) a fourth code portion for using each of the feature fragments in accessing a corresponding hash table for obtaining an object identifier therefrom for use in obtaining information from the database relevant to the link queries. - View Dependent Claims (14, 15)
-
-
16. A computerized information retrieval system for processing a query for word based and non-word based retrieval of information from a database, the information retrieval system comprising:
-
A) a first mechanism for extracting a number of features from each of a number of elementary queries, each of the elementary queries comprising a link feature;
B) a second mechanism coupled with the first mechanism for fragmenting each of the features into feature fragments;
C) a third mechanism coupled with the second mechanism for hashing each of the feature fragments into hashed feature fragments; and
D) a fourth mechanism coupled with the third mechanism for using each of the hashed feature fragment in accessing a corresponding hash table for obtaining an object identifier therefrom for use in obtaining information from the database relevant to the link queries. - View Dependent Claims (17, 18)
-
-
19. A method for processing a query for word based and non-word based retrieval of information from a database, the method comprising:
-
A) extracting a number of features from each of a number of elementary queries, each of the elementary queries comprising a link query;
B) fragmenting each of the features into feature fragments;
C) hashing each of the feature fragments into hashed feature fragments; and
D) using each of the feature fragments in accessing a corresponding hash table for obtaining an object identifier therefrom for use in obtaining information from the database relevant to the link queries. - View Dependent Claims (20, 21)
-
-
22. An information indexing system for indexing information for facilitated retrieval from a database, the system comprising:
-
A) a first mechanism for extracting a number of features from an information object, each of the features comprising a link feature;
B) a second mechanism coupled with the first mechanism for fragmenting each of the features into feature fragments;
C) a third mechanism coupled with the second mechanism for hashing each of the feature fragments into hashed feature fragments; and
D) a fourth mechanism coupled with the third mechanism for using each of the feature fragments in accessing a corresponding hash table that identifies a location at which data is to be stored, the data including an object identifier of an object referenced by a link specified by the link feature. - View Dependent Claims (23, 24)
-
-
25. A computer program product for indexing information for facilitated retrieval from a database, the computer program product comprising a computer-executable program embodied on a computer-readable medium, the computer-executable program comprising:
-
A) a first code portion for extracting a number of features from an information object, each of the features comprising a link feature;
B) a second code portion for fragmenting each of the features into feature fragments;
C) a third code portion for hashing each of the feature fragments into hashed feature fragments; and
D) a fourth code portion for using each of the feature fragments in accessing a corresponding hash table that identifies a location at which data is to be stored, the data including an object identifier of an object referenced by a link specified by the link feature. - View Dependent Claims (26, 27)
-
-
28. A method for indexing information for facilitated retrieval from a database, the method comprising:
-
A) extracting a number of features from an information object, each of the features comprising a link feature;
B) fragmenting each of the features into feature fragments;
C) hashing each of the feature fragments into hashed feature fragments; and
D) using each of the feature fragments in accessing a corresponding hash table that identifies a location at which data is to be stored, the data comprising an object identifier of an object referenced by a link specified by the link feature. - View Dependent Claims (29, 30)
-
Specification