Knowledge extraction system and method
Abstract
An information retrieval apparatus for processing a query for retrieval of information from a database has a mechanism for locating a number of features and feature fragments in an index database; an evaluating mechanism for identifying a number of sub-queries of a number of levels contained in the query and recursively evaluating the sub-queries using each of the located features and feature fragments; and a mechanism for collecting and storing a number of results of the recursive evaluation of the query and sub-queries pursuant to computing an overall result of the query. Such a system can eliminate the need of conventional retrieval systems for providing a new, separate, centralized replica within the data warehouse of the data stored in the diverse external databases. The invention can thus avoid the problems of replication of such data in conventional systems, in which the data may become stale or is subject to errors arising during replication for warehousing. Instead, the data warehouse can contain an index database, which stores entries providing data regarding the information stored in the external databases, such as information location specifiers for that data within those databases, relational information and statistics. The invention can also provide a robust, versatile indexing system.
17 Claims
1. A method of warehousing objects or locations of objects in a manner which is conducive to knowledge extraction using queries in a distributed computer database system having a number of index nodes and a number of warehousing nodes connected by a network, said method comprising the steps of:
A) extracting, by a warehousing node, a first number of features from an object downloaded from another database;
B) fragmenting each of the extracted object features into a number of object feature fragments;
C) hashing, by said warehousing node, each of said object feature fragments of said first number of object features, each said hashed object feature fragment having a first portion and a second portion;
D) transmitting, by said warehousing node, each said hashed object feature fragment of said first number of feature fragments to a respective one of said number of index nodes indicated by said first portion of each said hashed object feature;
E) using, by said index node, said second portion of said respective hashed object feature fragment to access data according to a local hash table located on said index node;
F) returning, by each said index node accessing data according to said respective hashed object feature fragment, a number of object identifiers corresponding to said accessed data to said warehousing node;
G) determining, by said warehousing node, whether the said object is to be assigned an object identifier from the said number of object identifiers, or the said object is to be assigned an object identifier that is not yet in use;
H) assigning, by said warehousing node, an object identifier to the said object according to the said determination;
I) extracting, by said warehousing node, a second number of features from said object;
J) fragmenting each of said extracted second number of object features into a number of object feature fragments;
K) hashing, by said warehousing node, each said object feature fragment of said second number of object features, said hashed object feature fragment having a first portion and a second portion;
L) transmitting, by said warehousing node, each said hashed object feature fragment of said second number of feature fragments to a respective one of said number of index nodes indicated by said first portion of each said hashed object feature fragment; and
M) using, by said index node, said second portion of said respective hashed object feature fragment to store data according to a local hash table located on said index node. - View Dependent Claims (2, 3)
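The routing in steps A to M of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the character-trigram fragmentation, the SHA-256 hash, the way the digest is split into a first and second portion, the 0.8 match threshold, and all class and function names are hypothetical choices made for the example. The sketch also assumes the second feature set repeats the identifying features, so that a later download of the same object can be matched.

```python
import hashlib
from collections import Counter, defaultdict

NUM_INDEX_NODES = 4  # assumed cluster size, not taken from the claims


def fragments(feature, size=3):
    """Split a feature into overlapping substrings (an assumed fragmentation scheme)."""
    if len(feature) <= size:
        return [feature]
    return [feature[i:i + size] for i in range(len(feature) - size + 1)]


def hash_fragment(fragment):
    """Hash a fragment and split the digest into (first portion, second portion)."""
    digest = hashlib.sha256(fragment.encode("utf-8")).digest()
    first = digest[0] % NUM_INDEX_NODES          # selects the index node
    second = int.from_bytes(digest[1:9], "big")  # key into that node's local table
    return first, second


class IndexNode:
    def __init__(self):
        self.table = defaultdict(set)  # local hash table: second portion -> object ids

    def retrieve(self, second):
        return set(self.table.get(second, ()))

    def insert(self, second, object_id):
        self.table[second].add(object_id)


class WarehousingNode:
    def __init__(self, index_nodes):
        self.index_nodes = index_nodes
        self.next_id = 0

    def warehouse(self, first_features, second_features, threshold=0.8):
        # Steps A-F: hash the first feature set and collect candidate identifiers.
        hashed = [hash_fragment(f) for feat in first_features for f in fragments(feat)]
        votes = Counter()
        for first, second in hashed:
            votes.update(self.index_nodes[first].retrieve(second))
        # Steps G-H: reuse a returned identifier or take one that is not yet in use.
        object_id = None
        if votes:
            candidate, count = votes.most_common(1)[0]
            if count >= threshold * len(hashed):
                object_id = candidate
        if object_id is None:
            object_id = self.next_id
            self.next_id += 1
        # Steps I-M: hash the second feature set and store it on the index nodes.
        for feat in second_features:
            for fragment in fragments(feat):
                first, second = hash_fragment(fragment)
                self.index_nodes[first].insert(second, object_id)
        return object_id
```

In this sketch, warehousing the same object twice yields the same identifier because every fragment of its identifying features is already present in the index, while an unrelated object falls below the threshold and receives a fresh identifier.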
4. A method for data mining using queries in a distributed computer database system having a number of index nodes connected by a network, said method comprising the steps of:
A) selecting a first one of said number of index nodes, herein termed the home node of the query;
B) extracting, by said home node, a number of sub-queries from a query by a user, each said sub-query including a feature, a number of sub-queries and a computation specification;
C) fragmenting each of said sub-query features into a number of sub-query feature fragments;
D) hashing, by said home node, each said sub-query feature fragment of each said sub-query feature fragments, each said hashed sub-query feature fragment having a first portion and a second portion;
E) transmitting, by said home node, each said hashed sub-query feature fragment to a respective one of said number of index nodes indicated by said first portion of each said hashed sub-query feature fragment;
F) using, by said index node, said second portion of said respective hashed sub-query feature fragment to access data according to a local hash table located on said index node;
G) recursively evaluating, by said index node, each sub-query of said number of sub-queries contained in said respective sub-query transmitted by said home node, said index node acting as the home node of said sub-query of said number of sub-queries;
H) computing, by said index node, information according to said computation specification of said respective sub-query transmitted by said home node, according to said accessed data and information determined by said recursive evaluation of each said sub-query of said number of sub-queries contained in said respective sub-query transmitted by said home node; and
I) returning, by each said index node, said information to said home node. - View Dependent Claims (5)
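The recursive evaluation in steps A to I of claim 4 can be sketched as follows. This is an illustrative sketch only: the `SubQuery` structure, the three computation specifications (`and`, `or`, `and_not`), the use of SHA-256, and the treatment of each feature as a single fragment are assumptions made for the example and are not taken from the claims.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SubQuery:
    feature: str
    compute: str = "and"  # assumed computation specification: "and", "or", "and_not"
    children: list = field(default_factory=list)  # nested sub-queries


def route(feature, num_nodes):
    """Hash a feature and split it into (index node number, local table key)."""
    digest = hashlib.sha256(feature.encode("utf-8")).digest()
    return digest[0] % num_nodes, int.from_bytes(digest[1:9], "big")


class IndexNode:
    def __init__(self, cluster):
        self.cluster = cluster  # shared list of all index nodes
        self.table = {}         # local hash table: key -> set of object ids

    def store(self, second, object_id):
        self.table.setdefault(second, set()).add(object_id)

    def dispatch(self, query):
        """Act as home node: hash the sub-query feature and send it to its index node."""
        first, second = route(query.feature, len(self.cluster))
        return self.cluster[first].evaluate(second, query)

    def evaluate(self, second, query):
        """Access local data, recurse into nested sub-queries, then combine."""
        result = set(self.table.get(second, ()))
        for child in query.children:
            child_result = self.dispatch(child)  # this node is the child's home node
            if query.compute == "and":
                result &= child_result
            elif query.compute == "or":
                result |= child_result
            elif query.compute == "and_not":
                result -= child_result
            else:
                raise ValueError(f"unknown computation specification: {query.compute}")
        return result  # returned to whichever node dispatched the sub-query


def index_feature(cluster, feature, object_id):
    """Helper for the example: store one feature of one object."""
    first, second = route(feature, len(cluster))
    cluster[first].store(second, object_id)
```

Each index node that receives a sub-query becomes the home node for the sub-queries nested inside it, so the recursion follows the nesting of the query rather than a central coordinator.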
6. A distributed computer database system for warehousing of information objects or locations of information objects, comprising:
A) a number of warehousing nodes and a number of index nodes, said number of warehousing nodes and said number of index nodes connected by a network,
B) wherein each said warehousing node, upon downloading an object, extracts a first number of features from said object, fragments each said object feature into an object feature fragment, hashes each said object feature fragment into a hashed object feature fragment having a first portion and a second portion, and transmits each said hashed object feature fragment to a respective one of said number of index nodes indicated by said first portion of said hashed object feature fragment,
C) wherein each said index node uses said second portion of said hashed object feature fragment to access data according to a local hash table located on said index node, returning a number of object identifiers corresponding to said accessed data to said warehousing node,
D) wherein said warehousing node assigns to said object either one of said object identifiers of said number of object identifiers or an object identifier that is not yet in use, extracts a second number of features from said object, fragments each said extracted feature of said second number of features into a number of object feature fragments, hashes each said object feature fragment of said second number of object features into a hashed object feature having a first portion and a second portion, and transmits each said hashed object feature fragment to a respective one of said number of index nodes indicated by said first portion of said hashed object feature fragment, and
E) wherein each said index node uses said second portion of said hashed object feature fragment to store objects or locations of objects according to a local hash table located on said index node. - View Dependent Claims (7, 8)
9. A distributed computer database system having a data mining tool for handling queries from a user comprising:
A) a number of index nodes connected by a network;
B) wherein each said index node, upon receiving a query from a user, and termed the home node of said query, extracts a number of sub-queries from said query and a number of features from each said sub-query, fragments each said sub-query feature into a number of sub-query feature fragments, hashes the sub-query feature of said number of sub-queries into a hashed sub-query feature having a first portion and a second portion, and transmits each said hashed sub-query feature fragment to a respective one of said number of index nodes indicated by said first portion of said hashed sub-query feature fragment, and
C) further wherein each said index node uses said second portion of said hashed sub-query feature fragment to access data according to a local hash table located on said index node, recursively evaluates each sub-query contained in said respective sub-query, computes information according to said accessed data and information determined by said recursive evaluation, and returns said information to said home node.
10. A distributed computer database system for warehousing and data mining, comprising:
A) a number of warehousing nodes, and a number of index nodes, said number of warehousing nodes and said number of index nodes connected by a network,
B) each said warehousing node, upon receiving a download command, enqueuing a predetermined task in response to said download command,
C) a download task enqueued, in response to a download command, extracting a first number of features from an object downloaded by said download command, fragmenting each said object feature into a number of object feature fragments, hashing each said object feature fragment of said first number of object features into a hashed object feature fragment having a first portion and a second portion, and transmitting a retrieve message containing each said hashed object feature fragment to a respective one of said number of index nodes indicated by said first portion of said hashed object feature fragment,
D) said index node, upon receipt of said retrieve message, using said second portion of said hashed object feature fragment to access data according to a local hash table located on said index node and transmitting a message returning a number of object identifiers corresponding to said accessed data to said warehousing node,
E) said warehousing node, upon receipt of said number of object identifiers from said number of index nodes, assigning to said object either one of said object identifiers of said number of object identifiers or an object identifier that is not yet in use, extracting a second number of features from said object, fragmenting each said object feature of said second number of object features into a number of object feature fragments, hashing each said object feature fragment of said second number of object feature fragments into a hashed object feature fragment having a first portion and a second portion, and transmitting an insert message containing each said hashed object feature fragment to a respective one of said number of index nodes indicated by said first portion of said hashed object feature fragment, and
F) said index node, upon receipt of said insert message, using said second portion of said hashed object feature fragment to store data according to a local hash table located on said index node. - View Dependent Claims (11, 12)
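The message flow of claim 10 (download command, retrieve message, returned identifiers, insert message) can be sketched with a single in-process queue standing in for the network. Everything here is an assumption made for illustration: the message class names, the shared `bus` queue, the rule that an identifier is reused only if every index node returned it, and the simplification that each feature is one fragment and only one download is in flight at a time.

```python
import hashlib
from collections import deque
from dataclasses import dataclass


@dataclass
class Download:      # download command received by a warehousing node
    features: list

@dataclass
class Retrieve:      # retrieve message sent to an index node
    second: int
    sender: "WarehousingNode"

@dataclass
class Reply:         # object identifiers returned to the warehousing node
    object_ids: set

@dataclass
class Insert:        # insert message sent to an index node
    second: int
    object_id: int


def hash_fragment(fragment, num_nodes):
    digest = hashlib.sha256(fragment.encode("utf-8")).digest()
    return digest[0] % num_nodes, int.from_bytes(digest[1:9], "big")


class IndexNode:
    def __init__(self, bus):
        self.bus, self.table = bus, {}

    def handle(self, msg):
        if isinstance(msg, Retrieve):
            ids = set(self.table.get(msg.second, ()))
            self.bus.append((msg.sender, Reply(ids)))
        elif isinstance(msg, Insert):
            self.table.setdefault(msg.second, set()).add(msg.object_id)


class WarehousingNode:
    def __init__(self, bus, index_nodes):
        self.bus, self.index_nodes = bus, index_nodes
        self.hashed, self.replies, self.pending = [], [], 0
        self.next_id, self.assigned = 0, []

    def handle(self, msg):
        if isinstance(msg, Download):
            # Download task: hash every fragment and send one retrieve message each.
            self.hashed = [hash_fragment(f, len(self.index_nodes)) for f in msg.features]
            self.replies, self.pending = [], len(self.hashed)
            for first, second in self.hashed:
                self.bus.append((self.index_nodes[first], Retrieve(second, self)))
        elif isinstance(msg, Reply):
            self.replies.append(msg.object_ids)
            self.pending -= 1
            if self.pending == 0:
                # Reuse an identifier every node returned, else take an unused one.
                common = set.intersection(*self.replies)
                if common:
                    object_id = min(common)
                else:
                    object_id, self.next_id = self.next_id, self.next_id + 1
                self.assigned.append(object_id)
                for first, second in self.hashed:
                    self.bus.append((self.index_nodes[first], Insert(second, object_id)))


def run(bus):
    """Deliver queued (destination, message) pairs until the queue is empty."""
    while bus:
        destination, msg = bus.popleft()
        destination.handle(msg)
```

Keeping retrieval and insertion as separate message types mirrors the two phases of the claim: the warehousing node cannot send insert messages until all replies to its retrieve messages have arrived.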
13. A distributed computer database system having a data mining tool for handling queries from a user, comprising:
A) a number of index nodes connected by a network,
B) each said index node, upon receiving a command from a user, said index node termed the home node of the command, enqueuing a predetermined task in response to said command,
C) a query task enqueued being resultant in, in response to a query command from said user, extracting a number of sub-queries from a query contained in said query command and a number of features from each said extracted sub-query, fragmenting each said sub-query feature into a number of sub-query feature fragments, hashing each said sub-query feature fragment into a hashed sub-query feature fragment having a first portion and a second portion, and transmitting a sub-query message containing each said hashed sub-query feature fragment to a respective one of said number of index nodes indicated by said first portion of said hashed sub-query feature fragment, and
D) said index node, upon receipt of said sub-query message, using said second portion of said hashed sub-query feature fragment to access data according to a local hash table located on said index node, recursively evaluating each sub-query contained in said respective sub-query, computing information according to said accessed data and information determined by said recursive evaluation, and transmitting a message returning said information to said home node. - View Dependent Claims (14)
15. An information retrieval apparatus for processing a query for retrieval of information from a database, comprising:
A) a mechanism for locating a number of features and feature fragments in an index;
B) an evaluating mechanism coupled with the locating mechanism identifying a number of sub-queries of a number of levels contained in the query and recursively evaluating the sub-queries using each of the located features and feature fragments; and
C) a mechanism coupled with the evaluating mechanism for collecting and storing in a memory a number of results of the recursive evaluation of the query and sub-queries pursuant to computing an overall result of the query.
16. A method for processing a query for retrieval of information from a database, comprising:
A) locating a number of features and feature fragments in an index;
B) identifying a number of sub-queries of a number of levels contained in the query and recursively evaluating the sub-queries using each of the located features and feature fragments; and
C) collecting and storing a number of results of the recursive evaluation of the query and sub-queries pursuant to computing an overall result of the query.
17. A computer program product for processing a query for retrieval of information from a database, the computer program product comprising a computer-executable program embodied on a computer-readable medium, the computer-executable program comprising:
A) a first code portion for locating a number of features and feature fragments in an index;
B) a second code portion for identifying a number of sub-queries of a number of levels contained in the query and recursively evaluating the sub-queries using each of the located features and feature fragments; and
C) a third code portion for collecting and storing a number of results of the recursive evaluation of the query and sub-queries pursuant to computing an overall result of the query.
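The three elements shared by claims 15 to 17 (locating features in an index, recursively evaluating sub-queries of several levels, and collecting and storing the intermediate results) can be sketched in a single process. The nested-tuple query format, the `and`/`or` operators, and the in-memory dictionary used as the index are assumptions made for this example only.

```python
def locate(index, feature):
    """Locate a feature in the index and return the matching object identifiers."""
    return set(index.get(feature, ()))


def evaluate(query, index, results, level=0):
    """Recursively evaluate a query and record every intermediate result.

    A query is either a feature string or a tuple (operator, sub-query, ...),
    so sub-queries may be nested to any number of levels.
    """
    if isinstance(query, str):
        value = locate(index, query)
    else:
        operator, *sub_queries = query
        if not sub_queries:
            raise ValueError("an operator needs at least one sub-query")
        parts = [evaluate(sub, index, results, level + 1) for sub in sub_queries]
        if operator == "and":
            value = set.intersection(*parts)
        elif operator == "or":
            value = set.union(*parts)
        else:
            raise ValueError(f"unknown operator: {operator}")
    # Collect and store the result of this (sub-)query in memory.
    results.append((level, query, value))
    return value


index = {"hash": {1, 2}, "index": {2, 3}, "node": {3}}
query = ("or", ("and", "hash", "index"), "node")
results = []
overall = evaluate(query, index, results)
```

The `results` list holds one entry per sub-query, deepest levels first, with the overall result of the query stored last.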
Specification