Parallel data processing system
First Claim
1. A parallel data processing system for search, storage and retrieval of data of a database responsive to a client query for specific data of said database, said parallel data processing system comprising:
- a plurality of host processors including a root host processor, each of said plurality of host processors having at least one search engine;
- said root host processor being responsive to a connection request by a client for said specific data of said database to instantiate a thread dedicated to servicing the client and to establish an initial search queue of at least one search request, the at least one search request providing information specifying a client submitting a client query and information specifying a node of a database tree at which a search is to begin and a context of the search request, each search request including a counter of mismatches allowed at a node of the database tree;
- each said host processor maintaining a list of available processors and information about the processing capacity and search queue length of search requests for each available processor in memory;
- each said host processor for storing at least a portion of said database in an associated memory;
- each said host processor decrementing said counter when a miss is allowed at the node of the database tree; and
- a communication system coupling said processors, wherein at least two host processors communicate said processing capacity and search queue length information to other host processors, one communicating host processor bringing its search queue into balance with a search queue of another host processor according to a time constant in response to receipt of communicated processing capacity and search queue length information, said bringing into balance comprising exchanging a block of search requests from host processors having long average waiting times to host processors having short waiting times and adjusting a size of a block of exchanged requests according to relative processing speeds of host processors and inter-processor communications protocol between the host processors;
- at least one host processor transmitting at least one search request to another host processor, a search engine of said transmitting host processor removing said at least one search request from its search queue and a receiving host processor adding said at least one search request to its search queue;
- selected processors storing a database index for said database, the root host processor being responsive to a client query and selecting a host processor to receive one of a search request and a block of search requests of adjusted size.
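The "counter of mismatches allowed" recited in the claim can be illustrated with a small sketch. The claim does not fix a data layout, so the tree below is a simple trie over symbol strings, and the names (`Node`, `insert`, `search`) are hypothetical; the point is only the mechanism: descending an edge that matches the query costs nothing, descending a mismatched edge decrements the per-request counter, and a branch is pruned once the counter is exhausted.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One node of the database tree; edges are labeled with symbols."""
    children: dict = field(default_factory=dict)
    records: list = field(default_factory=list)  # records stored at this depth

def insert(root, key, record):
    """Store a record under the path spelled by `key`."""
    node = root
    for symbol in key:
        node = node.children.setdefault(symbol, Node())
    node.records.append(record)

def search(node, query, depth=0, misses_allowed=1):
    """Collect records matching `query`, tolerating up to `misses_allowed`
    mismatched symbols along the path (the per-request counter)."""
    if depth == len(query):
        return list(node.records)
    hits = []
    for symbol, child in node.children.items():
        if symbol == query[depth]:
            # exact match at this node: descend without spending the counter
            hits += search(child, query, depth + 1, misses_allowed)
        elif misses_allowed > 0:
            # a miss is allowed at this node: decrement the counter and descend
            hits += search(child, query, depth + 1, misses_allowed - 1)
    return hits
```

With one miss allowed, a query for `"AC"` against stored keys `"AC"`, `"AG"`, and `"TG"` returns the exact hit and the one-mismatch neighbor `"AG"`, while `"TG"` (two mismatches) is pruned.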
Abstract
A tree-structured index to multidimensional data is created using naturally occurring patterns and clusters within the data which permit efficient search and retrieval strategies in a database of DNA profiles. A search engine utilizes hierarchical decomposition of the database by identifying clusters of similar DNA profiles and maps to parallel computer architecture, allowing scale up past previously feasible limits. Key benefits of the new method are logarithmic scale up and parallelization. These benefits are achieved by identification and utilization of naturally occurring patterns and clusters within stored data. The patterns and clusters enable the stored data to be partitioned into subsets of roughly equal size. The method can be applied recursively, resulting in a database tree that is balanced, meaning that all paths or branches through the tree have roughly the same length. The method achieves high performance by exploiting the natural structure of the data in a manner that maintains balanced trees. Implementation of the method maps naturally to parallel computer architectures, allowing scale up to very large databases.
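The abstract's key step is recursive partitioning of the data into subsets of roughly equal size, yielding a balanced tree with logarithmic depth. The abstract does not disclose a particular clustering algorithm, so the sketch below uses a median split along cycling dimensions (a kd-tree-style decomposition) purely as a stand-in that produces the equal-size subsets and balanced paths described; `build_tree` and `leaf_size` are hypothetical names.

```python
def build_tree(points, depth=0, leaf_size=2):
    """Recursively partition multidimensional points into roughly equal
    halves, yielding a balanced tree: every split puts half the points on
    each side, so all root-to-leaf paths have about the same length."""
    if len(points) <= leaf_size:
        return {"leaf": points}
    axis = depth % len(points[0])           # cycle through dimensions
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                  # median split => equal subsets
    return {
        "axis": axis,
        "split": points[mid][axis],
        "left": build_tree(points[:mid], depth + 1, leaf_size),
        "right": build_tree(points[mid:], depth + 1, leaf_size),
    }
```

Because each level halves the data, a database of n points produces a tree of depth about log2(n / leaf_size), which is the logarithmic scale-up the abstract claims; each subtree is independent, which is what makes the mapping to parallel processors natural.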
57 Citations
22 Claims
1. A parallel data processing system for search, storage and retrieval of data of a database responsive to a client query for specific data of said database (set forth in full above as the First Claim).
Dependent claims: 2, 3, 4, 5, 6, 7
8. A parallel data processing system for search, storage and retrieval of data of a database responsive to a client query for specific data of said database, said parallel data processing system comprising:
- a plurality of host processors including a root host processor, each of said plurality of host processors having at least one search engine;
- said root host processor being responsive to a connection request by a client for said specific data of said database to instantiate a thread dedicated to servicing the client and to establish an initial search queue of at least one search request, the at least one search request providing the identity of a client submitting a client query and information specifying a node of a database tree at which a search is to begin and a context of the search request, each search request including a counter of mismatches allowed at a node of the database tree;
- each said host processor maintaining a list of available processors and information about the processing capacity, search queue length of search requests and load for each available processor in memory;
- each said host processor for storing at least a portion of said database in an associated memory;
- each said host processor decrementing said counter when a miss is allowed at the node of the database tree; and
- a communication system coupling said processors, wherein at least two host processors communicate capacity, search queue length and load information to other host processors, one communicating host processor bringing its search queue into balance with a search queue of another host processor according to a time constant in response to receipt of communicated processing capacity and search queue length information, said bringing into balance comprising exchanging a block of search requests from host processors having long average waiting times to host processors having short average waiting times and adjusting a size of a block of exchanged requests according to relative processing speeds of host processors and inter-processor communications protocol between the host processors;
- at least one host processor transmitting at least one search request to another host processor, a search engine of said transmitting host processor removing said at least one search request from its search queue and a receiving host processor adding said at least one search request to its search queue;
- selected processors storing a database index for said database, each host processor being responsive to a client query and selecting a host processor to receive one of a search request and a block of search requests of adjusted size and reconfiguring said list of available processors and information about the capacity and load in response to one of addition and failure of a host processor.
Dependent claims: 9, 10, 11, 12, 13
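Claim 8 adds two elements over claim 1: each processor tracks a load figure for its peers, and the list of available processors is reconfigured in response to addition or failure of a host. The claim specifies what the list holds, not how it is implemented; the sketch below is one minimal way to keep such a directory, with all class and method names (`ProcessorDirectory`, `reconfigure`, `least_loaded`) invented for illustration.

```python
class ProcessorDirectory:
    """Per-host list of available processors, holding the capacity,
    queue-length, and load figures exchanged over the communication system."""

    def __init__(self):
        self.available = {}  # processor id -> status record

    def update(self, proc_id, capacity, queue_length, load):
        # record (or refresh) a peer's advertised status
        self.available[proc_id] = {
            "capacity": capacity, "queue_length": queue_length, "load": load,
        }

    def reconfigure(self, added=(), failed=()):
        """React to addition or failure of a host processor by editing
        the list of available processors in place."""
        for proc_id in failed:
            self.available.pop(proc_id, None)  # drop a failed host
        for proc_id in added:
            # a newly added host starts with no advertised work
            self.available.setdefault(
                proc_id, {"capacity": 0, "queue_length": 0, "load": 0.0})

    def least_loaded(self):
        # candidate target for the next search request or block of requests
        return min(self.available, key=lambda p: self.available[p]["load"])
```

In claim 8 this selection is performed by each host processor, not only the root, so every host would keep such a directory current from the status messages it receives.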
14. A method of communicating at least one search request from a plurality of host processors including a root host processor to at least one host processor of a parallel data processing system for search, storage and retrieval of data of a database responsive to a client query for specific data of said database, each of said plurality of host processors having at least one search engine, said method of communicating the at least one search request in a parallel data processing system comprising:
- said root host processor, responsive to a connection request by a client for said specific data of said database, instantiating a thread dedicated to servicing the client and establishing an initial search queue of at least one search request, the at least one search request providing information specifying a client submitting the client query, information specifying a node of a database tree at which a search is to begin and specifying a context of the search request, each search request including a counter of mismatches allowed at a node of the database tree;
- each said host processor maintaining a list of available processors and information about the processing capacity and search queue length of search requests for each available processor in memory;
- each said host processor for storing at least a portion of said database in an associated memory;
- each said host processor decrementing said counter when a miss is allowed at the node of the database tree;
- said root host processor transmitting said at least one search request to at least one host processor via a communications system;
- the communication system coupling said processors, wherein at least two host processors communicate said processing capacity and search queue length information to other host processors, one communicating host processor bringing its search queue into balance with a search queue of another host processor according to a time constant in response to receipt of communicated processing capacity and search queue length information, said bringing into balance comprising exchanging a block of search requests from host processors having long average waiting times to host processors having short waiting times and adjusting a size of a block of exchanged requests according to relative processing speeds of host processors and inter-processor communications protocol between the host processors; and
- at least one host processor transmitting at least one search request to another host processor, a search engine of said transmitting host processor removing said at least one search request from its search queue and a receiving host processor adding said at least one search request to its search queue;
- selected processors storing a database index for said database, the root host processor being responsive to a client query and selecting a host processor to receive one of a search request and a block of search requests of adjusted size and each said host processor decrementing said counter when a miss is allowed at the node of the database tree.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22
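The balancing element common to all three independent claims — exchanging a block of search requests from a long-waiting host to a short-waiting host, with block size adjusted by relative processing speeds and governed by a time constant — can be sketched as follows. The claims name the factors but not a formula, so the sizing rule below (approximating waiting time as queue length divided by speed, and moving a fraction of the imbalance per exchange) is a hypothetical illustration; `block_size`, `exchange`, and the `overhead` parameter are all invented names.

```python
def block_size(sender_queue, receiver_queue, sender_speed, receiver_speed,
               time_constant=4.0, overhead=1):
    """How many requests the busier host should hand off so the two queues
    move toward balance. A larger time_constant moves a smaller fraction of
    the imbalance per exchange, smoothing the convergence."""
    # queue length weighted by speed approximates each host's waiting time
    wait_sender = sender_queue / sender_speed
    wait_receiver = receiver_queue / receiver_speed
    if wait_sender <= wait_receiver:
        return 0  # only hosts with long average waiting times export work
    # move a fraction of the imbalance, expressed in the sender's requests,
    # and skip transfers too small to justify the per-message overhead
    imbalance = (wait_sender - wait_receiver) * sender_speed
    size = int(imbalance / time_constant)
    return size if size >= overhead else 0

def exchange(sender, receiver, n):
    """Move a block of n requests between queues (plain lists here): the
    sending search engine removes them, the receiver appends them."""
    block, sender[:] = sender[:n], sender[n:]
    receiver.extend(block)
```

The per-message cost of the inter-processor communications protocol is what motivates moving requests in blocks rather than one at a time: a block amortizes the fixed protocol overhead, and the `overhead` floor above suppresses exchanges that would not pay for themselves.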
Specification