PARALLEL DATA PROCESSING ARCHITECTURE
Abstract
A tree-structured index to multidimensional data is created using naturally occurring patterns and clusters within the data, which permit efficient search and retrieval strategies in a database of DNA profiles. A search engine utilizes hierarchical decomposition of the database by identifying clusters of similar DNA profiles and maps to parallel computer architecture, allowing scale-up beyond previously feasible limits. Key benefits of the new method are logarithmic scale-up and parallelization. These benefits are achieved by identification and utilization of naturally occurring patterns and clusters within stored data. The patterns and clusters enable the stored data to be partitioned into subsets of roughly equal size. The method can be applied recursively, resulting in a database tree that is balanced, meaning that all paths or branches through the tree have roughly the same length. The method achieves high performance by exploiting the natural structure of the data in a manner that maintains balanced trees. Implementation of the method maps naturally to parallel computer architectures, allowing scale-up to very large databases.
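The recursive partitioning described in the abstract can be illustrated with a minimal sketch. It is not the patented clustering method: as a stand-in for cluster identification, it assumes each profile is a tuple of numbers and splits on the dimension with the greatest variance, at the median, so that each subset is roughly half the size of its parent. The names `build_tree`, `contains`, `leaf_depths`, and the `leaf_size` parameter are hypothetical.

```python
from statistics import pvariance


def build_tree(profiles, leaf_size=4):
    """Recursively split profiles into two roughly equal subsets.

    Assumption: the highest-variance dimension stands in for the
    cluster detection described in the abstract.
    """
    if len(profiles) <= leaf_size:
        return {"leaf": True, "items": list(profiles)}
    dims = range(len(profiles[0]))
    dim = max(dims, key=lambda d: pvariance([p[d] for p in profiles]))
    ordered = sorted(profiles, key=lambda p: p[dim])
    mid = len(ordered) // 2  # equal halves keep the tree balanced
    return {
        "leaf": False,
        "dim": dim,
        "threshold": ordered[mid][dim],
        "left": build_tree(ordered[:mid], leaf_size),
        "right": build_tree(ordered[mid:], leaf_size),
    }


def contains(node, profile):
    """Exact-match lookup that follows a single branch except on ties."""
    if node["leaf"]:
        return profile in node["items"]
    value = profile[node["dim"]]
    if value < node["threshold"]:
        return contains(node["left"], profile)
    if value > node["threshold"]:
        return contains(node["right"], profile)
    # Values equal to the threshold may sit on either side of the split.
    return contains(node["left"], profile) or contains(node["right"], profile)


def leaf_depths(node, depth=0):
    """Return the depth of every leaf, to check that the tree is balanced."""
    if node["leaf"]:
        return [depth]
    return leaf_depths(node["left"], depth + 1) + leaf_depths(node["right"], depth + 1)
```

Because every split halves the subset, leaf depths differ by at most one and a lookup visits on the order of log2(n) nodes, which is the logarithmic behavior the abstract refers to.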
41 Citations
11 Claims
1. A parallel data processing architecture for search, storage and retrieval of data of a database responsive to client queries for specific data of said database, said parallel data processing architecture comprising:
a plurality of host processors including a root host processor, said root host processor being responsive to said client queries for specific data of said database;
each of said host and root host processors maintaining a list of available host processors and information about the capacity and load for each available host processor in memory; and
a communication system coupling said host and root host processors, wherein at least two host processors communicate capacity and load information to other host processors;
selected host processors storing a database index for said database comprising nodes of a database tree for said database and data accessible via said nodes of said database tree, the root host processor being responsive to a client query for said specific data of said database and using an initial search queue of at least said client query for said specific data of said database.
Dependent claims: 2, 3, 4, 5
6. A parallel data processing architecture for search, storage and retrieval of data of a database responsive to client search queries for specific data of said database, said parallel data processing architecture comprising:
a plurality of host processors including a root host processor, said root host processor being responsive to said client search queries for said specific data of said database;
each of said host and root host processors maintaining a list of available host processors and information about the capacity and load for each available host processor in memory; and
a communication system coupling said host and root host processors, wherein at least two host processors communicate capacity and load information to other host processors;
selected host processors storing a database index for said database comprising nodes of a database tree for said database and data accessible via said nodes of said database tree, the root host processor being responsive to a client search query for said specific data of said database and selecting a host processor to receive client search query information.
Dependent claims: 7, 8, 9
10. A parallel data processing architecture for search, storage and retrieval of data of a database responsive to client search queries for specific data of said database, said client search queries being received from a plurality of distributed clients, said parallel data processing architecture comprising:
a plurality of host processors including a root host processor, said root host processor being responsive to said client search queries for said specific data of said database, said root host processor creating a search client object for handling each client search query;
each of said host and root host processors maintaining a list of available host processors and information about the capacity and load for each available host processor in memory; and
a communication system coupling said host and root host processors, wherein at least two host processors communicate capacity and load information to other host processors;
selected host processors storing a database index for said database comprising nodes of a database tree for said database and data accessible via said nodes of said database tree, the root host processor being responsive to a client search query for said specific data of said database, selecting a host processor to receive client search query information and using an initial search queue of at least said client search query for said specific data of said database, each host processor maintaining a search queue of said client search queries and broadcasting its capacity and search queue length load information to other host processors and each host processor bringing its search queue into balance according to a time constant with another host processor responsive to receipt of said broadcast capacity and search queue length load information, said balancing including exchanging unprocessed search requests with a recipient host processor responsive to a stochastic selection process, and wherein said client search query for said specific data of said database requests performance of one of storage or retrieval of information and wherein work of said storage or retrieval is distributed among a cooperating group of host processors.
Dependent claims: 11
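The load-balancing behavior recited in claim 10 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the claims do not define how the time constant or the stochastic selection operate, so the sketch assumes that a host moves 1/`time_constant` of its excess queue length per balancing step and picks a recipient at random, weighted by the gap in load per unit capacity. The class `Host` and all of its members are hypothetical names.

```python
import random
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    capacity: float  # relative processing capacity (assumed unit)
    queue: deque = field(default_factory=deque)  # unprocessed search requests
    peers: dict = field(default_factory=dict)    # name -> (capacity, queue length)

    def load(self):
        return len(self.queue) / self.capacity

    def broadcast(self, hosts):
        """Send this host's capacity and queue length to every other host."""
        for other in hosts:
            if other is not self:
                other.peers[self.name] = (self.capacity, len(self.queue))

    def pick_recipient(self, rng):
        """Stochastic selection: less-loaded peers, weighted by load gap."""
        gaps = {n: self.load() - qlen / cap for n, (cap, qlen) in self.peers.items()}
        names = [n for n, gap in gaps.items() if gap > 0]
        if not names:
            return None
        return rng.choices(names, weights=[gaps[n] for n in names])[0]

    def balance(self, hosts_by_name, rng, time_constant=2.0):
        """Move part of the excess queue to one recipient; return the count moved."""
        name = self.pick_recipient(rng)
        if name is None:
            return 0
        peer_capacity, peer_qlen = self.peers[name]
        # Queue length at which both hosts carry equal load per unit capacity.
        total = len(self.queue) + peer_qlen
        target = total * self.capacity / (self.capacity + peer_capacity)
        excess = len(self.queue) - target
        # Assumed reading of "time constant": close 1/time_constant of the gap.
        n_move = int(excess / time_constant)
        for _ in range(n_move):
            hosts_by_name[name].queue.append(self.queue.pop())
        return n_move
```

With two equal-capacity hosts, one holding 40 requests and the other none, a single `balance` call with `time_constant=2.0` moves 10 requests; repeated broadcast-and-balance rounds shrink the imbalance geometrically rather than in one step, which dampens oscillation when the broadcast load information is stale.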
Specification