Method of indexed storage and retrieval of multidimensional information
Abstract
A tree-structured index to multidimensional data is created using naturally occurring patterns and clusters within the data which permit efficient search and retrieval strategies in a database of DNA profiles. A search engine utilizes hierarchical decomposition of the database by identifying clusters of similar DNA profiles and maps to parallel computer architecture, allowing scale up past previously feasible limits. Key benefits of the new method are logarithmic scale up and parallelization. These benefits are achieved by identification and utilization of naturally occurring patterns and clusters within stored data. The patterns and clusters enable the stored data to be partitioned into subsets of roughly equal size. The method can be applied recursively, resulting in a database tree that is balanced, meaning that all paths or branches through the tree have roughly the same length. The method achieves high performance by exploiting the natural structure of the data in a manner that maintains balanced trees. Implementation of the method maps naturally to parallel computer architectures, allowing scale up to very large databases.
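The balance property described in the abstract can be illustrated with a minimal sketch. It is not the patented method: as a stand-in for cluster-based partitioning it uses a plain median split on one dimension at a time, which also yields subsets of roughly equal size. The names `Node`, `build`, `leaf_depths`, and `leaf_size` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Record = Tuple[int, ...]  # assumed record shape: a tuple of integer-coded values


@dataclass
class Node:
    records: Optional[List[Record]] = None  # populated on leaves only
    dim: int = 0                            # dimension used for the split
    threshold: int = 0                      # smallest value of `dim` in the right subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(records: List[Record], leaf_size: int = 2, depth: int = 0) -> Node:
    """Recursively split records into two halves of near-equal size."""
    if len(records) <= leaf_size:
        return Node(records=list(records))
    dim = depth % len(records[0])           # cycle through the dimensions
    ordered = sorted(records, key=lambda r: r[dim])
    mid = len(ordered) // 2                 # median split: halves differ by at most one
    return Node(
        dim=dim,
        threshold=ordered[mid][dim],
        left=build(ordered[:mid], leaf_size, depth + 1),
        right=build(ordered[mid:], leaf_size, depth + 1),
    )


def leaf_depths(node: Node, depth: int = 0) -> List[int]:
    """Return the depth of every leaf, to check that the tree is balanced."""
    if node.records is not None:
        return [depth]
    return leaf_depths(node.left, depth + 1) + leaf_depths(node.right, depth + 1)


tree = build([(i, (7 * i) % 16) for i in range(16)])
depths = leaf_depths(tree)
print(min(depths), max(depths))
```

Because each split halves the record count, the depth grows with the logarithm of the database size, and the two subtrees of any node can be built or searched independently, which is what makes the structure map onto parallel hardware.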
20 Claims
1. A computer-implemented method of partitioning data records of a multi-dimensional database into groups comprising:
defining a function of a distribution of values of a designated variable associated with the data records, the data records being stored in computer processor memory, wherein the function comprises a combination of measures of entropy and adjacency utilized by a computer processor associated with said computer processor memory, adjacency being weighted by a weighting factor; and
partitioning the values of the designated variable into two or more groups by the computer processor according to the function.

5. A computer-implemented method of partitioning data records of a multi-dimensional database of a computer processor memory into groups of approximately equal size, comprising the steps of:
(a) defining a function of a distribution of values of a designated variable associated with the data records, wherein the function comprises a combination of measures of entropy and adjacency utilized by a first and a second computer processor, adjacency being weighted by a weighting factor;
(b) partitioning the values of the designated variable into two or more groups by one of said first and second computer processors, wherein a value of the function is determined by applying an optimization procedure; and
(c) assigning a data record to a group in said computer processor memory according to the values of the designated variable.

14. A parallel data processing system comprising first and second computer processors for implementing a method of partitioning data records of a multi-dimensional database into groups comprising:
defining a function of a distribution of values of a designated variable associated with the data records, wherein the function comprises a combination of measures of entropy and adjacency, adjacency being weighted by a weighting factor; and
partitioning the values of the designated variable into two or more groups for storage in computer processor memory.
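
The independent claims recite a function that combines entropy with a weighted adjacency measure, but the text above does not give its form. The sketch below is therefore one assumed reading, not the claimed function: it scores a partition as the entropy of the group sizes minus a weight times the number of neighbouring values (in sorted order) that fall into different groups. Exhaustive search stands in for the unspecified optimization procedure. The names `entropy`, `split_adjacency`, `score`, `best_partition`, and `weight` are hypothetical.

```python
import math
from itertools import product
from typing import Dict, Sequence, Tuple


def entropy(sizes: Sequence[int]) -> float:
    """Shannon entropy (bits) of the group-size distribution; highest when groups are equal."""
    total = sum(sizes)
    return -sum((s / total) * math.log2(s / total) for s in sizes if s > 0)


def split_adjacency(labels: Sequence[int]) -> int:
    """Count neighbouring values, in sorted order, that land in different groups."""
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def score(counts: Sequence[int], labels: Sequence[int], weight: float, n_groups: int) -> float:
    """Assumed objective: reward balanced groups, penalise separating adjacent values."""
    sizes = [0] * n_groups
    for count, group in zip(counts, labels):
        sizes[group] += count
    return entropy(sizes) - weight * split_adjacency(labels)


def best_partition(
    value_counts: Dict[int, int], weight: float = 0.1, n_groups: int = 2
) -> Tuple[Dict[int, int], float]:
    """Try every assignment of distinct values to groups (feasible for few values only)."""
    values = sorted(value_counts)
    counts = [value_counts[v] for v in values]
    best_score, best_labels = None, None
    for labels in product(range(n_groups), repeat=len(values)):
        if len(set(labels)) < n_groups:
            continue  # every group must receive at least one value
        s = score(counts, labels, weight, n_groups)
        if best_score is None or s > best_score:
            best_score, best_labels = s, labels
    return dict(zip(values, best_labels)), best_score


# Four equally frequent values of the designated variable.
groups, value = best_partition({10: 5, 11: 5, 12: 5, 13: 5})
print(groups, value)
```

With a positive weight, the search prefers the split that keeps 10 with 11 and 12 with 13 over an interleaved split of the same size balance. A record would then be assigned to a group by looking up its value of the designated variable in the returned mapping, in the manner of step (c) of claim 5.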
Specification