Transformation-based method for indexing high-dimensional data for nearest neighbour queries
First Claim
1. A method for indexing in a database of stored objects, the method comprising the following steps:
- applying a clustering algorithm to organize high-dimensional points into partitions;
selecting a reference point for each partition;
applying a transformation function to map a high-dimensional point to a one-dimensional space;
indexing the transformed point using a single dimensional index structure.
1 Assignment
0 Petitions
Accused Products
Abstract
We disclose a transformation-based method for indexing high-dimensional data to support similarity search. The method, iDistance, partitions the data into clusters either based on some clustering strategies or simple data space partitioning strategies. The data in each cluster can be described based on their similarity with respect to a reference point, and hence they can be transformed into a single dimensional space based on such relative similarity. This allows us to index the data points using a B+-tree structure and perform similarity search using range search strategy. As such, the method is well suited for integration into existing DBMSs. We also study two data partitioning strategies, and several methods on selection of reference points. We conducted extensive experiments to evaluate iDistance, and our results demonstrate its effectiveness.
43 Citations
10 Claims
-
1. A method for indexing in a database of stored objects, the method comprising the following steps:
-
applying a clustering algorithm to organize high-dimensional points into partitions;
selecting a reference point for each partition;
applying a transformation function to map a high-dimensional point to a one-dimensional space;
indexing the transformed point using a single dimensional index structure. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
-
Specification