Data search employing metric spaces, multigrid indexes, and B-grid trees
Abstract
Systems and methods for generating indexes and fast searching of “approximate”, “fuzzy”, or “homologous” matches for a large quantity of data in a metric space are provided. The data is indexed to generate a search tree taxonomy. Once the index is generated, a query can be provided to report all hits within a certain neighborhood of the query. In an even faster implementation, the invention may be used together with existing approximate sequence comparison algorithms, such as FASTA and BLAST. Here, a local distance of a local metric space is used to generate local search tree branches. Applications of this invention may include homology search for DNA and/or protein sequences, textual or byte-based searches, literature search based on lists of keywords, and vector and matrix based indexing and searching.
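The idea of indexing data in a metric space and then reporting every hit within a neighborhood of a query can be illustrated with a generic metric tree. The following is a minimal sketch and not the B-grid tree described in this patent: it assumes Levenshtein edit distance as the distance function and a simple vantage-point-style split around a pivot. The names edit_distance, Node, build, and range_search are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Distance = Callable[[str, str], int]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, a metric (it satisfies the triangle inequality)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


@dataclass
class Node:
    pivot: str
    radius: int                # median distance from the pivot to the other items
    inside: Optional["Node"]   # items x with dist(pivot, x) <= radius
    outside: Optional["Node"]  # items x with dist(pivot, x) > radius


def build(items: List[str], dist: Distance) -> Optional[Node]:
    """Recursively split the items around a pivot (assumed here: the first item)."""
    if not items:
        return None
    pivot, rest = items[0], items[1:]
    if not rest:
        return Node(pivot, 0, None, None)
    dists = [dist(pivot, x) for x in rest]
    radius = sorted(dists)[len(dists) // 2]
    inside = [x for x, d in zip(rest, dists) if d <= radius]
    outside = [x for x, d in zip(rest, dists) if d > radius]
    return Node(pivot, radius, build(inside, dist), build(outside, dist))


def range_search(node: Optional[Node], query: str, eps: int,
                 dist: Distance, hits: Optional[List[str]] = None) -> List[str]:
    """Collect every indexed item within distance eps of the query."""
    if hits is None:
        hits = []
    if node is None:
        return hits
    d = dist(query, node.pivot)
    if d <= eps:
        hits.append(node.pivot)
    # The triangle inequality lets whole branches be skipped:
    # a match in `inside` requires d - eps <= radius,
    # a match in `outside` requires d + eps > radius.
    if d - eps <= node.radius:
        range_search(node.inside, query, eps, dist, hits)
    if d + eps > node.radius:
        range_search(node.outside, query, eps, dist, hits)
    return hits
```

For example, after tree = build(sequences, edit_distance), a call such as range_search(tree, "ACGT", 1, edit_distance) returns the indexed sequences within one edit of the query. The pruning is valid only because the distance function is a true metric; a similarity score that violates the triangle inequality would need a different scheme.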
129 Citations
29 Claims
1. A computer implemented method to search data via a query comprising the acts of:
- creating a multigrid tree taxonomy in metric space for said data using distance functions; and
- comparing said query with said multi-grid tree taxonomy to find matches within a given neighborhood of said query.
Dependent claims: 2-12
13. A system to provide efficient data searches comprising:
- a data store containing data; and
- a computing application cooperating with said data store, said computing application having a user interface to accept search queries and processing abilities to process said data of said data store to create a balanced multigrid tree (B-grid tree), said B-grid tree representative of said data of said data store.
Dependent claims: 14-24
25. A method to provide searchable biological sequencing data comprising the acts of:
- storing biological sequence data in an easy-to-distribute format, such that said data is represented by a B-grid tree in metric space; and
- distributing said updated biological sequence data having said B-grid tree.
Dependent claims: 26
27. In a computing system having a data store containing data, a method to search data to provide approximate search results comprising the steps of:
- processing said data by a computing application, said computing application performing a sort of said data using an approximation sequence comparison algorithm; and
- creating a B-grid tree for said data by said computing application such that said B-grid tree is representative of said data of said data store.
Dependent claims: 28, 29
Specification