Approximate string matching optimization for a database
First Claim
1. A computer-implemented method comprising:
receiving, by one or more processors, a query of a database, wherein the query includes a search value, and wherein the database includes a plurality of datasets;
identifying the search value within the received query;
determining at least one reference value based on the identified search value;
determining, by one or more processors, a distance between the search value and the at least one reference value;
determining, by one or more processors, a maximum distance from the search value to be used in searching the database, wherein the maximum distance from the search value defines a search range and is based, at least in part, on the determined distance between the search value and the at least one reference value;
determining, by one or more processors, a subset of datasets from the plurality of datasets that includes datasets for which a data range with respect to each reference value overlaps with the search range; and
performing, by one or more processors, approximate string matching for the search value on the subset of datasets;
wherein:
each dataset of the plurality of datasets is assigned a minimum distance and a maximum distance between values of dataset entries and the at least one reference value; and
the minimum distance and the maximum distance for each dataset define the data range for the respective dataset with respect to the at least one reference value, and wherein the minimum and maximum distance are permanently stored in a respective dataset to which the minimum and maximum distance are assigned and transferred with the respective dataset when the dataset is copied to a new location or to a new database.
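The pruning step recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes Levenshtein edit distance as the distance measure, a single reference value shared by all datasets, non-empty datasets, and a maximum distance supplied by the caller (the claim only says it is based at least in part on the distance to the reference value, which the sketch does not model). The names `Dataset`, `levenshtein`, and `fuzzy_search` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance by dynamic programming, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


@dataclass
class Dataset:
    """Hypothetical dataset that carries its own data range.

    min_dist and max_dist are computed once against the reference value
    and live on the object, so they travel with it when it is copied.
    Assumes `entries` is non-empty.
    """
    entries: List[str]
    reference: str
    min_dist: int = field(init=False)
    max_dist: int = field(init=False)

    def __post_init__(self) -> None:
        dists = [levenshtein(e, self.reference) for e in self.entries]
        self.min_dist, self.max_dist = min(dists), max(dists)


def fuzzy_search(datasets: List[Dataset], search_value: str,
                 reference: str, max_distance: int
                 ) -> Tuple[List[Dataset], List[str]]:
    """Return (datasets that survive pruning, matching entries)."""
    # Distance between the search value and the reference value.
    d_ref = levenshtein(search_value, reference)
    # Assumed search range around that distance.
    lo, hi = max(0, d_ref - max_distance), d_ref + max_distance
    # Keep only datasets whose data range overlaps the search range.
    subset = [ds for ds in datasets
              if ds.min_dist <= hi and ds.max_dist >= lo]
    # Approximate string matching runs only on the surviving subset.
    matches = [e for ds in subset for e in ds.entries
               if levenshtein(e, search_value) <= max_distance]
    return subset, matches
```

Under these assumptions, a dataset whose stored range lies entirely outside the search range is skipped without computing a single per-entry distance, which is where the saving comes from.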
Abstract
Software for processing a database query that includes: (i) receiving a query of a database including a search value; (ii) determining a distance between the search value and at least one reference value; (iii) determining a maximum distance from the search value to be used in searching a plurality of datasets of the database, wherein the maximum distance from the search value defines a search range and is based, at least in part, on the determined distance between the search value and the at least one reference value; (iv) determining a subset of datasets from the plurality of datasets that includes datasets for which a data range with respect to each reference value overlaps with the search range; and (v) performing approximate string matching for the search value on the subset of datasets.
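Why restricting the search to overlapping datasets loses no matches can be seen from a short derivation. This is a sketch under an assumption the text above does not state: that the distance $d$ is a metric (as Levenshtein edit distance is). Here $s$ is the search value, $r$ a reference value, $k$ the maximum distance, and $[m, M]$ the data range stored for a dataset; all of these symbols are introduced for illustration.

```latex
\begin{align*}
  % Triangle inequality applied to any entry x within distance k of s:
  \lvert d(s,r) - d(x,r) \rvert &\le d(s,x) \le k \\
  % Hence every possible match has a reference distance in the search range:
  d(s,r) - k \;\le\; d(x,r) &\le d(s,r) + k \\
  % A dataset with data range [m, M] can be skipped when the ranges are disjoint:
  M < d(s,r) - k \quad &\text{or} \quad m > d(s,r) + k
\end{align*}
```

For example, with $d(s,r) = 5$ and $k = 2$ the search range is $[3, 7]$, so a dataset whose stored range is $[9, 12]$ cannot contain a match and need not be scanned.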
15 Claims
Dependent claims of claim 1: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
Specification