Method and arrangement for searching for strings
First Claim
1. A method of searching for a final number of result strings, having a partial or an exact match with a query string, in a database comprised of many long strings or a long string, said method comprising the steps of:
- partitioning the query string in a first number of input query strings;
determining a second number of neighboring strings for each string in said first number of input query strings, wherein each string in said second number of neighboring strings has a predetermined first number of errors;
calculating the predetermined first number of errors and comparing it to a threshold;
determining whether the predetermined number of errors is equal to or larger than the threshold for determining whether to keep or discard the second number of neighboring strings;
searching the database for a third number of exact matches for each string in said second number of neighboring strings, which are not discarded, based on a search method;
concatenating said searched exact matched strings from the database into a fourth number of intermediate strings wherein said searched exact matched strings comprised in each of said intermediate strings are in corresponding succession to the partitioned input query strings in said database; and
determining the final number of result strings based on said fourth number of intermediate strings, wherein each string in the final number of result strings has a maximum of a predetermined second number of errors compared to said query string.
1 Assignment
0 Petitions
Accused Products
Abstract
This invention relates to methods of searching for a final number of result strings (30-33) having a partial or an exact match with a query string (34) in a database (80) comprised of many long strings or a long string, said method includes the steps of partitioning the query string in a first number of input query strings (35, 36, 37); determining a second number of neighboring strings (38-41, 42-45, 44-49, respectively) for each string in said first number of input query strings, wherein each string in said second number of neighboring strings has a predetermined first number of errors; searching the database for a third number of exact matches (50-61, 70-74) for each string in said second number of neighboring strings based on a search method; concatenating said searched exact matched strings from the database into a fourth number of intermediate strings (29, 30, 32, 33, 34) wherein said searched exact matched strings (50-61, 70-74) comprised in each of said intermediate strings are in succession to one another in said database; and determining the final number of result strings (30-33) based in said fourth number of intermediate strings, wherein each string in the final number of result strings has a maximum of predetermined second number of errors compared to said query string (34). This enables for a perfect match or a partial match containing only minor errors with respect to said query string, and for a fast search in larger databases with a relative low use of processing power.
-
Citations
15 Claims
-
1. A method of searching for a final number of result strings, having a partial or an exact match with a query string, in a database comprised of many long strings or a long string, said method comprising the steps of:
- partitioning the query string in a first number of input query strings;
determining a second number of neighboring strings for each string in said first number of input query strings, wherein each string in said second number of neighboring strings has a predetermined first number of errors;
calculating the predetermined first number of errors and comparing it to a threshold;
determining whether the predetermined number of errors is equal to or larger than the threshold for determining whether to keep or discard the second number of neighboring strings;
searching the database for a third number of exact matches for each string in said second number of neighboring strings, which are not discarded, based on a search method;
concatenating said searched exact matched strings from the database into a fourth number of intermediate strings wherein said searched exact matched strings comprised in each of said intermediate strings are in corresponding succession to the partitioned input query strings in said database; and
determining the final number of result strings based on said fourth number of intermediate strings, wherein each string in the final number of result strings has a maximum of a predetermined second number of errors compared to said query string. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
- partitioning the query string in a first number of input query strings;
-
11. A search engine configured for execution by a processor, said search engine comprising calculation means for carrying out the steps of a method of searching for a final number of result strings, having a partial or an exact match with a query string, in a database comprised of many long strings or a long string, said method comprising the steps of:
- partitioning the query string in a first number of input query strings;
determining a second number of neighboring strings for each string in said first number of input query strings, wherein each string in said second number of neighboring strings has a predetermined first number of errors;
calculating the predetermined first number of errors and comparing it to a threshold;
determining whether the predetermined number of errors is equal to or larger than the threshold for determining whether to keep or discard the second number of neighboring strings;
searching the database for a third number of exact matches for each string in said second number of neighboring strings, which are not discarded, based on a search method;
concatenating said searched exact matched strings from the database into a fourth number of intermediate strings wherein said searched exact matched strings comprised in each of said intermediate strings are in corresponding succession to the partitioned input query strings in said database; and
determining the final number of result strings based on said fourth number of intermediate strings, wherein each string in the final number of result strings has a maximum of a predetermined second number of errors compared to said query string.
- partitioning the query string in a first number of input query strings;
-
12. A tool comprising a processor;
- and means for carrying out the steps of a method by the processor, the steps comprising searching for a final number of result strings, having a partial or an exact match with a query string, in a database comprised of many long strings or a long string partitioning the query string in a first number of input query strings;
determining a second number of neighboring strings for each string in said first number of input query strings, wherein each string in said second number of neighboring strings has a predetermined first number of errors;
calculating the predetermined first number of errors and comparing it to a threshold;
determining whether the predetermined number of errors is equal to or larger than the threshold for determining whether to keen or discard the second number of neighboring strings;
searching the database for a third number of exact matches for each string in said second number of neighboring strings, which are not discarded, based on a search method;
concatenating said searched exact matched strings from the database into a fourth number of intermediate strings wherein said searched exact matched strings comprised in each of said intermediate strings are in corresponding succession to the partitioned input query strings in said database; and
determining the final number of result strings based on said fourth number of intermediate strings, wherein each string in the final number of result strings has a maximum of a predetermined second number of errors compared to said query string.
- and means for carrying out the steps of a method by the processor, the steps comprising searching for a final number of result strings, having a partial or an exact match with a query string, in a database comprised of many long strings or a long string partitioning the query string in a first number of input query strings;
-
13. A processing arrangement comprising:
- a processor comprising;
calculation means for partitioning the query string in a first number of input query strings;
calculation means for determining a second number of neighboring strings for each string in said first number of input query strings, wherein each string in said second number of neighboring strings has a predetermined first number of errors;
calculation means for determined the predetermined first number of errors;
comparison means for comparing the predetermined first number of errors to a threshold;
determination means for determining whether the predetermined number of errors is equal to or larger than the threshold and for determining whether to keen or discard the second number of neighboring strings;
calculation means for searching the database for a third number of exact matches for each string in said second number of neighboring strings, which are not discarded, based on a search method;
calculation means for concatenating said searched exact matched strings from the database into a fourth number of intermediate strings wherein said searched exact matched strings comprised in each of said intermediate strings are in corresponding succession to the partitioned input query strings in said database; and
calculation means for determining the final number of result strings based on said fourth number of intermediate strings, wherein each string in the final number of result strings has a maximum of a predetermined second number of errors compared to said query string.
- a processor comprising;
-
14. A computer system having a processor for performing a method of searching for a final number of result strings, having a partial or an exact match with a query string, in a database comprised of many long strings or a long string, said processor performing the steps of the method which comprise:
- partitioning the query string in a first number of input query strings;
determining a second number of neighboring strings for each string in said first number of input query strings, wherein each string in said second number of neighboring strings has a predetermined first number of errors;
calculating the predetermined first number of errors and comparing it to a threshold;
determining whether the predetermined number of errors is equal to or larger than the threshold for determining whether to keen or discard the second number of neighboring strings;
searching the database for a third number of exact matches for each string in said second number of neighboring strings, which are not discarded, based on a search method;
concatenating said searched exact matched strings from the database into a fourth number of intermediate strings wherein said searched exact matched strings comprised in each of said intermediate strings are in corresponding succession to the partitioned input query strings in said database; and
determining the final number of result strings based on said fourth number of intermediate strings, wherein each string in the final number of result strings has a maximum of a predetermined second number of errors compared to said query string.
- partitioning the query string in a first number of input query strings;
-
15. A computer program product comprising program code means stored on a computer readable having instructions, that when executed by a processor, result in execution of a method of searching for a final number of result strings, having a partial or an exact match with a query string, in a database comprised of many long strings or a long string, said method comprising the steps of:
- partitioning the query string in a first number of input query strings;
determining a second number of neighboring strings for each string in said first number of input query strings, wherein each string in said second number of neighboring strings has a predetermined first number of errors;
calculating the predetermined first number of errors and comparing it to a threshold;
determining whether the predetermined number of errors is equal to or larger than the threshold for determining whether to keen or discard the second number of neighboring strings;
searching the database for a third number of exact matches for each string in said second number of neighboring strings, which are not discarded, based on a search method;
concatenating said searched exact matched strings from the database into a fourth number of intermediate strings wherein said searched exact matched strings comprised in each of said intermediate strings are in corresponding succession to the partitioned input query strings in said database; and
determining the final number of result strings based on said fourth number of intermediate strings, wherein each string in the final number of result strings has a maximum of a predetermined second number of errors compared to said query string.
- partitioning the query string in a first number of input query strings;
Specification