Method for performing string matching
First Claim
Patent Images
1. A method for calculating a lower bound estimate of string edit distance between query string and a candidate string, the method comprising:
- equalising lengths of the strings by adding padding elements to a shorter one of the strings;
sorting the query string and the candidate string according to their element values;
calculating a sum of substitution costs of the elements in corresponding positions in the sorted strings, the sum of the substitution costs being the lower bound estimate of string edit distance.
3 Assignments
0 Petitions
Accused Products
Abstract
An improved method of matching a query string against a plurality of candidate strings replaces a highly computationally intensive string edit distance calculation with a less computationally intensive lower bound estimate. The lower bound estimate of the string edit distance between the two strings is calculated by equalising the lengths of the two strings by adding padding elements to the shorter one. The elements of the strings are then sorted and the substitution costs between corresponding elements are summed.
98 Citations
19 Claims
-
1. A method for calculating a lower bound estimate of string edit distance between query string and a candidate string, the method comprising:
-
equalising lengths of the strings by adding padding elements to a shorter one of the strings; sorting the query string and the candidate string according to their element values; calculating a sum of substitution costs of the elements in corresponding positions in the sorted strings, the sum of the substitution costs being the lower bound estimate of string edit distance.
-
-
2. A string matching method comprising steps of:
-
calculating a string edit distance between a query string and one of a plurality of candidate strings and storing said candidate string as a current best match string; calculating a lower bound estimate distance for each remaining candidate string and evaluating each result such that; if a lower bound estimate distance is greater than a current best match distance the candidate string from which the estimate was calculated is discarded; otherwise calculating the string edit distance between the query string and the current candidate string and replacing the current best match string with whichever of the current candidate string and the current best match string has a lower string edit distance from the query string; storing a final best match string. - View Dependent Claims (3, 4)
-
-
5. A scribble matching apparatus in which a query string is matched against a candidate string, the apparatus calculating a lower bound estimate of string edit distance between the strings and comprising:
-
means for equalizing lengths of the strings by adding padding elements to a shorter one of the strings; means for sorting the query string and the candidate string according to a value of their elements; and means for calculating a sum of substitution costs of elements in corresponding positions in the sorted strings, the sum of the substitution costs being the lower bound estimate of string edit distance. - View Dependent Claims (6, 7, 8)
-
-
9. A method for matching a query graphical data sample against one or more candidate graphical data samples comprising steps of:
-
providing each of one or more candidate graphical data samples as a candidate string; providing the query graphical data sample as a query string; and matching the query string against the one or more candidate strings, wherein this matching comprises sorting of the strings according to their internal element values; characterised in that the step of matching the query string against the one or more candidate strings comprises a step of calculating a lower bound estimate of a string edit distance between the query string and at least one of the candidate strings, wherein the step of calculating the lower bound estimate of the string edit distance between the query string and a candidate string comprises; equalising string lengths by adding padding elements to a shorter one of the strings; sorting the strings according to their internal element values; calculating a sum of substitution costs for elements in corresponding positions in the sorted strings, the sum of the substitution costs being the lower bound estimate of the string edit distance. - View Dependent Claims (10, 11, 12, 13, 14, 15, 16)
-
-
17. A scribble matching device, comprising:
-
means to enter a query sample of electronic ink; means to encode an entered query sample of electronic ink as a query string; means to store candidate electronic ink samples as candidate strings; and a matcher adapted for matching the query string against the one or more candidate strings by a method which comprises a step of calculating a lower bound estimate of string edit distance between the query string and at least one of the candidate strings, wherein the step of calculating the lower bound estimate of string edit distance between the query string and a candidate string comprises; equalising the string lengths by adding padding elements to a shorter one of the strings; sorting the strings according to their internal element values; calculating a sum of substitution costs of elements in corresponding positions in the sorted strings, the sum of the substitution costs being the lower bound estimate of the string edit distance. - View Dependent Claims (18, 19)
-
Specification