Determining a known character string equivalent to a query string
Abstract
A system, method, and computer program product perform text equivalencing. Text equivalencing is performed by modifying a string of characters by applying a set of heuristics and comparing the modified string of characters to known strings of characters. If a match is found, the text equivalencing engine performs a database update and exits. If no match is found, sub-strings are formed by grouping together frequently occurring sets of characters. An information retrieval technique is then performed on the sub-strings to determine equivalent text.
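The first two stages described in the abstract (heuristic modification, then an exact comparison against known strings) might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not list its heuristics, so the rules in the hypothetical apply_heuristics function (accent stripping, lowercasing, expanding "&", dropping punctuation and a leading "the") are invented examples, and normalizing the known strings with the same rules is also an assumption. The names apply_heuristics and find_exact do not come from the source.

```python
import re
import unicodedata
from typing import Optional, Sequence


def apply_heuristics(text: str) -> str:
    """Normalize a string with example rules (assumed, not from the source)."""
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase and spell out the ampersand.
    text = text.lower().replace("&", " and ")
    # Replace anything that is not a letter, digit, or space with a space.
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    # Drop a leading article, then collapse runs of whitespace.
    text = re.sub(r"^the\s+", "", text.strip())
    return re.sub(r"\s+", " ", text).strip()


def find_exact(query: str, known: Sequence[str]) -> Optional[str]:
    """Return the known string whose normalized form equals the normalized query."""
    modified = apply_heuristics(query)
    for candidate in known:
        # Assumption: known strings are normalized with the same rules.
        if apply_heuristics(candidate) == modified:
            return candidate
    return None  # no exact match; a fallback stage would run next
```

With these example rules, find_exact("The Beatles!", ["Rolling Stones", "Beatles"]) returns "Beatles", while find_exact("Beetles", ["Beatles"]) returns None and would hand off to the sub-string stage.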
38 Claims
1. A method comprising:
modifying a query string of characters using a set of heuristics;
performing a character-by-character comparison of the modified query string with at least one known string of characters in a corpus in order to locate an exact match for the modified query string; and
responsive to not finding an exact match, performing the following steps in order to locate an equivalent for the modified query string:
forming a plurality of sub-strings of characters from the query string, the sub-strings having varying lengths such that at least two of the formed sub-strings differ in length; and
using an information retrieval technique on the sub-strings formed from the query string to identify a known string of characters equivalent to the query string.
Dependent claims: 2, 3, 4, 5, 6, 7, 34
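The fallback steps of claim 1 (sub-strings of differing lengths, then an information retrieval technique) could look roughly like the sketch below. The claim does not name a specific technique, so everything concrete here is an assumption: sub-strings are plain character n-grams of lengths 2, 3, and 4, they are weighted with a smoothed inverse document frequency, and candidates are ranked by cosine similarity. The names substrings and best_equivalent and the min_score threshold are hypothetical.

```python
import math
from collections import Counter


def substrings(text, lengths=(2, 3, 4)):
    """Character n-grams of several lengths, so the sub-strings differ in length."""
    return [text[i:i + n] for n in lengths for i in range(len(text) - n + 1)]


def best_equivalent(query, known, min_score=0.2):
    """Rank known strings by weighted n-gram cosine similarity (assumed technique)."""
    docs = {k: Counter(substrings(k)) for k in known}
    n_docs = len(docs)
    # Document frequency: in how many known strings each sub-string occurs.
    df = Counter(g for counts in docs.values() for g in counts)

    def weigh(counts):
        # Term count times a smoothed inverse document frequency.
        return {g: c * (math.log((1 + n_docs) / (1 + df[g])) + 1.0)
                for g, c in counts.items()}

    def cosine(a, b):
        dot = sum(w * b.get(g, 0.0) for g, w in a.items())
        norm = (math.sqrt(sum(w * w for w in a.values()))
                * math.sqrt(sum(w * w for w in b.values())))
        return dot / norm if norm else 0.0

    q = weigh(Counter(substrings(query)))
    scored = [(cosine(q, weigh(counts)), k) for k, counts in docs.items()]
    score, match = max(scored, default=(0.0, None))
    # Below the (arbitrary) threshold, report that no equivalent was found.
    return match if score >= min_score else None
```

For example, best_equivalent("beetles", ["beatles", "rolling stones", "beach boys"]) selects "beatles" because the two strings share sub-strings such as "tl", "tle", and "tles", whereas a query with no overlapping sub-strings returns None. A production system would likely precompute the weighted vectors for the corpus rather than rebuild them per query.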
8. A system comprising:
a heuristics module for modifying a query string of characters using a set of heuristics;
a comparator module, coupled to the heuristics module, for performing a character-by-character comparison of the modified query string with at least one known string of characters in a corpus in order to find an exact match for the modified query string;
a sub-string formation and information retrieval module for locating an equivalent for the modified query string, responsive to the comparator module not finding an exact match, said sub-string formation module, coupled to the comparator module, for forming a plurality of sub-strings of characters from the query string, the sub-strings having varying lengths such that at least two of the formed sub-strings differ in length; and
said information retrieval module, coupled to the sub-string formation module, for performing an information retrieval technique on the sub-strings formed from the query string to identify a known string of characters equivalent to the query string.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 35, 36
19. A computer-readable medium comprising program code to:
modify a query string of characters using a set of heuristics;
perform a character-by-character comparison of the modified query string with at least one known string of characters in a corpus in order to locate an exact match for the modified query string; and
responsive to not finding an exact match, locate an equivalent for the modified query string by forming a plurality of sub-strings of characters from the query string, the sub-strings having varying lengths such that at least two of the formed sub-strings differ in length, and using an information retrieval technique on the sub-strings formed from the query string to identify a known string of characters equivalent to the query string.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 37, 38
30. A system comprising:
heuristics program code for modifying a query string of characters using a set of heuristics;
a comparator for performing a character-by-character comparison of the modified query string with at least one known string of characters in a corpus in order to locate an exact match for the modified query string;
sub-string formation program code and information retrieval program code for locating an equivalent for the modified query string, responsive to the comparator not finding an exact match, said formation program code for forming a plurality of sub-strings of characters from the query string, the sub-strings having varying lengths such that at least two of the formed sub-strings differ in length; and
said information retrieval program code operating on the sub-strings formed from the query string for identifying a known string of characters equivalent to the query string.
Dependent claims: 31, 32, 33
Specification