System and method for evaluating characters in an inputted search string against a character table bank comprising a predetermined number of columns that correspond to a plurality of pre-determined candidate character sets in order to provide enhanced full text search
Abstract
An evaluator system accepts input textual messages in unknown languages and assesses which character sets, corresponding to languages, match each message. Textual messages whose individual characters are encoded in 16-bit Unicode or another universal format are parsed, the character sets which can express each character are identified, and the accumulated correspondence is logged. When the character sets against which the message is being tested provide only partial matches, the invention can determine which offers the best fit, including by means of a weighting function. The evaluation technology of the invention can be applied to multipart documents, and to search engines and indices. Documents can be indexed according to assigned character sets, and query strings matched to indices according to language.
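The best-fit evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the candidate list `CANDIDATES`, the helper names, and the uniform default weighting are all assumptions made for illustration, and Python's built-in codecs stand in for the candidate character sets.

```python
# Hypothetical candidate character sets, named by their Python codec aliases.
CANDIDATES = ["latin_1", "iso8859_7", "cp1251"]


def can_express(ch, codec):
    """Return True if the character can be encoded in the given character set."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def score_character_sets(message, candidates=CANDIDATES, weight=lambda ch: 1.0):
    """Map each candidate set to the weighted fraction of characters it expresses.

    The weight function is an assumed stand-in for the weighting function
    mentioned in the abstract; by default every character counts equally.
    """
    total = sum(weight(ch) for ch in message)
    if total == 0:
        return {codec: 0.0 for codec in candidates}
    return {
        codec: sum(weight(ch) for ch in message if can_express(ch, codec)) / total
        for codec in candidates
    }


def best_fit(message, candidates=CANDIDATES):
    """Pick the candidate with the highest score (first one wins on a tie)."""
    scores = score_character_sets(message, candidates)
    return max(candidates, key=lambda codec: scores[codec])
```

With these assumptions, a message that mixes Cyrillic and accented Latin text yields only partial matches for every candidate, and `best_fit` returns the set that covers the largest share of the characters.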
118 Citations
37 Claims
-
1. A computer implemented method of evaluating characters in an inputted search string to generate a search index, comprising the steps of:
-
a) accepting an input of the characters of the search string, wherein the characters can be represented in any of a plurality of character sets corresponding to an undetermined language;
b) evaluating the search string by comparing each of the characters of the search string to a plurality of pre-determined candidate character sets to determine one or more matches between the plurality of pre-determined candidate character sets and the search string; and
c) generating the search index by assigning character sets to a code page, wherein the character sets are assigned based on the results of the evaluation of the characters of the search string and the plurality of pre-determined candidate character sets that correspond to the characters of the search string.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
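One possible reading of steps a) to c), combined with the abstract's remark that documents are indexed by assigned character set and query strings are matched to indices, is sketched below in Python. The claim does not specify data structures, so the per-character-set inverted index, the `utf_8` fallback, the whitespace tokenizer, and all function names are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical candidate character sets, tried in order.
CANDIDATES = ["ascii", "iso8859_7", "cp1251"]


def assign_character_set(text):
    """Return the first candidate set that can express every character."""
    for codec in CANDIDATES:
        try:
            text.encode(codec)
            return codec
        except UnicodeEncodeError:
            continue
    return "utf_8"  # assumed fallback when no candidate matches


def build_index(documents):
    """Build one inverted index per assigned character set."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, text in documents.items():
        charset = assign_character_set(text)
        for token in text.lower().split():
            index[charset][token].add(doc_id)
    return index


def search(index, query):
    """Evaluate the query the same way and look only in the matching index."""
    postings = index.get(assign_character_set(query), {})
    hits = None
    for token in query.lower().split():
        found = postings.get(token, set())
        hits = found if hits is None else hits & found
    return hits or set()
```

A limitation of this simplified design is that a document mixing scripts is filed under a single character set, so a query in another set will not find it.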
-
-
9. A computer implemented system for evaluating characters in an inputted search string to generate a search index, comprising:
-
an input interface to accept an input of the characters of the search string, wherein the characters can be represented in any of a plurality of character sets corresponding to an undetermined language; and
a processor unit, connected to the input interface, the processor unit evaluating the search string by comparing each of the characters of the search string to a plurality of pre-determined candidate character sets to determine one or more matches between the plurality of pre-determined candidate character sets and the search string, and generating the search index by assigning character sets to a code page, wherein the character sets are assigned based on the results of the evaluation of the characters of the search string and the plurality of pre-determined candidate character sets that correspond to the characters of the search string.
- View Dependent Claims (10, 11, 12, 13, 14, 15, 16)
-
-
17. A computer implemented system for evaluating characters in an inputted search string to generate a search index, comprising:
-
input interface means to accept an input of the characters of the search string, wherein the characters can be represented in any of a plurality of character sets corresponding to an undetermined language; and
processor means, connected to the input interface means, the processor means evaluating the search string by comparing each of the characters of the search string to a plurality of pre-determined candidate character sets to determine one or more matches between the plurality of pre-determined candidate character sets and the search string, and generating the search index by assigning character sets to a code page, wherein the character sets are assigned based on the results of the evaluation of the characters of the search string and the plurality of pre-determined candidate character sets that correspond to the characters of the search string.
- View Dependent Claims (18, 19, 20, 21, 22, 23, 24)
-
-
25. A computer implemented storage medium for storing machine readable code, the machine readable code being executable to evaluate characters in an inputted electronic search string according to the steps of:
-
a) accepting an input of the characters of the search string, wherein the characters can be represented in any of a plurality of character sets corresponding to an undetermined language;
b) evaluating the search string by comparing each of the characters of the search string to a plurality of pre-determined candidate character sets to determine one or more matches between the plurality of pre-determined candidate character sets and the search string; and
c) generating the search index by assigning character sets to a code page, wherein the character sets are assigned based on the results of the evaluation of the characters of the search string and the plurality of pre-determined candidate character sets that correspond to the characters of the search string.
- View Dependent Claims (26, 27, 28, 29, 30, 31, 32)
-
-
33. A computer implemented method of evaluating characters in an inputted search string to generate a search index, comprising the steps of:
-
a) accepting an input of the characters of the search string, wherein the characters can be represented in any of a plurality of character sets corresponding to an undetermined language;
b) evaluating the search string by comparing each of the characters of the search string to a plurality of pre-determined candidate character sets to determine one or more matches between the plurality of pre-determined candidate character sets and the search string,
wherein each of the characters of the search string is compared to one or more character sets of a character bank by parsing the characters of the search string and identifying the one or more character sets of the character bank that express each of the characters of the search string,
wherein each of the character sets represented in the character bank that correspond to each of the characters of the search string is compared to pre-selected character set indicators of a bit mask to determine a match between each of the character sets represented in the character bank that correspond to the characters of the search string and the character set indicators of the bit mask,
wherein a first column of the character bank corresponds to a first column of the bit mask, and
wherein the first column of the character bank and the first column of the bit mask correspond to the same character set; and
c) generating a search index based on the results of the evaluation of the search string and the plurality of pre-determined candidate character sets.
-
-
34. A computer implemented method of evaluating characters in an inputted search string against a character table bank comprising a predetermined number of columns that correspond to a plurality of pre-determined candidate character sets in order to provide enhanced full text search features, the method comprising:
-
accepting the inputted search string having at least one character;
comparing the at least one character of the inputted search string to the plurality of pre-determined candidate character sets;
filling the columns of the character table bank to indicate whether or not the at least one character of the string is supported by corresponding pre-determined candidate character sets;
creating a bit mask comprising columns equivalent in number to the number of columns in the character table bank;
filling the columns of the bit mask to provide an indication of the plurality of pre-determined character sets against which the filled columns of the character table bank are to be matched;
evaluating the search string by comparing the filled character table bank against the filled columns of the bit mask; and
generating a search index based on the results of the evaluation.
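The character table bank and bit mask recited above can be pictured with a small Python sketch. It rests on assumptions not found in the claim: the column order in `CHARACTER_SETS`, the use of Python codecs to decide whether a set supports a character, the rule that a set matches only if it supports every character, and the dictionary used as a stand-in for the search index.

```python
# One column per candidate character set; the same order is assumed
# for the character table bank and the bit mask.
CHARACTER_SETS = ["latin_1", "iso8859_7", "cp1251"]


def fill_bank(search_string):
    """One row per character; a 1 marks a character set that supports it."""
    bank = []
    for ch in search_string:
        row = []
        for codec in CHARACTER_SETS:
            try:
                ch.encode(codec)
                row.append(1)
            except UnicodeEncodeError:
                row.append(0)
        bank.append(row)
    return bank


def make_mask(selected):
    """Bit mask with as many columns as the bank; 1 means 'test this set'."""
    return [1 if codec in selected else 0 for codec in CHARACTER_SETS]


def evaluate(bank, mask):
    """Return masked character sets whose column is 1 for every character."""
    return [
        codec
        for col, codec in enumerate(CHARACTER_SETS)
        if mask[col] and all(row[col] for row in bank)
    ]


def generate_index(search_string, mask):
    """Toy index: file the string under each matching character set."""
    return {codec: [search_string] for codec in evaluate(fill_bank(search_string), mask)}
```

For the two-character string "aβ", the bank holds the rows [1, 1, 1] and [0, 1, 0], so only the second column survives the comparison when the mask selects it.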
-
-
35. A computer implemented system for evaluating characters in an inputted search string against a character table bank comprising a predetermined number of columns that correspond to a plurality of pre-determined candidate character sets in order to provide enhanced full text search features, the system comprising:
-
an accepting module that accepts the inputted search string having at least one character;
a comparing module that compares the at least one character of the inputted search string to the plurality of pre-determined candidate character sets;
a bank filling module that fills the columns of the character table bank to indicate whether or not the at least one character of the string is supported by corresponding pre-determined candidate character sets;
a creating module that creates a bit mask comprising columns equivalent in number to the number of columns in the character table bank;
a mask filling module that fills the columns of the bit mask to provide an indication of the plurality of pre-determined character sets against which the filled columns of the character table bank are to be matched;
an evaluating module that evaluates the search string by comparing the filled character table bank against the filled columns of the bit mask; and
an index generating module that generates a search index based on the results of the evaluation.
-
-
36. A computer implemented system for evaluating characters in an inputted search string against a character table bank comprising a predetermined number of columns that correspond to a plurality of pre-determined candidate character sets in order to provide enhanced full text search features, the system comprising:
-
accepting means that accepts the inputted search string having at least one character;
comparing means that compares the at least one character of the inputted search string to the plurality of pre-determined candidate character sets;
bank filling means that fills the columns of the character table bank to indicate whether or not the at least one character of the string is supported by corresponding pre-determined candidate character sets;
creating means that creates a bit mask comprising columns equivalent in number to the number of columns in the character table bank;
mask filling means that fills the columns of the bit mask to provide an indication of the plurality of pre-determined character sets against which the filled columns of the character table bank are to be matched;
evaluating means that evaluates the search string by comparing the filled character table bank against the filled columns of the bit mask; and
index generating means that generates a search index based on the results of the evaluation.
-
-
37. A computer implemented storage medium for storing machine readable code, the machine readable code being executable to evaluate characters in an inputted search string against a character table bank comprising a predetermined number of columns that correspond to a plurality of pre-determined candidate character sets in order to provide enhanced full text search features, the method comprising:
-
accepting the inputted search string having at least one character;
comparing the at least one character of the inputted search string to the plurality of pre-determined candidate character sets;
filling the columns of the character table bank to indicate whether or not the at least one character of the string is supported by corresponding pre-determined candidate character sets;
creating a bit mask comprising columns equivalent in number to the number of columns in the character table bank;
filling the columns of the bit mask to provide an indication of the plurality of pre-determined character sets against which the filled columns of the character table bank are to be matched;
evaluating the search string by comparing the filled character table bank against the filled columns of the bit mask; and
generating a search index based on the results of the evaluation.
-
Specification