Method and system for expanding document retrieval information
Abstract
An apparatus for expanding a character string that is entered for a search includes: a character string dividing device to divide the entered character string into partial character strings; a referencing device to reference a similarity table, the similarity table storing in advance similar partial character strings, each of the similar partial character strings being derived from one of the partial character strings by changing at least one of its characters to a different character which is similar in shape; and an expansion device to combine the similar partial character strings into expanded words and store them in a table.
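The divide, reference, and combine stages described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the non-overlapping 2-character split, and the table entries are all assumptions made for illustration only.

```python
from itertools import product

# Hypothetical similarity table: each 2-character partial string maps to
# look-alike variants. The entries are illustrative, not taken from the patent.
SIMILARITY_TABLE = {
    "cl": ["c1"],
    "ip": ["1p", "lp"],
}

def divide(query, n=2):
    """Split the query into consecutive, non-overlapping n-character parts
    (an assumed splitting scheme; a shorter trailing part is kept as-is)."""
    return [query[i:i + n] for i in range(0, len(query), n)]

def lookup(part, table):
    """Return the part itself followed by its similar strings from the table."""
    return [part] + [v for v in table.get(part, []) if v != part]

def expand(query, table, n=2):
    """Combine the similar strings of every part into expanded words."""
    groups = [lookup(part, table) for part in divide(query, n)]
    return ["".join(combo) for combo in product(*groups)]

expanded_word_table = expand("clip", SIMILARITY_TABLE)
print(expanded_word_table)
```

Each expanded word is one choice per partial string, so the number of expanded words is the product of the group sizes (here 2 × 3 = 6, with the original query first).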
20 Claims
1. An apparatus for expanding a character string, wherein the character string is entered to search image information of documents, the apparatus comprising:
a character string dividing device to divide the entered character string into a plurality of partial character strings each having a plurality of characters;
a referencing device to reference a similarity table, the similarity table previously storing groups of similar partial character strings, each of the groups of similar partial character strings being derived from each of the plurality of partial character strings obtained from the character string dividing device by changing at least one of the characters of each partial character string to a different character which is similar in shape; and
an expansion device to combine the plurality of similar partial character strings given by the referencing device into expanded words and store them in an expanded word table.
6. In a system for retrieving a document containing a search character string specified by an operator in a search through text documents that are produced by performing character recognition processing on image documents, a search character string expanding method comprising:
a search character string dividing step of dividing the entered search character string into partial character strings each consisting of a predetermined number n of characters (n≧2);
a similarity table referencing step of checking the n-character partial character strings (n≧2) against an n-character-based similarity table, the n-character-based similarity table being generated in advance by storing character strings of similar character shapes that are highly likely to be erroneously recognized; and
a search character string expanding step of extracting groups of similar character strings by checking the partial character strings making up the search character string against the n-character-based similarity table and combining the extracted similar character strings to generate expanded words.
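As a rough illustration of how an n-character-based similarity table might be prepared in advance, the following Python sketch derives similar strings from a per-character confusion map. The claim does not say how the table is generated; the `CONFUSABLE` map, the function names, and the idea of building entries from single-character look-alikes are assumptions for illustration only.

```python
from itertools import product

# Hypothetical per-character confusion map (characters a recognizer might
# mistake for one another). Illustrative entries only.
CONFUSABLE = {
    "l": ["1"],
    "O": ["0"],
}

def similar_strings(part, confusable):
    """Return every string obtained from `part` by replacing at least one
    character with a look-alike character of similar shape."""
    options = [[ch] + confusable.get(ch, []) for ch in part]
    candidates = ("".join(combo) for combo in product(*options))
    return [c for c in candidates if c != part]

def build_similarity_table(parts, confusable):
    """Precompute the similar strings for each n-character partial string."""
    return {part: similar_strings(part, confusable) for part in parts}

table = build_similarity_table(["lO", "ab"], CONFUSABLE)
print(table)
```

A partial string with no confusable characters, such as `"ab"` here, gets an empty entry, so a lookup step would fall back to the partial string itself.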
11. In a system for retrieving a document containing a search character string specified by an operator in a search through text documents that are produced by performing character recognition processing on image documents, a search character string expanding method comprising:
an expansion method switching step of calculating a length of the search character string and selecting between expanded word generation methods according to the search character string length.
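The switching step can be pictured as a small dispatcher that measures the query and hands it to one of two generation methods. The claim names neither the methods nor the length at which the switch occurs, so in this Python sketch the threshold and both stand-in methods are hypothetical and supplied by the caller.

```python
from typing import Callable, List

# An expanded word generation method maps a query to a list of expanded words.
Expander = Callable[[str], List[str]]

def make_switching_expander(threshold: int,
                            short_method: Expander,
                            long_method: Expander) -> Expander:
    """Build an expander that selects a generation method by query length.
    `threshold` is an assumed parameter, not a value given in the claim."""
    def expand(query: str) -> List[str]:
        length = len(query)  # calculate the search string length
        method = short_method if length <= threshold else long_method
        return method(query)
    return expand

# Stand-in methods, purely illustrative.
def keep_as_is(query: str) -> List[str]:
    return [query]

def add_reversed(query: str) -> List[str]:
    return [query, query[::-1]]

expander = make_switching_expander(3, keep_as_is, add_reversed)
print(expander("ab"), expander("abcd"))
```

Passing the methods in as callables keeps the length-based selection separate from whatever expansion logic each method actually performs.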
16. A program read into and running on a computer to expand a character string, wherein the character string is entered to search image information of documents, the program comprising:
a character string dividing step of dividing the entered character string into a plurality of partial character strings each having a plurality of characters;
a referencing step of referencing a similarity table, the similarity table previously storing groups of similar partial character strings, each of the groups of similar partial character strings being derived from each of the plurality of partial character strings obtained from the character string dividing step by changing at least one of the characters of each partial character string to a different character which is similar in shape; and
an expansion step of combining the plurality of similar partial character strings given by the referencing step into expanded words and storing them in an expanded word table.
Specification