Information processing method and apparatus, and storage medium storing program for practicing this method
Abstract
When a result obtained upon character recognition of an input image is to be used as text data for a search operation, a proper search operation can be performed even if a character different from the actual character image is stored as text data due to a character extraction error in character recognition processing. An information processing apparatus includes an image scanner for inputting image information, OCR software for recognizing the input image, a text information storage section for storing a recognition result, and document search software for assuming addition of an extra character in a designated search word, forming a pattern obtained by deleting a character from the search word, and searching a document using this pattern.
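The idea in the abstract can be illustrated with a minimal Python sketch. It assumes an English-language analogue of the extraction error (OCR reading "m" as the two characters "rn"); the function names `deletion_patterns` and `loose_contains` are hypothetical and do not come from the patent.

```python
def deletion_patterns(word):
    """Return the distinct strings obtained by deleting exactly one character."""
    patterns = []
    for i in range(len(word)):
        candidate = word[:i] + word[i + 1:]
        if candidate not in patterns:  # e.g. "book" yields "bok" only once
            patterns.append(candidate)
    return patterns


def loose_contains(text, word):
    """True if the word, or any one-character-deleted pattern of it, occurs in text."""
    return word in text or any(p in text for p in deletion_patterns(word))


# Hypothetical OCR output in which the first "m" of "modem" was split into "rn".
ocr_text = "the rnodem is fast"
print("modem" in ocr_text)                # exact search misses the word
print(loose_contains(ocr_text, "modem"))  # the pattern "odem" still matches
```

Shorter patterns match more text than the original word does, so a search of this kind is likely to need a further check on each candidate before it is reported as a result.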
38 Claims
1. An information processing method of determining whether a designated character train is included in text information, said method comprising the steps of:
forming a pattern including a first number of characters in which at least one character is deleted from the designated character train;
dividing the pattern into a plurality of test character trains each including a predetermined second number of characters of the pattern without any other characters, the second number being less than the first number;
performing a first determination that determines whether the test character trains are included in the text information;
performing a second determination that determines whether the first number of characters are included in the text information for which the first determination has determined that the test character trains are included; and
controlling output of the text information, for which the second determination has determined that the first number of characters are included, as a search operation result.
2. A method according to claim 1, wherein said forming step comprises forming a plurality of types of patterns such that, for each of the plurality of types of patterns, one character is deleted from the designated character train, and said step of performing the first determination comprises performing a determination using the plurality of types of patterns.
3. A method according to claim 1, wherein the character deleted from the designated character train is defined as a character corresponding to a predetermined character.
4. A method according to claim 1, wherein the character deleted from the designated character train is defined as a character not corresponding to a predetermined character.
5. A method according to claim 1, wherein the character deleted from the designated character train is determined by a prestored table.
6. A method according to claim 1, wherein the text information is a result of character recognition of input image information.
7. A method according to claim 1, wherein the second determination step allows an addition of arbitrary characters numbering less than a predetermined number at a position from which the at least one character was deleted.
8. A method according to claim 1, wherein the first determination is executed by determining whether each character of each test character train is identical to that of the text information.
9. A method according to claim 1, further comprising the step of outputting text information determined to include the designated character train.
10. A method according to claim 6, further comprising the step of outputting image information corresponding to text information determined to include the designated character train.
11. A method according to claim 1, wherein the second determination step allows an addition of arbitrary characters numbering less than a predetermined number at any position without reference to a position from which the at least one character was deleted.
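Read together, the steps of claim 1 suggest a two-stage filter. The sketch below is one possible reading in Python, not the patented implementation: it assumes one-character deletions, overlapping test character trains, and plain substring tests for both determinations. The names `deletion_patterns`, `divide`, and `search` are hypothetical.

```python
def deletion_patterns(word):
    """Patterns formed by deleting exactly one character (duplicates removed)."""
    patterns = []
    for i in range(len(word)):
        candidate = word[:i] + word[i + 1:]
        if candidate not in patterns:
            patterns.append(candidate)
    return patterns


def divide(pattern, n):
    """Divide a pattern into overlapping test character trains of n characters."""
    return [pattern[i:i + n] for i in range(len(pattern) - n + 1)]


def search(documents, word, n=2):
    """Return the documents that pass both determinations for some pattern."""
    results = []
    for doc in documents:
        for pattern in deletion_patterns(word):
            if n >= len(pattern):
                continue  # the second number must be less than the first number
            # First determination: every test character train occurs somewhere.
            if not all(train in doc for train in divide(pattern, n)):
                continue
            # Second determination: the whole pattern occurs contiguously.
            if pattern in doc:
                results.append(doc)
                break
    return results


# Hypothetical documents: the second one contains "od", "de" and "em" separately,
# so it passes the first determination but fails the second.
docs = ["the rnodem is fast", "od de em", "unrelated text"]
print(search(docs, "modem"))
```

In this reading the first determination acts as a coarse filter and the second confirms the candidates that survive it. The claims do not say how the first determination is carried out, so the substring test used here is only a stand-in.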
12. A computer-readable storage medium storing a control program for performing a determination to determine whether a designated character train is included in text information, said program comprising:
program code for a formation step of forming a pattern including a first number of characters in which at least one character is deleted from the designated character train;
program code for a division step of dividing the pattern into a plurality of test character trains each including a predetermined second number of characters of the pattern without any other characters, the second number being less than the first number;
program code for a first determination step of performing a first determination that determines whether the test character trains are included in the text information;
program code for a second determination step of performing a second determination that determines whether the first number of characters are included in the text information for which the first determination step has determined that the test character trains are included; and
program code for a control step of controlling output of the text information, for which the second determination step has determined that the first number of characters are included, as a search operation result.
Dependent claims: 13, 14, 15, 36, 37, 38
16. An information processing method of determining whether a designated character train is included in text information, said method comprising the steps of:
forming a pattern including a first number of characters in which at least one character is deleted from the designated character train;
dividing the pattern into a plurality of test character trains each including a predetermined second number of characters of the pattern without any other characters, the second number being less than the first number;
performing a first determination that determines whether the test character trains are included in the text information under an assumption that another character is inserted at a position of character deletion in the formed pattern;
performing a second determination that determines whether the first number of characters are included in the text information for which the first determination has determined that the test character trains are included; and
controlling output of the text information, for which the second determination has determined that the first number of characters are included, as a search operation result.
Dependent claims: 17, 18, 19, 20, 21, 22, 23
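Claim 16 differs from claim 1 in that the first determination assumes another character sits at the deletion position. One way to read this, sketched below in Python, is that no test character train may span the gap left by the deleted character. The gap-tolerant second determination borrows the wording of claim 7 ("arbitrary characters numbering less than a predetermined number") and is an assumption here, as are the names `gapped_patterns`, `trains_avoiding_gap`, `matches`, and the parameter `limit`.

```python
import re


def gapped_patterns(word):
    """(left, right) pairs: the word with one character removed, split at the gap."""
    return [(word[:i], word[i + 1:]) for i in range(len(word))]


def trains_avoiding_gap(left, right, n):
    """n-character trains taken from each side separately, never across the gap."""
    trains = []
    for part in (left, right):
        trains += [part[i:i + n] for i in range(len(part) - n + 1)]
    return trains


def matches(text, word, n=2, limit=3):
    """True if some gapped pattern of word passes both determinations in text."""
    for left, right in gapped_patterns(word):
        trains = trains_avoiding_gap(left, right, n)
        if not trains:
            continue  # both sides are shorter than n, so nothing can be tested
        # First determination: trains on either side of the gap must all occur.
        if not all(train in text for train in trains):
            continue
        # Second determination: fewer than `limit` arbitrary characters may
        # stand where the character was deleted.
        gap = ".{0,%d}" % (limit - 1)
        if re.search(re.escape(left) + gap + re.escape(right), text):
            return True
    return False


# Hypothetical OCR output in which "d" was read as the two characters "cl".
print(matches("the moclem is fast", "modem"))
print(matches("mo and em apart", "modem"))
```

With `limit=3`, up to two characters may occupy the gap, so "moclem" is accepted while "mo and em" is rejected because five characters separate the two halves.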
24. An information processing apparatus comprising:
a text information storage memory for storing text information;
a first determination unit for determining whether a designated character train is included in the text information, wherein said first determination unit determines whether the test character trains are included in the text information;
a pattern forming unit for forming a pattern including a first number of characters in which at least one character is deleted from the designated character train;
a second determination unit for determining whether the first number of characters is included in the text information for which said first determination unit has determined to include the test character trains; and
a controller for outputting the text information, for which said second determination unit has determined to include the first number of characters, as a search operation result.
25. An apparatus according to claim 24, wherein said pattern forming unit forms a plurality of types of patterns such that, for each of the plurality of types of patterns, one character is deleted from the designated character train, and said first determination unit performs a determination using the plurality of types of patterns.
26. An apparatus according to claim 24, wherein the character deleted from the designated character train by said pattern forming unit is defined as a character corresponding to a predetermined character.
27. An apparatus according to claim 24, wherein the character deleted from the designated character train by said pattern forming unit is defined as a character not corresponding to a predetermined character.
28. An apparatus according to claim 24, wherein the character deleted from the designated character train by said pattern forming unit is determined by a prestored table.
29. An apparatus according to claim 24, wherein the text information is a result of character recognition of input image information.
30. An apparatus according to claim 24, further comprising a pattern divider for dividing the pattern into character trains each including a predetermined number of characters,
wherein said first determination unit performs a determination in accordance with whether the divided character trains are included in the text information.
31. An apparatus according to claim 24, wherein said first determination unit determines whether each character of the character train is identical to that of the text information.
32. An apparatus according to claim 24, further comprising a text information output unit for outputting text information determined by said first determination unit to include the designated character train.
33. An apparatus according to claim 29, further comprising an image information output unit for outputting image information corresponding to text information determined by said first determination unit to include the designated character train.
34. An apparatus according to claim 24, wherein said second determination unit allows an addition of arbitrary characters numbering less than a predetermined number at a position from which the at least one character was deleted.
35. An apparatus according to claim 24, wherein said second determination unit allows an addition of arbitrary characters numbering less than a predetermined number at any position without reference to where the at least one character was deleted.
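Claims 26 to 28 restrict which character the pattern forming unit deletes, for example by consulting a prestored table. A minimal Python sketch of such a table-driven pattern former follows. The table contents (`SPLIT_PRONE`) and the function name `table_driven_patterns` are hypothetical; the patent text reproduced here does not say what the table holds.

```python
# Hypothetical prestored table: characters assumed to be prone to being
# extracted as two characters by OCR (e.g. "m" read as "rn").
SPLIT_PRONE = frozenset({"m", "w", "d"})


def table_driven_patterns(word, table=SPLIT_PRONE, invert=False):
    """Return (pattern, position) pairs, deleting only the selected characters.

    With invert=False a character is deleted if it is in the table (the
    claim 26 reading); with invert=True a character is deleted if it is NOT
    in the table (the claim 27 reading).
    """
    pairs = []
    for position, char in enumerate(word):
        if (char in table) != invert:
            pairs.append((word[:position] + word[position + 1:], position))
    return pairs


print(table_driven_patterns("modem"))
print(table_driven_patterns("modem", invert=True))
```

Restricting deletions to table entries produces fewer patterns than deleting every position in turn, which would presumably reduce both search time and spurious matches.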
Specification