Method and apparatus for determining form sheet type
Abstract
A form sheet type determining method and apparatus for determining to which of a plurality of predetermined form sheet types an input form sheet corresponds. A plurality of sets of keywords are registered in a keyword register, with one set of keywords for each predetermined form sheet type. Image data of an input form sheet is read, character strings are extracted from the read image data, and character recognition is performed on each extracted character string. Each character-recognized string is extracted as a keyword. The extracted keywords are collated, for each form sheet type, with the sets of keywords registered in the keyword register, thereby determining the type of the input form sheet.
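The flow described in the abstract can be illustrated with a minimal Python sketch. It assumes character recognition has already produced a list of strings, and it scores each form sheet type by the number of shared keywords; the abstract does not prescribe a scoring rule, so that choice, the form type names, and the function names are all hypothetical.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical keyword register: one set of keywords per form sheet type.
KEYWORD_REGISTER: Dict[str, Set[str]] = {
    "invoice": {"invoice", "amount due", "bill to"},
    "purchase_order": {"purchase order", "ship to", "quantity"},
}


def extract_keywords(recognized_strings: Iterable[str]) -> Set[str]:
    """Treat every character-recognized string as a keyword (trimmed, lowercased)."""
    return {s.strip().lower() for s in recognized_strings if s.strip()}


def determine_form_type(
    recognized_strings: Iterable[str],
    register: Dict[str, Set[str]] = KEYWORD_REGISTER,
) -> Optional[str]:
    """Collate extracted keywords with each registered set.

    Assumption: the score is the count of shared keywords. Returns None
    when no registered keyword appears on the input sheet.
    """
    keywords = extract_keywords(recognized_strings)
    best_type, best_hits = None, 0
    for form_type, registered in register.items():
        hits = len(keywords & registered)
        if hits > best_hits:
            best_type, best_hits = form_type, hits
    return best_type
```

For example, `determine_form_type(["Invoice", " Bill To ", "Total"])` selects the hypothetical `"invoice"` type because two of its registered keywords appear on the sheet.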
22 Claims
1. A form sheet type determining method comprising the steps of:
extracting each character string on an input form sheet as a keyword, after performing character recognition on the each character string; and
collating the extracted keywords with a plurality of sets of keywords registered beforehand for each predetermined form sheet as one set of keywords in a keyword register, thereby to determine the type of said input form sheet,
wherein each keyword in each set of keywords registered beforehand is registered in said keyword register in association with a predetermined corresponding weight, and
wherein in said step of collating, each of said extracted keywords of said input form sheet is given a weight;
the degree of matching between said input form sheet and said predetermined form sheet types is evaluated for each predetermined form sheet type by using said weights of said extracted keywords and said predetermined weights of the keywords in each set of said form sheet types within said keyword register; and
one of said predetermined form sheet types having the highest degree of matching is determined to be the type of the input form sheet.
Dependent claims: 2, 3
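The weighted collation recited in claim 1 might be sketched as follows. The claim states only that both the weights of the extracted keywords and the predetermined register weights are used; combining them as a sum of products is an assumption made here for illustration, and the register contents and function names are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical register: form sheet type -> {keyword: predetermined weight}.
REGISTER: Dict[str, Dict[str, float]] = {
    "invoice": {"invoice": 3.0, "amount due": 2.0, "date": 0.5},
    "receipt": {"receipt": 3.0, "paid": 2.0, "date": 0.5},
}


def degree_of_matching(
    extracted: Dict[str, float], registered: Dict[str, float]
) -> float:
    """Assumed scoring rule: sum over matching keywords of
    (weight given to the extracted keyword) * (predetermined register weight)."""
    return sum(w * registered[k] for k, w in extracted.items() if k in registered)


def classify(
    extracted: Dict[str, float],
    register: Dict[str, Dict[str, float]] = REGISTER,
) -> Tuple[str, Dict[str, float]]:
    """Evaluate every form sheet type and return the one with the highest score."""
    scores = {t: degree_of_matching(extracted, kws) for t, kws in register.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

With extracted keywords `{"invoice": 1.0, "date": 1.0, "paid": 0.5}`, the hypothetical `"invoice"` type scores 3.5 and `"receipt"` scores 1.5, so `"invoice"` is selected.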
4. A form sheet type determining method for determining to which of predetermined form sheet types an input form sheet corresponds, comprising the steps of:
registering a plurality of sets of keywords beforehand in a keyword register with one set of keywords for each of predetermined form sheet types;
reading image data of an input form sheet, extracting character strings from the read image data, and performing character recognition on each of the extracted character strings;
extracting each of said character-recognized character strings as a keyword;
collating said extracted keywords, for each of the form sheet types, with said plurality of sets of keywords registered in said register, thereby to determine the type of said input form sheet,
wherein in said keyword register, said each keyword in said sets of keywords is registered in association with a predetermined corresponding weight, and
wherein in said step of collating, each of said extracted keywords of said input form sheet is attached with a weight;
the degree of matching between said input form sheet and said predetermined form sheet types is evaluated for each predetermined form sheet type by using said weights of said extracted keywords and said predetermined weights of the keywords in each set of said form sheet types within said keyword register; and
one of said predetermined form sheet types having the highest degree of matching is determined to be the type of the input form sheet.
Dependent claims: 5, 6, 7, 8, 9
10. A form sheet type determining apparatus for determining to which of predetermined form sheet types an input form sheet corresponds, comprising:
a keyword register which stores therein a plurality of sets of keywords, one set for each of predetermined form sheet types;
a character recognition unit which reads image data of an input form sheet, extracts character strings from the read image data, and performs character recognition on each character string extracted;
a keyword extraction unit which extracts as a keyword each of the character strings character-recognized by the character recognition unit;
a collator which collates said extracted keywords, for each predetermined form sheet type, with each set of keywords of said plurality of sets of keywords registered in said keyword register to thereby determine the type of said input form sheet,
wherein in said collator each of said extracted keywords is given a weight based on a type of characters constituting the extracted keyword.
Dependent claims: 11
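Claim 10 weights each extracted keyword by the type of characters it contains. The claim does not say which character types are distinguished or what weights they receive, so the following Python sketch assumes three hypothetical categories (alphabetic, numeric, mixed) and illustrative weight values.

```python
def character_type_weight(keyword: str) -> float:
    """Return a weight based on the kinds of characters in the keyword.

    Assumed categories and values (not taken from the claim):
      - only letters -> 1.0 (likely a printed field label)
      - only digits  -> 0.2 (likely filled-in data such as an amount)
      - anything else -> 0.6 (mixed letters, digits, punctuation)
    Whitespace is ignored; an empty keyword gets weight 0.0.
    """
    chars = [c for c in keyword if not c.isspace()]
    if not chars:
        return 0.0
    if all(c.isdigit() for c in chars):
        return 0.2
    if all(c.isalpha() for c in chars):
        return 1.0
    return 0.6
```

One plausible rationale for such a scheme is that purely numeric strings tend to be entered data rather than preprinted labels, so they say less about the form sheet type; that motivation is an assumption here.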
12. A form sheet type determining apparatus for determining to which of predetermined form sheet types an input form sheet corresponds, comprising:
a keyword register which stores therein a plurality of sets of keywords, one set for each of predetermined form sheet types;
a character recognition unit which reads image data of an input form sheet, extracts character strings from the read image data, and performs character recognition on each character string extracted;
a keyword extraction unit which extracts as a keyword each of the character strings character-recognized by the character recognition unit;
a collator which collates said extracted keywords, for each predetermined form sheet type, with each set of keywords of said plurality of sets of keywords registered in said keyword register to thereby determine the type of said input form sheet,
wherein in said collator each of said extracted keywords is given a weight in accordance with a location of the keyword on said input form sheet.
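Claim 12 assigns each extracted keyword a weight according to its location on the input form sheet. A minimal sketch, assuming that keywords nearer the top of the page (where titles usually appear) count more and that the weight falls linearly from 1.0 to 0.5 down the page, could look like this. The `LocatedKeyword` type, the coordinate convention, and the linear rule are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LocatedKeyword:
    text: str
    x: float  # left edge of the bounding box, in page units
    y: float  # top edge of the bounding box; 0 is the top of the page


def location_weight(kw: LocatedKeyword, page_height: float) -> float:
    """Assumed rule: weight 1.0 at the top of the page, 0.5 at the bottom."""
    if page_height <= 0:
        raise ValueError("page_height must be positive")
    # Clamp the relative vertical position into [0, 1].
    rel = min(max(kw.y / page_height, 0.0), 1.0)
    return 1.0 - 0.5 * rel
```

A keyword halfway down a 1000-unit page, `LocatedKeyword("Total", 40, 500)`, would receive a weight of 0.75 under this rule.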
13. A form sheet type determining apparatus for determining to which of predetermined form sheet types an input form sheet corresponds, comprising:
a keyword register which stores therein a plurality of sets of keywords, one set for each of predetermined form sheet types;
a character recognition unit which reads image data of an input form sheet, extracts character strings from the read image data, and performs character recognition on each character string extracted;
a keyword extraction unit which extracts as a keyword each of the character strings character-recognized by the character recognition unit;
a collator which collates said extracted keywords, for each predetermined form sheet type, with each set of keywords of said plurality of sets of keywords registered in said keyword register to thereby determine the type of said input form sheet,
wherein in said register each keyword in each set of keywords is registered in association with a corresponding keyword-specific weight for each form sheet type.
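The register of claim 13 stores a keyword-specific weight for each form sheet type, so the same keyword can carry different weights in different sets. One way to model that is sketched below; the `KeywordRegister` class and its method names are hypothetical and are not drawn from the patent.

```python
from collections import defaultdict
from typing import Dict, List


class KeywordRegister:
    """Maps (form sheet type, keyword) to a weight specific to that pair."""

    def __init__(self) -> None:
        self._weights: Dict[str, Dict[str, float]] = defaultdict(dict)

    def register(self, form_type: str, keyword: str, weight: float) -> None:
        """Store the keyword in the set for form_type with its own weight."""
        self._weights[form_type][keyword.lower()] = weight

    def weight(self, form_type: str, keyword: str) -> float:
        """Return the registered weight, or 0.0 if the pair is not registered."""
        return self._weights.get(form_type, {}).get(keyword.lower(), 0.0)

    def form_types(self) -> List[str]:
        return list(self._weights)
```

For instance, "Date" might be registered with weight 0.5 for a hypothetical invoice type, where it is incidental, and with weight 2.0 for a hypothetical timesheet type, where it is more characteristic.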
14. A form sheet type determining apparatus for determining to which of predetermined form sheet types an input form sheet corresponds, comprising:
a keyword register which stores therein a plurality of sets of keywords, one set for each of predetermined form sheet types;
a character recognition unit which reads image data of an input form sheet, extracts character strings from the read image data, and performs character recognition on each character string extracted;
a keyword extraction unit which extracts as a keyword each of the character strings character-recognized by the character recognition unit;
a collator which collates said extracted keywords, for each predetermined form sheet type, with each set of keywords of said plurality of sets of keywords registered in said keyword register to thereby determine the type of said input form sheet,
wherein in said register each keyword in each set of keywords is registered in association with a predetermined weight, and
wherein in said collator, each of said extracted keywords is attached with a weight, and said collator evaluates, for each form sheet type, the degree of matching between said input form sheet and said predetermined form sheet types by using said weights of said extracted keywords and said predetermined weight of each keyword in each set of said keywords within said keyword register to thereby decide that a form sheet type having a highest degree of matching is the form sheet type of said input form sheet.
Dependent claims: 15, 16, 17, 18, 19, 20, 21
22. A computer program product comprising:
a computer usable medium having computer readable program code means embodied in said medium for determining whether an input form sheet is one of predetermined form sheet types, said computer readable program code means comprising;
means for registering a plurality of sets of keywords for each of predetermined form sheet types as a set of keywords beforehand in a keyword register;
means for reading image data of said input form sheet, extracting character strings from the read image data, and performing character recognition on each of the extracted character strings; and
collating means for collating, for each form sheet type, said extracted keywords with said sets of keywords registered in said keyword register, thereby to determine the type of said input form sheet,
wherein in said register means, each keyword in said sets of keywords is registered in association with a predetermined corresponding weight, and said collating means evaluates, for each form sheet type, the degree of matching between said input form sheet and said predetermined form sheet types by using the weights given to each of said extracted keywords and said predetermined weights of the keywords in each set of said keywords within said keyword register to thereby decide that a form sheet type having a highest degree of matching is the form sheet type of said input form sheet.
Specification