AUTOMATIC FILE NAME GENERATION IN OCR SYSTEMS
Abstract
Methods and a system for processing document images in OCR systems, in particular for selecting a proper file name for a recognized document. The method comprises generating at least one document type hypothesis for the document; verifying each document type hypothesis; selecting a best document type hypothesis; and saving the document with a proper name based on the best document type hypothesis and unique features of the document. The method further includes determining a logical structure of the document and selecting a best document model hypothesis, namely the one that has the best degree of correspondence with the selected best block hypotheses for the document. On the basis of the best document model hypothesis, a text document reflecting the logical structure of the source document is formed in an extended computer-editable format and saved with a proper file name.
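The naming flow summarized in the abstract (generate type hypotheses, verify them, select the best one, build a name from the type and the unique features) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `DocumentType` class, the two sample types, their keywords and regular expressions, the keyword-fraction score, and the helper names `generate_hypotheses`, `verify` and `propose_file_name`. The sketch works on already recognized text and only proposes a file name; it does not write anything to a memory device.

```python
import re
from dataclasses import dataclass, field


@dataclass
class DocumentType:
    """Hypothetical description of one document type (not from the patent)."""
    name: str
    keywords: list            # words assumed to be distinctive for the type
    unique_patterns: dict = field(default_factory=dict)  # label -> regex with one group


# Assumed sample types; a real system would have its own catalogue.
DOCUMENT_TYPES = [
    DocumentType("invoice", ["invoice", "amount due", "bill to"],
                 {"number": r"invoice\s*(?:no\.?|#)\s*(\w+)"}),
    DocumentType("contract", ["agreement", "party", "hereinafter"],
                 {"date": r"dated\s+(\d{4}-\d{2}-\d{2})"}),
]


def generate_hypotheses(text):
    """Propose every type for which at least one keyword occurs in the text."""
    lowered = text.lower()
    return [t for t in DOCUMENT_TYPES if any(k in lowered for k in t.keywords)]


def verify(text, doc_type):
    """Score a hypothesis and collect features unique to this document."""
    lowered = text.lower()
    hits = sum(1 for k in doc_type.keywords if k in lowered)
    score = hits / len(doc_type.keywords)  # assumed scoring rule
    features = {}
    for label, pattern in doc_type.unique_patterns.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            features[label] = match.group(1)
    return score, features


def propose_file_name(text, extension=".txt"):
    """Pick the best-scoring type and build a name from it and its features."""
    candidates = [(verify(text, t), t) for t in generate_hypotheses(text)]
    if not candidates:
        return "document" + extension  # assumed fallback when no type matches
    (score, features), best = max(candidates, key=lambda c: c[0][0])
    # Replace characters that are unsafe in file names with underscores.
    parts = [best.name] + [re.sub(r"[^\w-]", "_", v) for v in features.values()]
    return "_".join(parts) + extension


if __name__ == "__main__":
    sample = "INVOICE No. 4711\nBill to: ACME\nAmount due: 100"
    print(propose_file_name(sample))
```

In this sketch the type name supplies the stable part of the file name and the extracted features (an invoice number, a contract date) distinguish documents of the same type from one another. The keyword-fraction score is only a stand-in for whatever verification measure an actual OCR system would use.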
42 Claims
1-20. (canceled)
21. A method for processing a document comprising:
analyzing the document to determine a type of the document comprising generating at least one document type hypothesis; verifying the at least one document type hypothesis by searching for features, keywords, and structural elements that are distinctive for the type of the document and by searching for features which are unique for the document; selecting a best document type hypothesis from the at least one document type hypothesis; and saving in a memory device the document with a name based on the best document type hypothesis and the found unique features. - View Dependent Claims (22, 23, 24, 25, 26, 27)
28. A system for processing a document comprising:
at least one memory; and at least a processor controlled by stored programmed instructions in the at least one memory to perform functions of analyzing the document to determine a type of the document comprising generating at least one document type hypothesis; verifying the at least one document type hypothesis by searching for features, keywords, and structural elements that are distinctive for the type of the document and by searching for features which are unique for the document; selecting a best document type hypothesis from the at least one document type hypothesis; and saving in a memory device the document with a name based on the best document type hypothesis and the found unique features. - View Dependent Claims (29, 30, 31, 32, 33, 34, 35)
36. A computer readable medium containing a computer program product for processing a document, the computer program product comprising:
program code for analyzing the document to determine a type of the document comprising program code for generating at least one document type hypothesis; program code for verifying the at least one document type hypothesis by searching for features, keywords, and structural elements that are distinctive for the type of the document and by searching for features which are unique for the document; program code for selecting a best document type hypothesis from the at least one document type hypothesis; and program code for saving in a memory device the document with a name based on the best document type hypothesis and the found unique features. - View Dependent Claims (37, 38, 39, 40, 41, 42)
Specification