Fast Key-In for Machine-Printed OCR-based Systems
First Claim
1. A method for correcting results of optical character recognition (OCR) comprising:
scanning a document image;
performing OCR classification on the document image;
clustering symbols from the OCR classification based on shapes of the symbols;
creating super symbols based on first differences in the shapes of the clustered symbols exceeding a first threshold and that provides a display that emphasizes localized differences in similar symbols;
displaying a carpet of super symbols for analysis testing;
storing the clustered symbols when the carpet of super symbols passes all of the analysis testing;
creating additional sub groups of super symbols based on second differences in the shapes of clustered symbols exceeding a second threshold and returning to displaying step when the carpet of super symbols passes most of the analysis testing;
rejecting clustered symbols when the carpet of super symbols fails most of the analysis testing; and
storing rejected clustered symbols by manually keying-in the symbols.
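The three outcomes recited in the claim (store, sub-group and redisplay, reject and key in) can be read as a simple triage on the analysis-test results for one carpet. The following is a minimal sketch of that reading, not the patented implementation. The names `Outcome` and `triage` are hypothetical, and the claim does not define "most": the sketch assumes it means a strict majority and, as a further assumption, sends an exact tie to manual key-in.

```python
from enum import Enum


class Outcome(Enum):
    STORE = "store"          # carpet passed all analysis tests
    SUBDIVIDE = "subdivide"  # passed most: create sub groups, display again
    REJECT = "reject"        # failed most: key in the symbols manually


def triage(passed: int, total: int) -> Outcome:
    """Map the test results for one carpet to one of the claim's branches.

    Assumption: "most" means a strict majority of the tests. A tie is
    treated as a rejection here; the claim does not cover that case.
    """
    if total <= 0 or not 0 <= passed <= total:
        raise ValueError("need 0 <= passed <= total and total > 0")
    if passed == total:
        return Outcome.STORE
    if passed * 2 > total:
        return Outcome.SUBDIVIDE
    return Outcome.REJECT
```

For example, under these assumptions a carpet that passes 3 of 4 tests would be sub-grouped using the second threshold and displayed again, while one that passes 1 of 4 would be rejected and keyed in by hand.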
Abstract
A method for correcting results of OCR or other scanned symbols initially scans a document and performs OCR classification on the scanned document. The character/symbol classifications that result from the OCR are clustered based on shape, and super symbols are then created based on at least first differences in the shapes of the clustered characters/symbols exceeding a first threshold. A carpet of the super symbols, which provides a display that emphasizes localized differences in similar symbols, is displayed for analysis testing. Depending on the results of the analysis testing, the carpet of super symbols is one of: (1) stored as clustered symbols when the carpet of super symbols passes all of the analysis testing; (2) used to create additional super symbols based on at least a second difference in the shapes of the clustered symbols exceeding a second threshold, with the additional super symbols returned to the displaying step for further analysis testing; or (3) rejected as a cluster of symbols when the super symbols fail most of the analysis testing, and stored by manually keying in the symbols.
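The abstract relies on two ideas: grouping symbols whose shapes differ by no more than a threshold, and showing where similar symbols differ. The sketch below illustrates both on tiny binary bitmaps. It is an illustration under stated assumptions, not the method of the patent: the text above does not say how shape difference is measured or how a super symbol is built, so the pixel-count distance, the greedy grouping, and the names `shape_difference`, `cluster_by_shape`, and `difference_map` are all hypothetical.

```python
from typing import List

Bitmap = List[List[int]]  # rows of 0/1 pixels; glyphs assumed equal in size


def shape_difference(a: Bitmap, b: Bitmap) -> int:
    """Count the pixels at which two glyphs differ (assumed distance measure)."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def cluster_by_shape(glyphs: List[Bitmap], threshold: int) -> List[List[int]]:
    """Greedily group glyph indices by shape.

    A glyph joins the first cluster whose first member differs from it by at
    most `threshold` pixels; otherwise it starts a new cluster.
    """
    clusters: List[List[int]] = []
    for index, glyph in enumerate(glyphs):
        for members in clusters:
            if shape_difference(glyphs[members[0]], glyph) <= threshold:
                members.append(index)
                break
        else:
            clusters.append([index])
    return clusters


def difference_map(a: Bitmap, b: Bitmap) -> Bitmap:
    """Mark with 1 each pixel where two glyphs differ, to localize the difference."""
    return [[int(pa != pb) for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]
```

With a looser threshold, near-identical shapes fall into one cluster; tightening the threshold splits them apart, which loosely mirrors the step of creating additional groups with a second threshold. The difference map shows which pixels an operator would need to look at to tell two similar symbols apart.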
Specification