Fast key-in for machine-printed OCR-based systems
First Claim
1. A method for correcting results of optical character recognition (OCR) comprising:
scanning a document image;
performing OCR classification on the document image;
clustering symbols from the OCR classification based on shapes of the symbols;
creating super symbols based on first differences in the shapes of the clustered symbols exceeding a first threshold and that provides a display that emphasizes localized differences in similar symbols;
displaying a carpet of super symbols for analysis testing;
and performing one of:
storing the clustered symbols when the carpet of super symbols passes all of the analysis testing;
creating additional sub groups of super symbols based on second differences in the shapes of the clustered symbols exceeding a second threshold and returning to the displaying step when the carpet of super symbols passes most of the analysis testing; and
rejecting the clustered symbols and storing the rejected clustered symbols by manually keying-in the symbols when the carpet of super symbols fails most of the analysis testing.
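The three-way branch at the end of the claim can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `Outcome` and `decide` are hypothetical, and the claim does not define "most", so the sketch assumes a simple majority cut-off (`majority=0.5`) over a list of pass/fail test results.

```python
from enum import Enum


class Outcome(Enum):
    STORE = "store"          # carpet passed all of the analysis testing
    SUBDIVIDE = "subdivide"  # carpet passed most: regroup with the second threshold
    REJECT = "reject"        # carpet failed most: key in the symbols manually


def decide(test_results, majority=0.5):
    """Map pass/fail analysis-test results onto the three claimed branches.

    `majority` is an assumed cut-off; the claim itself says only "most".
    """
    if not test_results:
        raise ValueError("at least one analysis test result is required")
    passed = sum(1 for result in test_results if result)
    fraction = passed / len(test_results)
    if fraction == 1.0:
        return Outcome.STORE
    if fraction > majority:
        return Outcome.SUBDIVIDE
    return Outcome.REJECT
```

Under these assumptions, a carpet that passes two of three tests would be regrouped and displayed again, while one that passes only one of three would be rejected for manual key-in.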
Abstract
A method for correcting the results of OCR or other scanned symbols. A document is first scanned and OCR classification is performed on it. The character/symbol classifications resulting from the OCR are clustered based on their shapes. Super-symbols are created based on at least a first difference in the shapes of the clustered characters/symbols exceeding a first threshold. A carpet of super-symbols, emphasizing localized differences in similar symbols, is displayed for analysis testing. Depending on the results of the analysis testing, one of the following is performed: (1) the clustered symbols are stored when the carpet of super-symbols passes all of the analysis testing; (2) additional super-symbols are created based on at least a second difference in the shapes of the clustered symbols exceeding a second threshold, and the method returns to analysis testing, when the carpet of super-symbols passes most of the analysis testing; or (3) the clustered symbols are rejected and the symbols are keyed in manually when the carpet of super-symbols fails most of the analysis testing.
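The shape-based grouping with two thresholds can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: symbols are represented as small binary bitmaps, the shape difference is taken to be a pixel-wise Hamming count, and the greedy grouping in the hypothetical `split_by_shape` function is one simple choice among many. The abstract does not specify any of these details.

```python
def shape_difference(a, b):
    """Count the pixels that differ between two equally sized binary bitmaps.

    Hamming distance is an assumed metric; the source names none.
    """
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(a, b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def split_by_shape(symbols, threshold):
    """Greedily group bitmaps by shape.

    A symbol starts a new group when its difference from the first
    member of every existing group exceeds `threshold`.
    """
    groups = []
    for symbol in symbols:
        for group in groups:
            if shape_difference(symbol, group[0]) <= threshold:
                group.append(symbol)
                break
        else:
            # No existing group was close enough, so open a new one.
            groups.append([symbol])
    return groups
```

Calling `split_by_shape` a second time with a smaller threshold on a group that only passed most of the tests plays the role of the second, finer subdivision described in option (2).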
Specification