Iterative process for optimizing optical character recognition
First Claim
1. A computer-implemented method for optimizing parameter settings for an optical character recognition process, comprising:
receiving a population comprising at least two sets of parameter settings;
receiving an image comprising known text;
for each set of parameter settings in the population, determining an accuracy score by:
executing a set of pre-recognition functions using the set of parameter settings to modify the image according to the parameter settings to produce a modified image;
executing an optical character recognition function on the known text in the modified image to produce an output; and
testing the output to determine the accuracy score for the set of parameter settings by comparing the known text to the output;
selecting a first set of parameter settings from the population based on the determined accuracy scores;
receiving a second image separate from the image;
executing the set of pre-recognition functions using the first set of parameter settings to modify the second image according to the first set of parameter settings to produce a modified second image; and
executing the optical character recognition function on the modified second image to produce a second output.
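The steps of the claim can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the "image" is a toy list of (character, intensity) pairs, `toy_preprocess` and `toy_ocr` are hypothetical stand-ins for real pre-recognition and OCR functions, and the accuracy score is assumed to be a `difflib` similarity ratio, which the claim does not specify.

```python
from difflib import SequenceMatcher

def accuracy_score(known_text, output):
    """Similarity in [0, 1] between the known text and the OCR output (assumed metric)."""
    return SequenceMatcher(None, known_text, output).ratio()

def score_population(population, image, known_text, preprocess, ocr):
    """Return one (score, settings) pair per set of parameter settings."""
    scored = []
    for settings in population:
        modified = preprocess(image, settings)  # pre-recognition functions
        output = ocr(modified)                  # OCR function
        scored.append((accuracy_score(known_text, output), settings))
    return scored

def select_best(scored):
    """Pick the settings with the highest accuracy score."""
    return max(scored, key=lambda pair: pair[0])[1]

# Hypothetical stand-ins: an "image" is a list of (character, intensity) pairs.
def toy_preprocess(image, settings):
    """Drop every mark fainter than the threshold parameter."""
    return [(ch, v) for ch, v in image if v >= settings["threshold"]]

def toy_ocr(image):
    """Read off whatever marks survived pre-processing."""
    return "".join(ch for ch, _ in image)

# Calibration image with known text "TAX" plus two faint noise marks.
calibration_image = [("T", 200), ("#", 40), ("A", 190), ("X", 180), ("~", 60)]
population = [{"threshold": 0}, {"threshold": 100}, {"threshold": 195}]

scored = score_population(population, calibration_image, "TAX",
                          toy_preprocess, toy_ocr)
best = select_best(scored)

# Apply the selected settings to a second, separate image.
second_image = [("1", 210), ("%", 30), ("0", 220), ("4", 205), ("0", 199)]
second_output = toy_ocr(toy_preprocess(second_image, best))
print(best, second_output)
```

In this toy run the middle threshold removes the noise marks without discarding real characters, so it scores highest on the known text and is the one applied to the second image.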
Abstract
The disclosed embodiments relate to a system and method for calibrating optical character recognition (OCR) processes for an image captured through a mobile computing device. During operation, the system adjusts the OCR process through pre-recognition functions, OCR functions and/or post-recognition functions with multiple sets of parameter settings. With each of these sets, the system scores the OCR process output against an image with known text. Once the sets are scored, the system sorts the sets of parameters, removes some sets, then mixes and mutates the remaining sets in a process akin to evolutionary biology. By repeating this procedure, the system produces a set of parameter settings that can be used to calibrate OCR processing.
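The sort, remove, mix and mutate cycle described in the abstract resembles a conventional genetic algorithm. The following is a minimal sketch under stated assumptions: the parameter names `threshold` and `contrast`, the survival fraction, the mutation rate and the `toy_fitness` function are all hypothetical. In the described system the fitness would instead be the score of the OCR output against the image with known text.

```python
import random

def crossover(a, b, rng):
    """Mix two parents: each parameter is taken from one parent at random."""
    return {k: rng.choice((a[k], b[k])) for k in a}

def mutate(settings, rng, rate=0.2, step=10):
    """With probability `rate`, nudge a parameter by up to +/- `step`."""
    return {k: (v + rng.randint(-step, step)) if rng.random() < rate else v
            for k, v in settings.items()}

def next_generation(population, fitness, rng, keep=0.5):
    """One cycle: sort by score, remove the weakest, then mix and mutate."""
    ranked = sorted(population, key=fitness, reverse=True)   # sort
    survivors = ranked[:max(2, int(len(ranked) * keep))]     # remove
    children = []
    while len(survivors) + len(children) < len(population):
        a, b = rng.sample(survivors, 2)                      # mix
        children.append(mutate(crossover(a, b, rng), rng))   # mutate
    return survivors + children

def toy_fitness(s):
    """Hypothetical score that peaks at threshold=128, contrast=50."""
    return -abs(s["threshold"] - 128) - abs(s["contrast"] - 50)

rng = random.Random(0)  # seeded so that runs are repeatable
population = [{"threshold": rng.randint(0, 255), "contrast": rng.randint(0, 100)}
              for _ in range(8)]

history = []
for _ in range(20):
    history.append(max(toy_fitness(s) for s in population))
    population = next_generation(population, toy_fitness, rng)

best = max(population, key=toy_fitness)
print(best, toy_fitness(best))
```

Because the survivors are carried over unchanged, the best score in the population can never decrease from one generation to the next; whether a real calibration would keep survivors this way is a design choice the abstract leaves open.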
22 Claims
1. A computer-implemented method for optimizing parameter settings for an optical character recognition process, comprising:
receiving a population comprising at least two sets of parameter settings;
receiving an image comprising known text;
for each set of parameter settings in the population, determining an accuracy score by:
executing a set of pre-recognition functions using the set of parameter settings to modify the image according to the parameter settings to produce a modified image;
executing an optical character recognition function on the known text in the modified image to produce an output; and
testing the output to determine the accuracy score for the set of parameter settings by comparing the known text to the output;
selecting a first set of parameter settings from the population based on the determined accuracy scores;
receiving a second image separate from the image;
executing the set of pre-recognition functions using the first set of parameter settings to modify the second image according to the first set of parameter settings to produce a modified second image; and
executing the optical character recognition function on the modified second image to produce a second output.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A non-transitory computer-program product for use in conjunction with a computer system, the computer-program product comprising a computer-readable storage medium and a computer-program mechanism embedded therein, to optimize parameter settings for optical character recognition, the computer-program mechanism including:
instructions for receiving a population comprising at least two sets of parameter settings;
instructions for receiving an image comprising known text;
for each set of parameter settings in the population, instructions for determining an accuracy score by:
executing a set of pre-recognition functions using the set of parameter settings to modify the image according to the parameter settings to produce a modified image;
executing an optical character recognition function on the known text in the modified image to produce an output; and
testing the output to determine the accuracy score for the set of parameter settings by comparing the known text to the output;
instructions for selecting a first set of parameter settings from the population based on the determined accuracy scores;
instructions for receiving a second image separate from the image;
instructions for executing the set of pre-recognition functions using the first set of parameter settings to modify the second image according to the first set of parameter settings to produce a modified second image; and
instructions for executing the optical character recognition function on the modified second image to produce a second output.
Dependent claims: 10, 11, 12, 13, 14, 15
16. A computer system, comprising:
a processor;
a memory; and
a program module, wherein the program module is stored in the memory and configurable to be executed by the processor to optimize parameter settings for optical character recognition, the program module including:
instructions for receiving a population comprising at least two sets of parameter settings;
instructions for receiving an image comprising known text;
for each set of parameter settings in the population, instructions for determining an accuracy score by:
executing a set of pre-recognition functions using the set of parameter settings to modify the image according to the parameter settings to produce a modified image;
executing an optical character recognition function on the known text in the modified image to produce an output; and
testing the output to determine the accuracy score for the set of parameter settings by comparing the known text to the output;
instructions for selecting a first set of parameter settings from the population based on the determined accuracy scores;
instructions for receiving a second image separate from the image;
instructions for executing the set of pre-recognition functions using the first set of parameter settings to modify the second image according to the first set of parameter settings to produce a modified second image; and
instructions for executing the optical character recognition function on the modified second image to produce a second output.
Dependent claims: 17, 18, 19, 20, 21, 22
Specification