Methodology for OCR error checking through text image regeneration

  • US 5,889,897 A
  • Filed: 04/08/1997
  • Issued: 03/30/1999
  • Est. Priority Date: 04/08/1997
  • Status: Expired due to Term
First Claim

1. A method of reducing errors in optical character recognition procedures comprising the steps of:

    (a) digitizing an image of an object, said image containing alpha-numeric characters;

    (b) storing a bitmap of said digitized image as a scanned document file (SDF);

    (c) performing an OCR step to obtain at least one candidate character;

    (d) storing an indication of said at least one candidate character in a textural results file (TRF);

    (e) determining the font of said digitized image;

    (f) storing said determined font in a regeneration library file (RLF);

    (g) generating a regenerated image file using said TRF and said RLF;

    (h) comparing at least a portion of said regenerated image file with a corresponding portion of the bitmap of said digitized image stored in said scanned document file;

    (i) outputting said TRF if the results of said comparison step indicate a match of at least said portion of said regenerated image file with said corresponding portion of said bitmap in said scanned document; and

    (j) performing further processing to resolve the mismatch if a match is not found in step (i).
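
The regenerate-and-compare idea in steps (g) through (j) can be illustrated with a small sketch. This is not the patented implementation, only a toy model under stated assumptions: the scanned document file is assumed to be already segmented into one bitmap per character, the font is assumed to be known (so steps (e) and (f) reduce to a lookup table), and the names `FONT`, `regenerate`, `mismatch_ratio`, and `check` are hypothetical.

```python
from typing import Dict, List, Tuple

Glyph = Tuple[str, ...]  # rows of '#' (ink) and '.' (background)

# Hypothetical 3x3 font standing in for the regeneration library file (RLF).
FONT: Dict[str, Glyph] = {
    "I": ("###", ".#.", "###"),
    "L": ("#..", "#..", "###"),
    "O": ("###", "#.#", "###"),
    "C": ("###", "#..", "###"),
}


def regenerate(trf: str, rlf: Dict[str, Glyph]) -> List[Glyph]:
    """Step (g): render each candidate character using the stored font."""
    return [rlf[ch] for ch in trf]


def mismatch_ratio(a: Glyph, b: Glyph) -> float:
    """Fraction of pixels that differ between two same-sized glyph bitmaps."""
    total = sum(len(row) for row in a)
    diff = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return diff / total


def check(sdf: List[Glyph], trf: str, rlf: Dict[str, Glyph],
          tolerance: float = 0.0) -> List[int]:
    """Steps (h)-(j): return indices of characters whose regenerated
    bitmap does not match the scanned bitmap within the tolerance.
    An empty list means the TRF can be output as-is (step (i))."""
    regenerated = regenerate(trf, rlf)
    return [
        i for i, (scan, regen) in enumerate(zip(sdf, regenerated))
        if mismatch_ratio(scan, regen) > tolerance
    ]


# Assumed scanned bitmaps (SDF) for the word "COIL", one glyph per character.
sdf = [FONT[ch] for ch in "COIL"]

print(check(sdf, "COIL", FONT))  # correct OCR result: no mismatches
print(check(sdf, "COLL", FONT))  # OCR misread 'I' as 'L' at index 2
```

In this sketch the positions returned by `check` are the characters that would be passed to the further processing of step (j). A real scan would contain noise, so a nonzero `tolerance` would likely be needed in practice.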
