Document image decoding using text line column-based heuristic scoring

  • US 6,738,518 B1
  • Filed: 05/12/2000
  • Issued: 05/18/2004
  • Est. Priority Date: 05/12/2000
  • Status: Expired due to Fees
First Claim

1. A method for operating a processor-controlled machine to decode a text line image;

  • the machine including a processor and a memory device for storing data;

    the data stored in the memory device including instruction data the processor executes to operate the machine;

    the processor being connected to the memory device for accessing and executing the instruction data stored therein;

    the method comprising:

    receiving an input text line image indicating a bitmapped image region including a plurality of image glyphs each indicating a character symbol;

    obtaining a plurality of character templates and character labels stored in the memory device of the machine;

    each character template indicating a two-dimensional bitmapped image of a character symbol;

    each character template further indicating a character label identifying the character symbol represented by the character template;

    producing a one-dimensional (1D) image analogue data structure using pixel counts of image foreground pixels in columns of the image portion of the input text line image;

    producing a plurality of 1D template analogue data structures using pixel counts of template foreground pixels in columns of a respective one of the plurality of character templates;

    computing a plurality of template-image heuristic scores using the 1D image analogue data structure and the plurality of 1D template analogue data structures;

    each template-image heuristic score indicating an estimated measurement of a match between one of the plurality of character templates and a two-dimensional region of the image portion of the input text line image; and

    performing a dynamic programming operation using a decoding trellis data structure indicating a stochastic finite state network including nodes and transitions between nodes indicating a model of expected spatial arrangements of character symbols in the input text line image;

    the dynamic programming operation using the plurality of template-image heuristic scores to decode the input text line image and produce the character labels of the character symbols represented by the image glyphs included therein.
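
The steps recited in the claim (column pixel counts for the image and for each template, template-image heuristic scores computed from those counts, and a dynamic programming pass that outputs character labels) can be illustrated with a small sketch. It is an illustration under stated assumptions, not the patented implementation. The claim does not give a scoring formula, so the rule in `column_score` (per-column overlap bound minus a count-difference penalty) is an assumption chosen for the example. The trellis is reduced to a left-to-right chain of pixel positions in which each template advances by its own bitmap width. All function names and the toy templates are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Bitmap = List[List[int]]  # rows of 0/1 pixels; 1 = foreground


def column_counts(bitmap: Bitmap) -> List[int]:
    """1D analogue of a 2D bitmap: foreground pixel count in each column."""
    return [sum(column) for column in zip(*bitmap)]


def column_score(image_cols: List[int], template_cols: List[int], x: int) -> int:
    """Illustrative heuristic score for placing a template at column x.

    min(a, t) bounds how many foreground pixels the two columns can share;
    abs(a - t) penalizes differing counts. This rule is an assumption.
    """
    return sum(
        min(image_cols[x + i], t) - abs(image_cols[x + i] - t)
        for i, t in enumerate(template_cols)
    )


def decode_line(image: Bitmap, templates: Dict[str, Bitmap]) -> Tuple[str, float]:
    """Best-path decoding over a simplified left-to-right trellis."""
    image_cols = column_counts(image)
    template_cols = {label: column_counts(bm) for label, bm in templates.items()}
    n = len(image_cols)
    neg_inf = float("-inf")
    best: List[float] = [neg_inf] * (n + 1)   # best[x]: best score of a path ending at x
    back: List[Optional[Tuple[int, str]]] = [None] * (n + 1)
    best[0] = 0
    for x in range(n):
        if best[x] == neg_inf:
            continue  # no path reaches this node
        for label, cols in template_cols.items():
            end = x + len(cols)
            if end > n:
                continue  # template would run past the end of the line
            score = best[x] + column_score(image_cols, cols, x)
            if score > best[end]:
                best[end] = score
                back[end] = (x, label)
    if best[n] == neg_inf:
        raise ValueError("no sequence of templates spans the text line")
    labels: List[str] = []
    pos = n
    while pos > 0:  # backtrace from the final node to the start
        step = back[pos]
        assert step is not None
        pos, label = step
        labels.append(label)
    return "".join(reversed(labels)), best[n]


# Toy templates, three pixel rows tall (hypothetical data).
TEMPLATES: Dict[str, Bitmap] = {
    "I": [[1], [1], [1]],
    "L": [[1, 0], [1, 0], [1, 1]],
    "-": [[0, 0], [1, 1], [0, 0]],
}
# Text line image formed by placing "L", "-", "I" side by side.
IMAGE: Bitmap = [
    [1, 0, 0, 0, 1],
    [1, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
]

text, score = decode_line(IMAGE, TEMPLATES)
print(text, score)  # prints: L-I 9
```

Because only column counts are compared, two templates with identical column profiles would receive identical scores in this sketch; that is the sense in which the score is an estimate of a two-dimensional match and not the match itself.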

  • 5 Assignments