Document image decoding using an integrated stochastic language model

  • US 6,678,415 B1
  • Filed: 05/12/2000
  • Issued: 01/13/2004
  • Est. Priority Date: 05/12/2000
  • Status: Expired due to Term
First Claim

1. A method for operating a processor-controlled machine to perform a decoding operation to decode a text line image using a language model;

  • the method comprising the steps of:

    receiving an input text line image including a plurality of image glyphs each indicating a character symbol;

    representing the input text line image as an image network data structure indicating a plurality of nodes and branches between nodes;

    each node indicating a location of an image glyph;

    each branch leading into a node being associated with a character symbol identifying the image glyph;

    the plurality of nodes and branches indicating a plurality of possible paths through the image network;

    each path indicating a possible transcription of the input text line image;

    assigning a language model score computed from a language model to each branch in the image network according to the character symbol associated with the branch;

    the language model score indicating a validity measurement for a character symbol sequence ending with the character symbol associated with the branch;

    performing a repeated sequence of a best path search operation followed by a network expansion operation until a stopping condition is met;

    the best path search operation producing a complete path of branches and nodes through the image network using the language model scores assigned to the branches;

    the network expansion operation including adding at least one context node and context branch to the image network;

    the context node having a character history associated therewith;

    the context branch indicating an updated language model score for the character history ending with the character symbol associated with the context branch;

    the image network with the context node and context branch added thereto being available to a subsequent execution of the best path search operation; and

    when the stopping condition is met, producing the transcription of the character symbols represented by the image glyphs of the input text line image using the character symbols associated with the branches of the complete path.
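
The loop recited in the claim (best path search, then network expansion, repeated until a stopping condition is met) can be illustrated with a small sketch. The following is not the patented implementation; it is a toy under several assumptions that the claim does not state: glyphs are fixed-width, so node locations are simply glyph positions; the language model is a character bigram; a branch leaving a node with no character history is scored with an optimistic upper bound over all one-character histories; and the stopping condition is that every branch on the best path was scored with a known history. All names (`decode`, `best_path`, `MATCH`, `BIGRAM`, `START`) and all scores are hypothetical.

```python
START = "^"  # hypothetical start-of-line symbol used as the initial history


def best_path(match, context, lm, upper):
    """Best path search: highest-scoring complete path through the network.

    A node is (position, history); history None means "no context yet".
    Returns (score, text, exact), where exact is True only if every branch
    on the path left a node with a known character history.
    """
    frontier = {(0, START): (0.0, "", True)}
    for i, column in enumerate(match):
        nxt = {}
        for (_, hist), (score, text, exact) in frontier.items():
            for ch, image_score in column.items():
                # Exact bigram score if the history is known, else the bound.
                lm_score = lm(hist, ch) if hist is not None else upper(ch)
                # A branch leads into the context node for ch if one exists.
                dest = (i + 1, ch) if (i + 1, ch) in context else (i + 1, None)
                cand = (score + image_score + lm_score,
                        text + ch,
                        exact and hist is not None)
                if dest not in nxt or cand[0] > nxt[dest][0]:
                    nxt[dest] = cand
        frontier = nxt
    return max(frontier.values(), key=lambda entry: entry[0])


def decode(match, bigram, floor=-20.0):
    """Alternate best path search and network expansion until the best
    path is scored entirely with known histories."""
    alphabet = {ch for column in match for ch in column} | {START}

    def lm(prev, ch):
        # Log score of ch given the previous character (floor if unseen).
        return bigram.get((prev, ch), floor)

    def upper(ch):
        # Optimistic score: the best ch can get under any one-char history.
        return max(lm(prev, ch) for prev in alphabet)

    context = {(0, START)}  # context nodes added to the network so far
    while True:
        score, text, exact = best_path(match, context, lm, upper)
        if exact:  # assumed stopping condition
            return text, score
        # Network expansion: add a context node, carrying its character
        # history, for each interior node on the current best path.
        for pos in range(1, len(text)):
            context.add((pos, text[pos - 1]))


# Hypothetical log scores: per-position image match and a bigram model.
MATCH = [{"t": -0.5, "f": -0.4}, {"h": -0.3, "b": -0.2}, {"e": -0.1, "c": -0.6}]
BIGRAM = {
    ("^", "t"): -0.5, ("^", "f"): -1.5,
    ("t", "h"): -0.2, ("f", "h"): -4.0, ("t", "b"): -5.0, ("f", "b"): -3.0,
    ("h", "e"): -0.3, ("b", "e"): -1.0, ("h", "c"): -5.0, ("b", "c"): -4.0,
}

text, score = decode(MATCH, BIGRAM)
print(text)
```

Under these assumptions, every path's score in the network is at least its score under the full bigram model, so once the best path carries only exact scores no other transcription can beat it. Each unsuccessful pass adds at least one new context node, which bounds the number of iterations by the number of (position, character) pairs.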
