
Automatic training of character templates using a text line image, a text line transcription and a line image source model

  • US 5,594,809 A
  • Filed: 04/28/1995
  • Issued: 01/14/1997
  • Est. Priority Date: 04/28/1995
  • Status: Expired due to Term
First Claim

1. A method of operating a machine to train a set of bitmapped character templates for use in a recognition system;

    each of the bitmapped character templates being based on a character template model defining character image positioning referred to as a sidebearing model of character image positioning;

    the machine including a processor and a memory device for storing data;

    the data stored in the memory device including instruction data the processor executes to operate the machine;

    the processor being connected to the memory device for accessing the data stored therein;

    the method comprising:

    operating the processor to receive and store an image definition data structure defining an image including a plurality of glyphs indicating a single line of text, hereafter referred to as a text line image source of glyph samples;

    each glyph occurring in the text line image source of glyph samples being an image instance of a respective one of a plurality of characters in a character set, hereafter referred to as a glyph sample character set;

    each one of the set of bitmapped character templates being trained representing a respective one of the plurality of characters in the glyph sample character set;

    operating the processor to receive and store in the memory device a text line image source model data structure, hereafter referred to as a text line image source model;

    the text line image source model modeling as a grammar a spatial image structure of a set of text line images;

    the text line image source of glyph samples being one of the set of text line images modeled by the text line image source model;

    the text line image source model including spatial positioning data modeling spatial positioning of the plurality of glyphs occurring in the text line image source of glyph samples;

    operating the processor to determine, for each respective glyph occurring in the text line image source of glyph samples, an image coordinate position in the text line image source of glyph samples indicating an image origin position of the respective glyph using the spatial positioning data included in the text line image source model;

    each image coordinate position being hereafter referred to as a glyph sample image origin position;

    operating the processor to produce a glyph label data item paired with each glyph sample image origin position determined for the respective glyphs occurring in the text line image source of glyph samples;

    each glyph label data item being hereafter referred to as a respectively paired glyph label;

    each respectively paired glyph label indicating the character in the glyph sample character set represented by the respective glyph;

    the processor, in producing each respectively paired glyph label, using mapping data included in the text line image source model mapping respective ones of the glyphs occurring in the text line image source of glyph samples to respectively paired glyph labels;

    the processor, further in producing each respectively paired glyph label, using a text line transcription data structure associated with the text line image source of glyph samples, hereafter referred to as a transcription, including an ordered arrangement of transcription label data items;

    the processor using the transcription and the mapping data to pair each glyph label with the respective glyph sample image origin position of a respective glyph occurring in the text line image source of glyph samples; and

    operating the processor to produce the set of bitmapped character templates using the text line image source of glyph samples, the glyph sample image origin positions and the respectively paired glyph labels;

    the processor determining, in each bitmapped character template produced, an image pixel position included therein indicating a template image origin position;

    each bitmapped character template produced having a characteristic image positioning property such that, when a second bitmapped character template is positioned in an image with the template image origin position thereof displaced from the template image origin position of a preceding first bitmapped character template by a character set width thereof, and when a first bounding box entirely containing the first bitmapped character template overlaps in the image with a second bounding box entirely containing the second bitmapped character template, the first and second bitmapped character templates have substantially nonoverlapping foreground pixels.
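
The image positioning property recited in the last clause can be illustrated with a minimal sketch. It is not taken from the patent: the `Template` class, the `check_sidebearing` helper, the pixel-set representation and the two toy glyph shapes are all hypothetical, and the sketch assumes the second template's origin is displaced horizontally by the set width of the first. It places two templates side by side and reports whether their bounding boxes overlap and how many foreground pixels they share.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """Hypothetical bitmapped template: foreground pixels are stored as
    (x, y) offsets from the template image origin, with y pointing up."""
    pixels: frozenset
    set_width: int

    def pixels_at(self, origin_x):
        # Foreground pixels in image coordinates when the origin is at (origin_x, 0).
        return {(origin_x + x, y) for x, y in self.pixels}

    def bbox_at(self, origin_x):
        # Tight bounding box (x_min, y_min, x_max, y_max) in image coordinates.
        xs = [x for x, _ in self.pixels]
        ys = [y for _, y in self.pixels]
        return (origin_x + min(xs), min(ys), origin_x + max(xs), max(ys))


def boxes_overlap(a, b):
    # Inclusive rectangle intersection test.
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def check_sidebearing(first, second, origin_x=0):
    """Place `second` one set width of `first` to the right of `first`.
    Returns (bounding boxes overlap?, number of shared foreground pixels)."""
    second_origin = origin_x + first.set_width  # assumed reading of the claim
    overlap = boxes_overlap(first.bbox_at(origin_x), second.bbox_at(second_origin))
    shared = first.pixels_at(origin_x) & second.pixels_at(second_origin)
    return overlap, len(shared)


# Toy "f"-like shape: a stem with a hook that overhangs to the right.
f_like = Template(
    pixels=frozenset({(1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (2, 4), (3, 4)}),
    set_width=3,
)
# Toy "j"-like shape: a stem with a tail extending left of its own origin.
j_like = Template(
    pixels=frozenset({(1, -2), (1, -1), (1, 0), (1, 1), (1, 2), (0, -2), (-1, -2)}),
    set_width=3,
)

boxes, shared_pixels = check_sidebearing(f_like, j_like)
print(boxes, shared_pixels)
```

With these toy shapes the hook of the first glyph and the tail of the second occupy the same columns, so the bounding boxes overlap, yet no foreground pixel is shared, which is the situation the claim describes as "substantially nonoverlapping foreground pixels".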
