Automatic training of character templates using a text line image, a text line transcription and a line image source model
Abstract
A technique for automatically producing, or training, a set of bitmapped character templates defined according to the sidebearing model of character image positioning uses as input a text line image of unsegmented characters, called glyphs, as the source of training samples. The training process also uses a transcription associated with the text line image, and an explicit, grammar-based text line image source model that describes the structural and functional features of a set of possible text line images that may be used as the source of training samples. The transcription may be a literal transcription of the line image, or it may be nonliteral, for example containing logical structure tags for document formatting and layout, such as found in markup languages. Spatial positioning information modeled by the text line image source model and the labels in the transcription are used to determine labeled image positions identifying the location of glyph samples occurring in the input line image, and the character templates are produced using the labeled image positions. In another aspect of the technique, a set of character templates defined by any character template model, such as a segmentation-based model, is produced using the grammar-based text line image source model and specifically using a tag transcription containing logical structure tags for document formatting and layout. Both aspects of the training technique may represent the text line image source model and the transcription as finite state networks.
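The last step described in the abstract, producing templates from labeled image positions, can be illustrated with a minimal sketch. It assumes a binary line image stored as a list of pixel rows and a list of (label, x, y) glyph origin positions that has already been determined. Simple window averaging with a threshold is used here as a stand-in and is not presented as the template construction procedure of the patent; the function name `train_templates` and the window parameters `left`, `right`, `ascent` and `descent` are hypothetical.

```python
from collections import defaultdict


def train_templates(image, labeled_origins, left, right, ascent, descent,
                    threshold=0.5):
    """Average fixed-size windows cut around labeled glyph origins.

    image: list of rows, each a list of 0/1 pixels (1 = foreground).
    labeled_origins: list of (label, x, y) glyph origin positions.
    The window covers columns x-left .. x+right-1 and
    rows y-ascent .. y+descent-1 (y grows downward).
    """
    height, width = len(image), len(image[0])
    win_h, win_w = ascent + descent, left + right
    # Per-label accumulator of foreground counts, plus a sample counter.
    sums = defaultdict(lambda: [[0] * win_w for _ in range(win_h)])
    counts = defaultdict(int)
    for label, x, y in labeled_origins:
        counts[label] += 1
        acc = sums[label]
        for r in range(win_h):
            for c in range(win_w):
                iy, ix = y - ascent + r, x - left + c
                # Pixels outside the image count as background.
                if 0 <= iy < height and 0 <= ix < width:
                    acc[r][c] += image[iy][ix]
    # A template pixel is foreground when enough samples agree on it.
    templates = {}
    for label, acc in sums.items():
        n = counts[label]
        templates[label] = [[1 if v / n >= threshold else 0 for v in row]
                            for row in acc]
    return templates
```

Because every sample of a character is aligned at its origin before averaging, a noise pixel that appears in only one sample falls below the threshold and is left out of the template.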
21 Claims
1. A method of operating a machine to train a set of bitmapped character templates for use in a recognition system;
each of the bitmapped character templates being based on a character template model defining character image positioning referred to as a sidebearing model of character image positioning;
the machine including a processor and a memory device for storing data;
the data stored in the memory device including instruction data the processor executes to operate the machine;
the processor being connected to the memory device for accessing the data stored therein;
the method comprising;
operating the processor to receive and store an image definition data structure defining an image including a plurality of glyphs indicating a single line of text, hereafter referred to as a text line image source of glyph samples;
each glyph occurring in the text line image source of glyph samples being an image instance of a respective one of a plurality of characters in a character set, hereafter referred to as a glyph sample character set;
each one of the set of bitmapped character templates being trained representing a respective one of the plurality of characters in the glyph sample character set;
operating the processor to receive and store in the memory device a text line image source model data structure, hereafter referred to as a text line image source model;
the text line image source model modeling as a grammar a spatial image structure of a set of text line images;
the text line image source of glyph samples being one of the set of text line images modeled by the text line image source model;
the text line image source model including spatial positioning data modeling spatial positioning of the plurality of glyphs occurring in the text line image source of glyph samples;
operating the processor to determine, for each respective glyph occurring in the text line image source of glyph samples, an image coordinate position in the text line image source of glyph samples indicating an image origin position of the respective glyph using the spatial positioning data included in the text line image source model;
each image coordinate position being hereafter referred to as a glyph sample image origin position;
operating the processor to produce a glyph label data item paired with each glyph sample image origin position determined for the respective glyphs occurring in the text line image source of glyph samples;
each glyph label data item being hereafter referred to as a respectively paired glyph label;
each respectively paired glyph label indicating the character in the glyph sample character set represented by the respective glyph;
the processor, in producing each respectively paired glyph label, using mapping data included in the text line image source model mapping respective ones of the glyphs occurring in the text line image source of glyph samples to respectively paired glyph labels;
the processor, further in producing each respectively paired glyph label, using a text line transcription data structure associated with the text line image source of glyph samples, hereafter referred to as a transcription, including an ordered arrangement of transcription label data items;
the processor using the transcription and the mapping data to pair each glyph label with the respective glyph sample image origin position of a respective glyph occurring in the text line image source of glyph samples; and
operating the processor to produce the set of bitmapped character templates using the text line image source of glyph samples, the glyph sample image origin positions and the respectively paired glyph labels;
the processor determining, in each bitmapped character template produced, an image pixel position included therein indicating a template image origin position;
each bitmapped character template produced having a characteristic image positioning property such that, when a second bitmapped character template is positioned in an image with the template image origin position thereof displaced from the template image origin position of a preceding first bitmapped character template by a character set width thereof, and when a first bounding box entirely containing the first bitmapped character template overlaps in the image with a second bounding box entirely containing the second bitmapped character template, the first and second bitmapped character templates have substantially nonoverlapping foreground pixels.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
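The image positioning property recited at the end of claim 1 can be checked mechanically. The sketch below is illustrative only: it assumes each template is stored as a set of (dx, dy) foreground pixel offsets relative to its template image origin, with y growing downward, and it assumes the displacement is the set width of the preceding (first) template. The helper names `bounding_box`, `boxes_overlap`, `place` and `check_pair` are hypothetical.

```python
def bounding_box(pixels):
    """Smallest axis-aligned box (x0, y0, x1, y1) containing all pixels."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return min(xs), min(ys), max(xs), max(ys)


def boxes_overlap(a, b):
    """True when two inclusive boxes share at least one pixel position."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1


def place(template, origin_x, origin_y):
    """Translate template offsets to image coordinates at the given origin."""
    return {(origin_x + dx, origin_y + dy) for dx, dy in template}


def check_pair(first, second, set_width):
    """Place `second` one set width after `first` on the same baseline.

    Returns (bounding boxes overlap?, number of shared foreground pixels).
    """
    a = place(first, 0, 0)
    b = place(second, set_width, 0)
    return boxes_overlap(bounding_box(a), bounding_box(b)), len(a & b)


# An "f"-like shape whose top hook overhangs past its set width of 2.
first = {(0, -3), (0, -2), (0, -1), (0, 0), (1, -3), (2, -3)}
# A "j"-like shape whose tail reaches left of its own origin.
second = {(1, -1), (1, 0), (1, 1), (1, 2), (0, 2), (-1, 2)}
```

For this pair, `check_pair(first, second, 2)` reports overlapping bounding boxes and zero shared foreground pixels, which is the situation the claim describes. A segmentation-based template model would instead require the bounding boxes themselves to be disjoint.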
12. A method of operating a machine to train a set of bitmapped character templates for use in a recognition system;
the machine including a processor and a memory device for storing data;
the data stored in the memory device including instruction data the processor executes to operate the machine;
the processor being connected to the memory device for accessing the data stored therein;
the method comprising;
operating the processor to receive and store an image definition data structure defining an image including a plurality of glyphs occurring therein indicating a single line of text, hereafter referred to as a text line image source of glyph samples;
each glyph occurring in the text line image source of glyph samples being an image instance of a respective one of a plurality of characters in a character set, hereafter referred to as a glyph sample character set;
each one of the set of bitmapped character templates being trained representing a respective one of the plurality of characters in the glyph sample character set;
operating the processor to receive and store in the memory device a text line image source model data structure, hereafter referred to as a text line image source model;
the text line image source model modeling the text line image source of glyph samples as a grammar and including spatial positioning data modeling spatial positioning of the plurality of glyphs occurring in the text line image source of glyph samples;
operating the processor to determine a plurality of glyph samples occurring in the text line image source of glyph samples using the spatial positioning data included in the text line image source model;
operating the processor to produce a glyph label data item, hereafter referred to as a respectively paired glyph label, paired with each glyph sample occurring in the text line image source of glyph samples;
the respectively paired glyph label indicating the respective one of the characters in the glyph sample character set represented by the glyph sample;
the processor, in producing each respectively paired glyph label, using mapping data included in the text line image source model mapping a respective one of the glyphs occurring in the text line image source of glyph samples to a glyph label indicating the character in the glyph sample character set represented by the respective glyph;
the processor, further in producing each respectively paired glyph label, using a text line transcription data structure associated with the text line image source of glyph samples including an ordered arrangement of transcription label data items;
the text line transcription data structure including at least one nonliteral transcription label, hereafter referred to as a tag, indicating at least one character code representing a character with which a respective glyph in the text line image source of glyph samples cannot be paired by visual inspection thereof;
the at least one character code indicated by the tag indicating markup information about the text line image source of glyph samples;
the markup information, when interpreted by a document processing operation, producing at least one display feature included in the text line image source of glyph samples perceptible as a visual formatting characteristic of the text line image source of glyph samples;
the text line transcription data structure being hereafter referred to as a tag transcription;
the processor, in producing the respectively paired glyph label using the tag transcription and the mapping data, using the spatial positioning information about the plurality of glyphs to identify the glyph sample in the text line image of glyph samples related to the tag, and using the tag in producing the respectively paired glyph label paired with the glyph sample identified; and
operating the processor to produce the set of bitmapped character templates indicating the characters in the glyph sample character set using the glyph samples identified by the respectively paired glyph labels.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21
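The role of a tag in claim 12 can be illustrated with a small sketch. It assumes a hypothetical tag syntax (`<i>` and `</i>` switching an italic style on and off), a transcription already split into tokens, and glyph samples already located and ordered left to right. Pairing by order is a simplification adopted for the example; the claim itself relies on the mapping data and spatial positioning data of the text line image source model. The function name `label_glyphs` is hypothetical.

```python
def label_glyphs(tag_transcription, glyph_origins):
    """Pair each glyph origin with a (character, style) glyph label.

    tag_transcription: ordered list of tokens; "<i>" and "</i>" are tags
    (hypothetical syntax), every other token is a literal character.
    glyph_origins: glyph sample origin positions in left-to-right order.
    """
    style = "roman"
    labels = []
    for token in tag_transcription:
        if token == "<i>":
            style = "italic"   # the tag changes the labels that follow
        elif token == "</i>":
            style = "roman"
        else:
            # A tag has no glyph of its own; it shapes the label of the
            # glyphs it governs.
            labels.append((token, style))
    if len(labels) != len(glyph_origins):
        raise ValueError("transcription and glyph sample counts disagree")
    return list(zip(glyph_origins, labels))
```

With this labeling, an italic "b" and a roman "b" receive different glyph labels, so their samples would train separate templates even though the transcription contains the same character code for both.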
Specification