Method and system for automatic transcription correction
First Claim
1. A method of operating a system to correct errors in a transcription of a text image;
- the system including a processor and a memory device for storing data;
the data stored in the memory device including instruction data the processor executes to operate the system;
the processor being connected to the memory device for accessing the data stored therein;
the method comprising;
operating the processor to obtain a formal two-dimensional image source model data structure, hereafter referred to as a 2D image model, modeling as a grammar a set of two-dimensional (2D) text images;
each 2D text image including a plurality of glyphs occurring therein;
each glyph being an image instance of a respective one of a plurality of characters in an input image character set;
the 2D image model including mapping data indicating a mapping between a glyph occurring in a 2D text image and a respective message string identifying a character in the input image character set;
operating the processor to obtain an image definition data structure defining a two-dimensional text image, hereafter referred to as an input 2D image of glyphs, including a plurality of glyphs occurring therein representing characters in the input image character set;
the input 2D image of glyphs having a vertical dimension size larger than a single line of glyphs;
the input 2D image of glyphs being one of the set of 2D text images modeled by the 2D image model;
operating the processor to obtain a first transcription data structure, hereafter referred to as a first transcription, associated with the input 2D image of glyphs;
the first transcription including a first ordered arrangement of transcription labels identifying characters in the input image character set represented by the glyphs occurring in the input 2D image of glyphs;
the first transcription including at least one transcription error;
operating the processor to modify the mapping data included in the 2D image model using the transcription labels in the first transcription to produce modified mapping data included in a modified 2D image model; and
operating the processor to perform a recognition operation on the input 2D image of glyphs using the modified mapping data included in the modified 2D image model;
the modified mapping data mapping a sequence of glyphs occurring in a 2D text image to a sequence of respective message strings identifying characters in the input image character set;
the sequence of message strings produced by the modified mapping data indicating a second transcription identifying the characters represented by the glyphs occurring in the input 2D image of glyphs and including a message string indicating a correction of the at least one transcription error in the first transcription;
wherein the glyphs included in the input 2D image of glyphs are perceptible as appearing in a visually consistent character image design, hereafter referred to as an input image font;
wherein the mapping data included in the 2D image model includes a first set of character templates;
wherein operating the processor to modify the mapping data included in the 2D image model includes producing character template training data including a plurality of glyph samples and respectively paired glyph labels for each character in the input image character set;
each glyph sample being included in the input 2D image of glyphs;
each respectively paired glyph label being produced using the first transcription and indicating a respective one of the characters in the input image character set; and
producing a second set of character templates using the character template training data;
the second set of character templates indicating character images of the characters in the input image character set and being perceptible as appearing in the input image font; and
wherein performing the recognition operation on the input 2D image of glyphs using the modified 2D image model includes mapping each glyph occurring in the input 2D image of glyphs to a respective message string identifying the character in the input image character set using the second set of character templates appearing in the input image font.
Abstract
A method and system for automatically modifying an original transcription produced as the output of a recognition operation produces a second, modified transcription, such as, for example, automatically correcting an errorful transcription produced by an OCR operation. The invention uses information in an input text image of character images and in an original transcription associated with the input text image to modify aspects of a formal image source model that models as a grammar the spatial image structure of a set of text images. A recognition operation is then performed on the input text image using the modified formal image source model to produce a second, modified transcription. When the original transcription is errorful, the second transcription is a corrected transcription. Several aspects of the formal image source model may be modified; in particular, character templates to be used in the recognition operation are trained in the font of the glyphs occurring in the input text image. When errors in the original transcription are caused by matching glyphs against templates that are inadequately specified for the given input text image, the subsequently performed recognition operation on the text image using the trained, font-specific character templates produces a more accurate transcription.
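The retraining idea in the abstract can be illustrated with a toy sketch. It is not the patented procedure: the claims describe decoding a 2D image against a grammar-like image source model, whereas this example assumes (purely for illustration) that glyphs are already segmented into small binary bitmaps and aligned one-to-one with the labels of the first transcription. The function names `train_templates` and `recognize`, the majority-vote training rule, and the Hamming-distance matching are all assumptions chosen for brevity.

```python
from collections import defaultdict

def train_templates(glyphs, labels):
    """Build one template per label by pixel-wise majority vote over the
    glyph samples that the (possibly errorful) first transcription pairs with it."""
    groups = defaultdict(list)
    for glyph, label in zip(glyphs, labels):
        groups[label].append(glyph)
    templates = {}
    for label, samples in groups.items():
        n = len(samples)
        # A pixel is set only if a strict majority of samples have it set.
        templates[label] = tuple(
            1 if 2 * sum(pixels) > n else 0 for pixels in zip(*samples)
        )
    return templates

def hamming(a, b):
    """Number of pixel positions at which two bitmaps differ."""
    return sum(x != y for x, y in zip(a, b))

def recognize(glyphs, templates):
    """Label each glyph with the closest trained template (ties broken by label order)."""
    return "".join(
        min(sorted(templates), key=lambda label: hamming(glyph, templates[label]))
        for glyph in glyphs
    )

# Hypothetical 3x3 bitmaps, flattened row by row, standing in for one input font.
I = (0, 1, 0,
     0, 1, 0,
     0, 1, 0)
I_NOISY = (0, 1, 0,
           0, 1, 0,
           0, 1, 1)
L = (1, 0, 0,
     1, 0, 0,
     1, 1, 1)

glyphs = [I, I_NOISY, I, L, L, L]
first_transcription = "IILLLL"  # third label is a transcription error

templates = train_templates(glyphs, first_transcription)
second_transcription = recognize(glyphs, templates)
print(second_transcription)
```

In this toy setting the single mislabeled sample is outvoted when the templates are trained, so the second pass relabels the third glyph. When errors dominate the samples for a character, a majority vote like this one would not be expected to recover the correct template.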
239 Citations
15 Claims
1. A method of operating a system to correct errors in a transcription of a text image;
- the system including a processor and a memory device for storing data;
the data stored in the memory device including instruction data the processor executes to operate the system;
the processor being connected to the memory device for accessing the data stored therein;
the method comprising;
operating the processor to obtain a formal two-dimensional image source model data structure, hereafter referred to as a 2D image model, modeling as a grammar a set of two-dimensional (2D) text images;
each 2D text image including a plurality of glyphs occurring therein;
each glyph being an image instance of a respective one of a plurality of characters in an input image character set;
the 2D image model including mapping data indicating a mapping between a glyph occurring in a 2D text image and a respective message string identifying a character in the input image character set;
operating the processor to obtain an image definition data structure defining a two-dimensional text image, hereafter referred to as an input 2D image of glyphs, including a plurality of glyphs occurring therein representing characters in the input image character set;
the input 2D image of glyphs having a vertical dimension size larger than a single line of glyphs;
the input 2D image of glyphs being one of the set of 2D text images modeled by the 2D image model;
operating the processor to obtain a first transcription data structure, hereafter referred to as a first transcription, associated with the input 2D image of glyphs;
the first transcription including a first ordered arrangement of transcription labels identifying characters in the input image character set represented by the glyphs occurring in the input 2D image of glyphs;
the first transcription including at least one transcription error;
operating the processor to modify the mapping data included in the 2D image model using the transcription labels in the first transcription to produce modified mapping data included in a modified 2D image model; and
operating the processor to perform a recognition operation on the input 2D image of glyphs using the modified mapping data included in the modified 2D image model;
the modified mapping data mapping a sequence of glyphs occurring in a 2D text image to a sequence of respective message strings identifying characters in the input image character set;
the sequence of message strings produced by the modified mapping data indicating a second transcription identifying the characters represented by the glyphs occurring in the input 2D image of glyphs and including a message string indicating a correction of the at least one transcription error in the first transcription;
wherein the glyphs included in the input 2D image of glyphs are perceptible as appearing in a visually consistent character image design, hereafter referred to as an input image font;
wherein the mapping data included in the 2D image model includes a first set of character templates;
wherein operating the processor to modify the mapping data included in the 2D image model includes producing character template training data including a plurality of glyph samples and respectively paired glyph labels for each character in the input image character set;
each glyph sample being included in the input 2D image of glyphs;
each respectively paired glyph label being produced using the first transcription and indicating a respective one of the characters in the input image character set; and
producing a second set of character templates using the character template training data;
the second set of character templates indicating character images of the characters in the input image character set and being perceptible as appearing in the input image font; and
wherein performing the recognition operation on the input 2D image of glyphs using the modified 2D image model includes mapping each glyph occurring in the input 2D image of glyphs to a respective message string identifying the character in the input image character set using the second set of character templates appearing in the input image font.
- View Dependent Claims (2, 3, 4, 5, 6)
7. A method of operating a system to correct errors in a transcription of a text image;
- the system including a processor and a memory device for storing data;
the data stored in the memory device including instruction data the processor executes to operate the system;
the processor being connected to the memory device for accessing the data stored therein;
the method comprising;
operating the processor to obtain a formal two-dimensional image source model data structure, hereafter referred to as a 2D image model, modeling as a grammar a set of two-dimensional (2D) text images;
each 2D text image including a plurality of glyphs occurring therein;
each glyph being an image instance of a respective one of a plurality of characters in an input image character set;
the 2D image model including mapping data indicating a mapping between a glyph occurring in a 2D text image and a respective message string identifying a character in the input image character set;
operating the processor to obtain an image definition data structure defining a two-dimensional text image, hereafter referred to as an input 2D image of glyphs, including a plurality of glyphs occurring therein representing characters in the input image character set;
the input 2D image of glyphs having a vertical dimension size larger than a single line of glyphs;
the input 2D image of glyphs being one of the set of 2D text images modeled by the 2D image model;
operating the processor to obtain a first transcription data structure, hereafter referred to as a first transcription, associated with the input 2D image of glyphs;
the first transcription including a first ordered arrangement of transcription labels identifying characters in the input image character set represented by the glyphs occurring in the input 2D image of glyphs;
the first transcription including at least one transcription error;
operating the processor to modify the mapping data included in the 2D image model using the transcription labels in the first transcription to produce modified mapping data included in a modified 2D image model; and
operating the processor to perform a recognition operation on the input 2D image of glyphs using the modified mapping data included in the modified 2D image model;
the modified mapping data mapping a sequence of glyphs occurring in a 2D text image to a sequence of respective message strings identifying characters in the input image character set;
the sequence of message strings produced by the modified mapping data indicating a second transcription identifying the characters represented by the glyphs occurring in the input 2D image of glyphs and including a message string indicating a correction of the at least one transcription error in the first transcription;
wherein operating the processor to modify the mapping data included in the 2D image model includes constructing a language model using the transcription labels included in the first transcription;
the language model modeling as a grammar the ordered arrangement of the transcription labels indicated by the first transcription as at least two sequences of transcription labels;
one of the sequences of transcription labels indicating the ordered arrangement of the transcription labels indicated by the first transcription; and
combining the language model with the mapping data included in the 2D image model to produce the modified mapping data included in the modified 2D image model;
the modified mapping data constraining the mapping between a glyph occurring in a 2D text image and a respective message string identifying a character in the input image character set to map a sequence of glyphs to a sequence of respective message strings indicated by the language model and identifying characters in the input image character set;
the sequence of respective message strings produced by the modified mapping data being one of the at least two sequences of transcription labels occurring in the first transcription.
- View Dependent Claims (8, 9, 10, 11)
12. A method of operating a system to correct errors in a transcription of a text image;
- the system including a processor and a memory device for storing data;
the data stored in the memory device including instruction data the processor executes to operate the system;
the processor being connected to the memory device for accessing the data stored therein;
the method comprising;
operating the processor to obtain a stochastic finite state network data structure, hereafter referred to as a two-dimensional (2D) image network;
the 2D image network modeling as a grammar a set of 2D text images, each including a plurality of glyphs;
the 2D image network including a first set of character templates representing character images in an input image character set;
a representative one of the set of 2D text images being modeled as at least one path through the 2D image network;
the at least one path indicating path data items associated therewith and accessible by the processor;
the path data items indicating character templates included in the first set of character templates, image origin positions, and message strings such that the at least one path through the 2D image network maps respective ones of the plurality of glyphs included in the representative image to message strings indicating characters in the input image character set;
operating the processor to obtain an image definition data structure defining a two-dimensional text image, hereafter referred to as an input 2D image of glyphs, including a plurality of glyphs occurring therein representing characters in the input image character set;
the input 2D image of glyphs having a vertical dimension size larger than a single line of glyphs;
the input 2D image of glyphs being one of the set of 2D text images modeled by the 2D image model;
operating the processor to obtain a first transcription data structure, hereafter referred to as a first transcription, associated with the input 2D image of glyphs;
the first transcription including a first ordered arrangement of transcription labels identifying characters in the input image character set represented by the glyphs occurring in the input 2D image of glyphs;
the first transcription including at least one transcription error;
operating the processor to produce a second set of character templates using the 2D image model;
the second set of character templates being produced using character template training data produced using the first transcription and the input 2D image;
operating the processor to construct a language model network represented as a finite state network data structure using the transcription labels included in the first transcription;
the language model network modeling a plurality of sequences of transcription labels occurring in the first transcription as a series of transcription nodes and a sequence of transitions between pairs of the transcription nodes;
each transition in the language model network having a transcription label associated therewith;
a sequence of transitions, called a language model path, through the language model network indicating one of the plurality of sequences of transcription labels;
operating the processor to merge the series of nodes of the 2D image network with the series of transcription nodes of the language model network to produce a language-image network;
the transcription labels in the language model network providing the message strings associated with transitions in the language-image network; and
operating the processor to perform a decoding operation on the input 2D image of glyphs using the language-image network including the second set of character templates to produce at least one complete language-image path through the language-image network;
the language-image network mapping a plurality of glyphs included in the input 2D image of glyphs to a sequence of message strings indicating characters in the input image character set such that the sequence of message strings indicates one of the plurality of sequences of transcription labels indicated by a language model path through the language model network and the sequence of message strings includes a message string that indicates a correction for the at least one transcription error;
the sequence of message strings indicating a second, corrected transcription.
- View Dependent Claims (13, 14)
15. A method of operating a system to correct errors in a transcription of a text image;
- the system including a processor and a memory device for storing data;
the data stored in the memory device including instruction data the processor executes to operate the system;
the processor being connected to the memory device for accessing the data stored therein;
the method comprising;
operating the processor to obtain a formal two-dimensional image source model data structure, hereafter referred to as a 2D image model, modeling as a grammar a set of two-dimensional (2D) text images;
each 2D text image including a plurality of glyphs occurring therein;
each glyph being an image instance of a respective one of a plurality of characters in an input image character set;
the 2D image model including mapping data indicating a mapping between a glyph occurring in a 2D text image and a respective message string identifying a character in the input image character set;
operating the processor to obtain an image definition data structure defining a two-dimensional text image, hereafter referred to as an input 2D image of glyphs, including a plurality of glyphs occurring therein representing characters in the input image character set;
the input 2D image of glyphs having a vertical dimension size larger than a single line of glyphs;
the input 2D image of glyphs being one of the set of 2D text images modeled by the 2D image model;
operating the processor to obtain a first transcription data structure, hereafter referred to as a first transcription, associated with the input 2D image of glyphs;
the first transcription including a first ordered arrangement of transcription labels identifying characters in the input image character set represented by the glyphs occurring in the input 2D image of glyphs;
the first transcription including at least one transcription error;
operating the processor to modify the mapping data included in the 2D image model using the transcription labels in the first transcription to produce modified mapping data included in a modified 2D image model; and
operating the processor to perform a recognition operation on the input 2D image of glyphs using the modified mapping data included in the modified 2D image model;
the modified mapping data mapping a sequence of glyphs occurring in a 2D text image to a sequence of respective message strings identifying characters in the input image character set;
the sequence of message strings produced by the modified mapping data indicating a second transcription identifying the characters represented by the glyphs occurring in the input 2D image of glyphs and including a message string indicating a correction of the at least one transcription error in the first transcription;
wherein the step of operating the processor to modify the mapping data included in the 2D image model includes operating the processor to perform a constraining operation on the 2D image model using the first transcription to produce the modified mapping data;
the modified mapping data included in the 2D image model being capable of producing a representative transcription of a 2D text image that indicates the first ordered arrangement of transcription labels included in the first transcription;
the constraining operation and the modified mapping data together preventing the recognition operation from producing all possible message strings capable of being produced by the mapping data of the 2D image model in an unmodified form and limiting the recognition operation to producing a second transcription identifying only the characters represented by the glyphs occurring in the input 2D image of glyphs.
Specification