Apparatus and method for use in image processing
Abstract
Optical character recognition is achieved by a system which comprises a scanner for scanning a document, an edge extractor for identifying edges in the image produced by the scanner to produce an outline of each object identified in the image, a segmentation facility for grouping the object outlines into blocks, means for identifying features of the outlines, and a final classification stage for providing data in an appropriate format representative of the characters in the image. Also disclosed are a novel edge extractor, a novel page segmentation facility and a novel feature extraction facility.
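As an illustration of the segmentation stage named in the abstract (grouping object outlines into blocks), the following is a minimal sketch only. The abstract does not say how outlines are grouped; merging outlines whose bounding boxes lie close together, and ordering the resulting blocks top-to-bottom then left-to-right, are assumptions made here. The names `Box`, `gap`, `group_into_blocks`, `max_dx` and `max_dy` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: each outline is reduced to its bounding box,
# (left, top, right, bottom) in pixel coordinates.
Box = Tuple[int, int, int, int]


def gap(a: Box, b: Box) -> Tuple[int, int]:
    """Horizontal and vertical gap between two boxes (0 if they overlap on that axis)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return dx, dy


def group_into_blocks(boxes: List[Box], max_dx: int, max_dy: int) -> List[List[Box]]:
    """Group boxes into blocks: two boxes share a block when their gaps are small.

    Assumption: proximity is the grouping criterion. Uses a small union-find.
    """
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            dx, dy = gap(boxes[i], boxes[j])
            if dx <= max_dx and dy <= max_dy:
                parent[find(i)] = find(j)

    blocks: Dict[int, List[Box]] = {}
    for i, box in enumerate(boxes):
        blocks.setdefault(find(i), []).append(box)

    # Order boxes inside each block, then order the blocks themselves,
    # by (top, left). This reading order is an assumption.
    ordered = [sorted(b, key=lambda bx: (bx[1], bx[0])) for b in blocks.values()]
    return sorted(ordered, key=lambda b: (min(x[1] for x in b), min(x[0] for x in b)))


boxes = [(100, 10, 110, 20), (10, 10, 20, 20), (24, 10, 34, 20), (112, 10, 122, 20)]
result = group_into_blocks(boxes, max_dx=5, max_dy=2)
```

With these four boxes, the two left-hand boxes fall into one block and the two right-hand boxes into another, because only those pairs are within 5 pixels of each other horizontally.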
34 Claims
- 1. An optical character recognition system comprising:
  - scanning means for optically scanning a document to produce a grey level image thereof;
  - edge extractor means comprising:
    - identifier means for identifying points along an edge within said grey level image using grey level values so that said points so identified represent substantially the strongest edge;
    - tracking means for automatically tracking the edge using grey level values to determine if the edge forms a closed loop and if so defining the edge as an outline, said identifier means identifying alternate points of the edge if the edge does not form a closed loop and said tracking means automatically tracking an alternate edge associated with said alternate points together with at least some of said points on said strongest edge and determining whether the alternate edge forms a closed loop and if so defining the alternate edge as the outline; and
    - means for producing data indicative of an object based on at least one outline identified in said image, each outline comprising at least a part of one character; and
  - processing means for processing the data provided by said edge extractor means to produce an output representative of the characters in said image.
  - Dependent claims: 2 to 17.
- 18. An image processing system comprising an edge extractor for automatically producing data indicative of the outline of objects in a grey level image generated by an optical scanning device, wherein said outline comprises at least part of one or more characters, said grey level image comprising a pixel array, said edge extractor comprising means for sequentially scanning said pixel array to locate a first strongest edge and for tracing the outline of one of the objects starting with said first strongest edge and by locating a plurality of additional strongest edges contiguous with one another to determine whether or not the entire outline can be traced using only said first and additional strongest edges and automatically locating alternate edges that are used to complete the entire outline when said additional strongest edges do not complete the outline, said edge extractor further comprising edge operator means for generating edge values for each pixel using grey level values, wherein the operation of the edge extractor is such that the edge value for a current pixel is derived from the relative values of the grey level values of the pixels which are immediately adjacent to the current pixel based on the direction of tracing.
- 23. An image processing system comprising means for scanning a document to generate an image represented by grey level values and automatically producing data representative of an outline of an object in the image comprising an edge extractor for tracing said outline by generating an edge value for a pixel derived from the grey level values of the pixels which are immediately adjacent to said pixel based on the direction of tracing, said edge extractor locating a strongest edge based on said edge values and tracing said outline by locating additional strongest edges if said additional strongest edges in combination complete the outline, said edge extractor automatically locating an alternate edge to complete the outline when said additional strongest edges do not complete the outline, said system further comprising a page segmentation facility for identifying groups of outlines to form said groups into blocks of outlines and for arranging said blocks into order, wherein said outlines are representative of at least a portion of one or more characters.
- 28. An image processing system comprising means for scanning a document to generate an image represented by grey level values and automatically producing output data representative of an outline of an object in the image, the system comprising an edge extractor for tracing said outline by generating an edge value for a pixel derived from the grey level values of pixels immediately adjacent to said pixel relative to the direction of tracing, said edge extractor locating a strongest edge based on said edge values and tracing said outline by locating additional strongest edges if said additional strongest edges in combination complete the outline, said edge extractor automatically locating an alternate edge to complete the outline when said additional strongest edges do not complete the outline, said system further comprising a facility for processing said data to identify particular features of each outline and to classify the outlines into classes of characters likely to represent the character for each outline in accordance with the identified features, and for classifying as spurious outlines those outlines not representing a character, wherein the features identified comprise one or more of concavity, closure, axis, symmetry and line.
- 31. A method of processing a document to produce data representative of characters on said document, said method comprising the steps of:
  - scanning said document to produce a pixel array of grey level values representing a grey level image of the document;
  - scanning the grey level values and for each grey level value comparing the grey level values of pixels immediately on either side of the pixel being scanned;
  - locating a first edge associated with one of the pixels based on the grey level values so compared;
  - automatically locating additional edges and tracing a strongest edge in said grey level image representing an outline comprising at least a portion of one or more characters in said image by comparing the grey level values immediately adjacent the pixel associated with each said additional edge; and
  - determining whether or not said strongest edge forms a complete outline;
  - automatically identifying an alternate edge that does form a complete outline when said strongest edge does not form a complete outline, wherein the alternate edge is identified by comparing grey level values;
  - processing data representing said strongest and said alternate edges that form complete outlines to provide data representative of the characters in said image, said processing step comprising automatically segmenting and classifying the data into blocks representative of a plurality of outlines, each outline comprising at least a portion of one or more characters.
  - Dependent claims: 32 to 34.
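Two ideas recur across the independent claims: an edge value computed from the pixels immediately on either side of the current pixel relative to the direction of tracing, and a trace that follows the strongest edge but falls back to an alternate edge when the strongest one does not close into an outline. The sketch below is one possible reading, not the patented implementation. The absolute grey-level difference as the edge value, the depth-first backtracking, and the names `edge_value`, `trace_outline`, `candidates` and `max_len` are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[int, int]  # (row, col)


def edge_value(img: Sequence[Sequence[int]], p: Point, direction: Point) -> int:
    """Edge strength at p for a given tracing direction (d_row, d_col).

    Assumption: strength is the absolute grey-level difference between the two
    pixels immediately to the left and right of the tracing direction.
    p must not lie on the image border.
    """
    r, c = p
    dr, dc = direction
    lr, lc = -dc, dr  # direction rotated by 90 degrees: the "left" side
    return abs(img[r + lr][c + lc] - img[r - lr][c - lc])


def trace_outline(
    start: Point,
    candidates: Callable[[Point], List[Tuple[int, Point]]],
    max_len: int = 500,
) -> Optional[List[Point]]:
    """Trace a closed loop from start, preferring the strongest edge at each step.

    candidates(p) returns (strength, next_point) pairs reachable from p.
    When the strongest continuation fails to close the loop, the next-strongest
    (an "alternate" edge) is tried, keeping the points already accepted.
    Returns the loop as a list of points, or None if no closed loop exists.
    """
    path = [start]
    on_path = {start}

    def extend(current: Point) -> bool:
        if len(path) > max_len:  # guard against runaway recursion
            return False
        for _strength, nxt in sorted(candidates(current), reverse=True):
            if nxt == start and len(path) >= 3:
                return True  # returned to the start: closed loop found
            if nxt in on_path:
                continue
            path.append(nxt)
            on_path.add(nxt)
            if extend(nxt):
                return True
            path.pop()  # dead end: backtrack and try an alternate
            on_path.remove(nxt)
        return False

    return path if extend(start) else None


# Toy example: the strongest step from (0, 1) leads to a dead end at (0, 2),
# so the trace falls back to the weaker alternate step to (1, 1).
graph = {
    (0, 0): [(9, (0, 1))],
    (0, 1): [(9, (0, 2)), (5, (1, 1))],
    (0, 2): [],
    (1, 1): [(7, (1, 0))],
    (1, 0): [(8, (0, 0))],
}
outline = trace_outline((0, 0), lambda p: graph.get(p, []))

img = [[0, 0, 0], [0, 0, 0], [200, 200, 200]]
horizontal = edge_value(img, (1, 1), (0, 1))  # tracing rightwards
vertical = edge_value(img, (1, 1), (1, 0))    # tracing downwards
```

In the toy graph the returned outline keeps the points already accepted on the strongest edge, `(0, 0)` and `(0, 1)`, and completes the loop through the alternate points. In a real system `candidates` would presumably be built from `edge_value` over neighbouring pixels; that wiring is left out here.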
Specification