Method for image segmentation and classification of image elements for documents processing
First Claim
1. Method for removing unwanted information, lines or printed characters from documents prior to character recognition of written information, comprising the steps of:
1) segmentation of an image into image elements;
searching each image element to determine if it comprises more than one image element by scanning a pixel array in a horizontal and a vertical direction, and identifying a common border between two parallel pixel runs, said common border having a length below a threshold value;
cutting a connection between said two parallel runs at said common border to break an image element having said common border into several image elements;
2) extraction of feature information from each image element;
3) classification of each of the image elements;
4) removal of those image elements which are classified as unwanted information, lines and printed characters; and
5) processing remaining image elements for writing recognition.
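The splitting part of step 1 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a binary image given as a list of rows of 0/1 values, covers only the horizontal scan (the vertical scan could apply the same routine to the transposed array), and takes the common border to be the column overlap between runs on adjacent rows. The names `row_runs`, `split_elements` and `min_border` are hypothetical.

```python
def row_runs(image):
    """Return horizontal foreground runs as (row, start_col, end_col_exclusive)."""
    runs = []
    for r, row in enumerate(image):
        c = 0
        while c < len(row):
            if row[c]:
                start = c
                while c < len(row) and row[c]:
                    c += 1
                runs.append((r, start, c))
            else:
                c += 1
    return runs


def split_elements(image, min_border):
    """Group runs into image elements (assumed interpretation of the claim).

    Runs on adjacent rows are linked only if their common border, measured
    as the number of overlapping columns, is at least min_border pixels.
    Shorter borders are treated as cut, so the element falls apart there.
    """
    runs = row_runs(image)
    parent = list(range(len(runs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (r1, s1, e1) in enumerate(runs):
        for j in range(i + 1, len(runs)):
            r2, s2, e2 = runs[j]
            if r2 > r1 + 1:
                break  # runs are ordered by row, nothing further is adjacent
            if r2 != r1 + 1:
                continue  # same row, never adjacent
            border = min(e1, e2) - max(s1, s2)
            if border >= min_border:
                parent[find(i)] = find(j)

    groups = {}
    for i, run in enumerate(runs):
        groups.setdefault(find(i), []).append(run)
    return list(groups.values())
```

With a threshold of 1 every touching pair of runs stays linked, which reduces to ordinary connected-component grouping; larger thresholds break elements apart at narrow necks, and the neck itself then becomes a separate small element.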
Abstract
A method to segment, classify and clean an image is presented. It may be used in applications whose input image data contains different classes of elements. The method finds, separates and classifies those elements. Only significant elements need to be kept for further processing, so the amount of data to be processed may be significantly reduced.
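The find, classify and clean flow can be sketched as follows. The text above does not specify which features or which classifier are used, so everything here is an assumption made for illustration: elements are lists of horizontal runs `(row, start_col, end_col_exclusive)`, the features are bounding-box width, height and pixel count, and the rule in `classify` (with its hypothetical `line_ratio` parameter) simply treats very elongated elements as lines.

```python
def features(element):
    """Bounding-box width, height and pixel count of one element.

    element: list of horizontal runs (row, start_col, end_col_exclusive).
    """
    rows = [r for r, _, _ in element]
    width = max(e for _, _, e in element) - min(s for _, s, _ in element)
    height = max(rows) - min(rows) + 1
    pixels = sum(e - s for _, s, e in element)
    return {"width": width, "height": height, "pixels": pixels}


def classify(feat, line_ratio=10):
    """Hypothetical rule: very elongated elements are labelled as lines."""
    long_side = max(feat["width"], feat["height"])
    short_side = min(feat["width"], feat["height"])
    return "line" if long_side >= line_ratio * short_side else "keep"


def clean(elements, line_ratio=10):
    """Drop elements classified as unwanted and return the remaining ones."""
    return [el for el in elements
            if classify(features(el), line_ratio) != "line"]
```

Only the elements returned by `clean` would be passed on to recognition, which is where the reduction in processed data comes from.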
71 Citations
14 Claims
Claims 2 to 14 depend on claim 1.
Specification