Method and system for identifying lines of text in a document

  • US 5,307,422 A
  • Filed: 06/25/1991
  • Issued: 04/26/1994
  • Est. Priority Date: 06/25/1991
  • Status: Expired due to Term
First Claim

1. A document processing method which is capable of processing complex documents comprising regions of text and graphics comprising the steps of scanning a document utilizing a scanner to form a digital image of the document, transmitting the digital image of the document to a recognition engine, electronically processing said digital image at said recognition engine by:

  • vertically dividing said digital image into columns of uniform width,
  • identifying vertical transitions between substantially blank and substantially non-blank portions within each uniform width column based solely on the corresponding portion of said digital image within each respective column, and
  • horizontally dividing each uniform width column at said vertical transitions between substantially blank and substantially non-blank portions of each column to form uniform width units which units each contains one substantially non-blank portion of each column, said units being vertically separated from each other by said substantially blank portions of said columns,
  • identifying units containing portions of text by eliminating units which have a height in pixels below a minimum threshold or a height above a maximum threshold,
  • utilizing said recognition engine to link units from adjacent columns which overlap vertically to identify lines of text, and
  • utilizing said recognition engine to arrange said units in a two dimensional array and storing in a memory associated with said recognition engine the height of each unit and information indicating the horizontal extend of each unit,
  • wherein said step of eliminating units whose height is above a maximum threshold comprises utilizing said recognition engine to determine the average height of the units and multiplying by a tolerance factor to obtain said maximum threshold.
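
The processing steps recited in claim 1 can be illustrated with a short Python sketch. This is one possible reading for illustration only, not the patentee's implementation. The names `Unit`, `find_units`, `filter_text_units`, `link_lines` and `make_demo_image`, and the values `col_width=4`, `min_height=2` and `tolerance=1.5`, are all hypothetical. The sketch also assumes a binary image (1 = ink), treats a row slice as "substantially blank" when it holds at most `blank_max` ink pixels, and averages the heights of all units before any are eliminated; the claim does not fix these details. The step of storing units in a two-dimensional array is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    col: int     # index of the uniform-width column
    x0: int      # left pixel edge (inclusive)
    x1: int      # right pixel edge (exclusive)
    top: int     # first non-blank row (inclusive)
    bottom: int  # row just past the last non-blank row (exclusive)

    @property
    def height(self):
        return self.bottom - self.top


def find_units(image, col_width, blank_max=0):
    """Divide the image into columns, then cut each column at blank/non-blank transitions."""
    rows, width = len(image), len(image[0])
    units = []
    for col, x0 in enumerate(range(0, width, col_width)):
        x1 = min(x0 + col_width, width)
        start = None
        for y in range(rows + 1):  # one extra pass closes a run touching the bottom edge
            # Blankness is judged only from the pixels inside this column.
            inked = y < rows and sum(image[y][x0:x1]) > blank_max
            if inked and start is None:
                start = y
            elif not inked and start is not None:
                units.append(Unit(col, x0, x1, start, y))
                start = None
    return units


def filter_text_units(units, min_height=2, tolerance=1.5):
    """Drop units that are too short, or taller than average height times a tolerance factor."""
    if not units:
        return []
    average = sum(u.height for u in units) / len(units)
    max_height = average * tolerance
    return [u for u in units if min_height <= u.height <= max_height]


def link_lines(units):
    """Group units from adjacent columns whose vertical extents overlap (union-find)."""
    parent = list(range(len(units)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, a in enumerate(units):
        for j, b in enumerate(units):
            if b.col == a.col + 1 and a.top < b.bottom and b.top < a.bottom:
                parent[find(j)] = find(i)

    groups = {}
    for i, u in enumerate(units):
        groups.setdefault(find(i), []).append(u)
    lines = [sorted(g, key=lambda u: u.col) for g in groups.values()]
    return sorted(lines, key=lambda g: g[0].top)


def make_demo_image():
    """12x12 test page: two text-like bands, a tall vertical rule, and a one-pixel speck."""
    img = [[0] * 12 for _ in range(12)]
    for y in (1, 2, 3, 6, 7, 8):
        for x in range(8):
            img[y][x] = 1
    for y in range(12):
        img[y][10] = 1  # graphic element spanning the full page height
    img[11][1] = 1      # isolated speck
    return img


if __name__ == "__main__":
    units = find_units(make_demo_image(), col_width=4)
    for line in link_lines(filter_text_units(units)):
        print([(u.col, u.top, u.bottom) for u in line])
```

On the demo image, the tall rule exceeds the average-based maximum and the speck falls below the minimum, so only the units of the two bands remain to be linked into lines.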
