System for document layout analysis

  • US 5,784,487 A
  • Filed: 05/23/1996
  • Issued: 07/21/1998
  • Est. Priority Date: 05/23/1996
  • Status: Expired due to Term
First Claim

1. A document layout analysis method for determining document structure data from input data including the content and characteristics of regions of at least one page forming the document, the method including the steps of:

  • identifying particular types of sections within the page, wherein the step of identifying sections within the page further comprises the steps of identifying rows and columns within a table, identifying headers and footers on a page, and classifying the headers and footers based upon a horizontal location;

    identifying captions, wherein the step of identifying captions further comprises identifying an association between a non-text region of the page and the caption identified; and

    determining boundaries of at least one column on the page, wherein the step of determining boundaries of at least one column on the page further comprises the steps of identification of text regions, grouping of text regions on the page into sections, grouping of the text regions within a section into the at least one column, determining a column width for the at least one column on the page, and determining, using the column width, the number of columns in each section.
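
The column-determination step of the claim can be illustrated with a minimal sketch. This is not the patented procedure: the claim does not say how regions are grouped or how the width is computed, so the sketch assumes that text regions arrive as bounding boxes, that regions overlapping horizontally belong to the same column, and that the representative column width is the median column extent. The names `Region`, `group_into_columns`, `estimate_column_count`, and `gap_tolerance` are hypothetical.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Region:
    """Bounding box of one text region (hypothetical structure, page units)."""
    left: float
    top: float
    right: float
    bottom: float


def group_into_columns(regions, gap_tolerance=0.0):
    """Group the text regions of one section into columns.

    Assumption: regions whose horizontal extents overlap (within
    gap_tolerance) belong to the same column.
    """
    columns = []  # each entry is the list of regions in one column
    spans = []    # (left, right) horizontal extent of each column
    for region in sorted(regions, key=lambda r: r.left):
        if spans and region.left <= spans[-1][1] + gap_tolerance:
            # Overlaps the current column: extend it.
            columns[-1].append(region)
            spans[-1] = (spans[-1][0], max(spans[-1][1], region.right))
        else:
            # Starts to the right of the current column: open a new one.
            columns.append([region])
            spans.append((region.left, region.right))
    return columns, spans


def estimate_column_count(spans, section_width):
    """Use a representative column width to estimate the column count.

    Assumption: the median column extent is the column width, and the
    count is how many such widths fit across the section.
    """
    if not spans:
        return 0, 0.0
    width = median(right - left for left, right in spans)
    if width <= 0:
        return 0, width
    return max(1, int(section_width // width)), width


# Example section: two stacked regions on the left, one on the right.
section = [
    Region(0, 0, 280, 100),
    Region(0, 110, 280, 200),
    Region(320, 0, 600, 150),
]
columns, spans = group_into_columns(section)
count, width = estimate_column_count(spans, section_width=600)
```

In this example the two left-hand regions merge into one column and the right-hand region forms a second, so both the grouping and the width-based estimate give two columns. Floor division is used so that the gutters between columns do not inflate the estimate.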
