Systems and methods for retrieving tabular data from textual sources

  • US 5,950,196 A
  • Filed: 07/25/1997
  • Issued: 09/07/1999
  • Est. Priority Date: 07/25/1997
  • Status: Expired due to Fees
First Claim

1. A method for identifying tables and their component fields, the tables embedded in a text document, the method comprising the steps of:

    (a) storing in a memory element a character alignment graph, the graph indicating the number of text characters appearing in a particular horizontal location for each of a predetermined number of contiguous lines of text in the text document;

    (b) identifying one of the predetermined number of contiguous lines as belonging to a table when the indication of the number of text characters appearing in a particular horizontal location for that predetermined number of contiguous lines fall below a predetermined threshold;

    (c) forming an extracted table from all of the identified predetermined numbers of contiguous lines; and

    (d) identifying one or more captions for the extracted table on the basis of structural patterns contained in the extracted table.
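
Steps (a) and (b) can be illustrated with a minimal Python sketch. It is one possible reading of the claim language, not the patented implementation. The function names, the window size of 3 lines, the threshold of 1, and the `min_gap` parameter (a minimum gutter width, which the claim does not mention) are all assumptions made for illustration. The sketch counts non-space characters per horizontal position over a sliding window of contiguous lines, then flags the window's lines as table lines when a run of positions between occupied columns has a count below the threshold.

```python
def alignment_counts(lines):
    """Count non-space characters at each horizontal position across lines."""
    width = max((len(line) for line in lines), default=0)
    counts = [0] * width
    for line in lines:
        for col, ch in enumerate(line):
            if not ch.isspace():
                counts[col] += 1
    return counts


def has_column_gap(counts, threshold, min_gap):
    """True if at least min_gap adjacent positions fall below the threshold
    somewhere between the first and last occupied positions."""
    occupied = [i for i, c in enumerate(counts) if c >= threshold]
    if not occupied:
        return False
    run = 0
    for c in counts[occupied[0]:occupied[-1] + 1]:
        run = run + 1 if c < threshold else 0
        if run >= min_gap:
            return True
    return False


def find_table_lines(lines, window=3, threshold=1, min_gap=2):
    """Return sorted indices of lines that fall in a window showing a gap.

    window, threshold, and min_gap are illustrative values, not taken
    from the patent.
    """
    table = set()
    for start in range(len(lines) - window + 1):
        counts = alignment_counts(lines[start:start + window])
        if has_column_gap(counts, threshold, min_gap):
            table.update(range(start, start + window))
    return sorted(table)


sample = [
    "Quarterly results were strong overall.",
    "Region     Q1     Q2",
    "North      10     12",
    "South       8      9",
    "Totals are reported in millions.",
]
print(find_table_lines(sample))  # [1, 2, 3]
```

The `min_gap` check is a design choice added here to keep single word spaces that happen to line up in ordinary prose from being mistaken for a column gutter. Caption identification in step (d) is not covered by this sketch.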
