
Natural-language information processor with association searches limited within blocks

  • US 6,374,242 B1
  • Filed: 09/29/1999
  • Issued: 04/16/2002
  • Est. Priority Date: 09/29/1999
  • Status: Expired due to Term
First Claim

1. An information extracting block finding method for determining the structure of documents represented by one-dimensional text files, said method comprising the steps of:

    extracting from said text files at least some symbols representing two-dimensional spatial information;

    using said spatial information, at least temporarily storing said text files in a memory having a two-dimensional structure defining multiple grid cells;

    for at least some of said grid cells, examining at least two grid cells orthogonally adjacent to the grid cell under examination, and assigning from 0 to 4 of (a) left, (b) right, (c) top, and (d) bottom edge attributes to those boundaries between the grid cell under examination in which one of (a) the grid cell under examination includes a text symbol and the adjacent cell to the left lacks a text symbol, (b) the grid cell under examination includes a text symbol and the adjacent cell to the right lacks a text symbol, (c) the grid cell under examination includes a text symbol and the adjacent cell above the cell under examination lacks a text symbol, and (d) the grid cell under examination includes a text symbol, and the adjacent cell below the cell under examination lacks a text symbol, respectively, to thereby generate a list of cell edges, each defined by its edge attribute and its end locations;

    combining cell edges having the same left, right, top, or bottom edge attributes and an identical end location, to thereby form left, right, top and bottom block edges, respectively, defined by at least one of said left, right, top, and bottom attributes, and having locations defined by their end points;

    associating each top and bottom block edge with those left and right edges having common end points therewith, to form closed two-dimensional regions; and

    determining the spatial coordinates of a bounding box about each of said closed two-dimensional regions.
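
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, only one plausible reading of the claim language. Every function name (`to_grid`, `cell_edges`, `combine`, `bounding_boxes`, `find_blocks`) is hypothetical. The sketch assumes that newlines and tabs are the symbols carrying two-dimensional spatial information, that any non-space character counts as a text symbol, and that cells outside the grid count as empty. Points are `(x, y)` cell-corner coordinates, and boxes are `(left, top, right, bottom)`.

```python
from collections import defaultdict

def to_grid(text, tab=8):
    """Lay the one-dimensional text out on a 2-D grid of booleans.
    Assumption: newlines and tabs carry the spatial information."""
    rows = [line.expandtabs(tab) for line in text.split("\n")]
    width = max(len(r) for r in rows)
    return [[ch != " " for ch in r.ljust(width)] for r in rows]

def cell_edges(grid):
    """List (attribute, start, end) for every boundary between a filled
    cell and an empty (or out-of-grid) orthogonal neighbour."""
    h, w = len(grid), len(grid[0])
    def filled(r, c):
        return 0 <= r < h and 0 <= c < w and grid[r][c]
    edges = []
    for r in range(h):
        for c in range(w):
            if not grid[r][c]:
                continue
            if not filled(r, c - 1):
                edges.append(("left", (c, r), (c, r + 1)))
            if not filled(r, c + 1):
                edges.append(("right", (c + 1, r), (c + 1, r + 1)))
            if not filled(r - 1, c):
                edges.append(("top", (c, r), (c + 1, r)))
            if not filled(r + 1, c):
                edges.append(("bottom", (c, r + 1), (c + 1, r + 1)))
    return edges

def combine(edges):
    """Merge cell edges with the same attribute that share an end
    location into longer block edges."""
    def key(edge):
        attr, (x, y), _ = edge
        # Vertical edges chain along y at a fixed x; horizontal edges
        # chain along x at a fixed y.
        return (attr, x, y) if attr in ("left", "right") else (attr, y, x)
    merged = []
    for attr, start, end in sorted(edges, key=key):
        if merged and merged[-1][0] == attr and merged[-1][2] == start:
            merged[-1] = (attr, merged[-1][1], end)
        else:
            merged.append((attr, start, end))
    return merged

def bounding_boxes(block_edges):
    """Group block edges that share end points (union-find) and return
    one bounding box per group."""
    parent = list(range(len(block_edges)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    first_at = {}  # end point -> index of the first edge seen there
    for i, (_, start, end) in enumerate(block_edges):
        for point in (start, end):
            j = first_at.setdefault(point, i)
            parent[find(i)] = find(j)
    groups = defaultdict(list)
    for i, (_, start, end) in enumerate(block_edges):
        groups[find(i)] += [start, end]
    boxes = []
    for points in groups.values():
        xs = [x for x, _ in points]
        ys = [y for _, y in points]
        boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return sorted(boxes)

def find_blocks(text):
    """Run the whole pipeline: grid, cell edges, block edges, boxes."""
    return bounding_boxes(combine(cell_edges(to_grid(text))))

print(find_blocks("ab  cd\nab  cd\n\nefgh"))
# prints [(0, 0, 2, 2), (0, 3, 4, 4), (4, 0, 6, 2)]
```

Two simplifications are worth noting. Because every space counts as an empty cell, words separated by a single space end up in separate regions; a practical system would likely need some gap tolerance, which the claim does not specify. Grouping edges purely by shared end points also joins regions that touch only at a corner, and reports the inner boundary of a hole as its own region.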
