
Method and apparatus for computer understanding and manipulation of minimally formatted text documents

  • US 5,164,899 A
  • Filed: 05/01/1989
  • Issued: 11/17/1992
  • Est. Priority Date: 05/01/1989
  • Status: Expired due to Term
First Claim

1. An apparatus for analyzing a text document on a computer having a memory, said text document comprising one or more blocks of contiguous text, said blocks of contiguous text having one or more boundaries, said apparatus comprising:

    digital imaging means for scanning the text document and for converting the text document into a matrix of picture elements representing a digital image of the text document;

    spatial analysis means coupled to the digital imaging means for scanning the matrix of picture elements in two dimensions to identify horizontal and vertical boundaries of the blocks of contiguous text in the digital image;

    character recognition means coupled to the spatial analysis means for converting the blocks of contiguous text into a text file, said text file comprising word patterns and block identifiers;

    a grammar stored in said memory comprising predetermined word patterns and block identifiers;

    extractor means coupled to the memory and character recognition means for matching the word patterns and block identifiers contained in the grammar with the word patterns and block identifiers in the text file; and

    formatting means coupled to the extractor means for generating output in a predefined pattern using the matched word patterns and block identifiers.
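
The chain of elements recited above (spatial analysis, grammar-based extraction, formatting) can be illustrated with a toy sketch. This is not the patented implementation. It assumes a binary pixel matrix, uses simple projection profiles as a stand-in for the spatial analysis means, and represents the grammar as regular expressions keyed by block identifier. The names `runs`, `find_blocks`, `GRAMMAR`, `extract`, and `format_output`, along with the invoice-style patterns, are all hypothetical and chosen only for illustration. The character recognition step is skipped, so the text file is supplied by hand.

```python
import re

def runs(flags):
    """Return (start, end) pairs for each maximal run of True values (end exclusive)."""
    spans, start = [], None
    for i, on in enumerate(flags):
        if on and start is None:
            start = i
        elif not on and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(flags)))
    return spans

def find_blocks(image):
    """Find block boundaries in a 0/1 pixel matrix using projection profiles.

    Assumption: any fully blank row or column separates blocks. A real system
    would tolerate small gaps such as the spacing between characters.
    Returns (top, bottom, left, right) boxes with bottom/right exclusive.
    """
    boxes = []
    row_has_ink = [any(row) for row in image]
    for top, bottom in runs(row_has_ink):          # horizontal boundaries
        band = image[top:bottom]
        col_has_ink = [any(r[c] for r in band) for c in range(len(image[0]))]
        for left, right in runs(col_has_ink):      # vertical boundaries
            boxes.append((top, bottom, left, right))
    return boxes

# Hypothetical grammar: one predetermined word pattern per block identifier.
GRAMMAR = {
    "HEADER": re.compile(r"Invoice\s+No\.\s*(?P<invoice_no>\d+)"),
    "TOTAL": re.compile(r"Total:\s*(?P<total>\d+\.\d{2})"),
}

def extract(text_file, grammar):
    """Match each (block_id, text) entry against the grammar pattern for its identifier."""
    fields = {}
    for block_id, text in text_file:
        pattern = grammar.get(block_id)
        match = pattern.search(text) if pattern else None
        if match:
            fields.update(match.groupdict())
    return fields

def format_output(fields):
    """Emit the matched fields in a fixed, predefined key=value layout."""
    return ",".join(f"{key}={fields[key]}" for key in sorted(fields))

if __name__ == "__main__":
    image = [
        [0, 0, 0, 0, 0, 0],
        [0, 1, 1, 0, 1, 0],
        [0, 1, 1, 0, 1, 0],
        [0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 0, 0],
    ]
    print(find_blocks(image))
    # Stand-in for the character recognition output: block identifiers plus text.
    text_file = [("HEADER", "Invoice No. 1042"), ("BODY", "misc"), ("TOTAL", "Total: 99.50")]
    print(format_output(extract(text_file, GRAMMAR)))
```

Keying the grammar by block identifier mirrors the claim language, in which both word patterns and block identifiers take part in the match; blocks whose identifier has no grammar entry are simply ignored.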

  • 6 Assignments