
Extracting information from symbolically compressed document images

  • US 6,658,151 B2
  • Filed: 04/08/1999
  • Issued: 12/02/2003
  • Est. Priority Date: 04/08/1999
  • Status: Expired due to Fees
First Claim

1. A method comprising:

    representing an input document image as a symbolically compressed representation with a sequence of template identifiers;

    replacing the template identifiers with alphabet characters according to language statistics to generate a text string representative of text in the input document image; and

    extracting conditional n-gram indexing terms from the text string by selecting alphabet characters in the text stream that satisfy a predicate that indicates a subset of combinations of characters in the text string.
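The three steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how the language statistics are applied, so this sketch assumes a simple frequency-rank substitution, and the predicate used here (select characters that begin a word) is only one hypothetical example of a predicate. The names `FREQ_ORDER`, `decode_by_frequency`, `follows_space`, and `conditional_ngrams` are all invented for illustration.

```python
from collections import Counter

# Assumed English character-frequency order, space first (illustrative only).
FREQ_ORDER = " etaoinshrdlcumwfgypbvkjxqz"


def decode_by_frequency(template_ids):
    """Replace template identifiers with characters by frequency rank.

    Assumption: the most frequent template is mapped to the most frequent
    character, and so on. This is a simplification standing in for
    whatever language-statistics method is actually used.
    """
    ranked = [tid for tid, _ in Counter(template_ids).most_common()]
    mapping = {
        tid: FREQ_ORDER[rank] if rank < len(FREQ_ORDER) else "?"
        for rank, tid in enumerate(ranked)
    }
    return "".join(mapping[tid] for tid in template_ids)


def follows_space(text, i):
    """Hypothetical predicate: the character at index i starts a word."""
    return i == 0 or text[i - 1] == " "


def conditional_ngrams(text, n, predicate):
    """Form n-grams only from the non-space characters that satisfy the predicate."""
    selected = [
        ch for i, ch in enumerate(text) if ch != " " and predicate(text, i)
    ]
    return ["".join(selected[i:i + n]) for i in range(len(selected) - n + 1)]
```

For example, `conditional_ngrams("the quick brown fox jumps", 2, follows_space)` selects the characters `t`, `q`, `b`, `f`, `j` and returns `['tq', 'qb', 'bf', 'fj']`. Because the indexing terms are built only from the selected characters, a substitution error elsewhere in the string does not affect them.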
