
System for searching a corpus of document images by user specified document layout components

  • US 5,999,664 A
  • Filed: 11/14/1997
  • Issued: 12/07/1999
  • Est. Priority Date: 11/14/1997
  • Status: Expired due to Term
First Claim

1. A method for searching a corpus of document images stored in a memory, comprising the steps of:

  • segmenting each document image in the corpus of document images into a first set of layout objects;

    each layout object in the first set of layout objects being one of a plurality of layout object types;

    each of the plurality of layout object types identifying a structural element of a document;

    for each segmented document image, computing attributes for each layout object in the first set of layout objects;

    the computed attributes of each layout object having values that quantify properties of a structural element and identify spatial relationships with other segmented layout objects;

    providing a program interface for composing a routine that includes a sequence of selection operations for identifying certain of the layout objects in the first set of layout objects of an example document image selected from the corpus of document images;

    the certain layout objects defining a feature of the example document image;

    executing the sequence of selection operations of the routine for identifying the feature of the example document image in ones of the document images in the corpus of document images;

    for each segmented document image, the sequence of selection operations receiving as input the first set of layout objects and the computed attributes to produce as output a second set of layout objects;

    said executing step identifying the ones of the document images in the corpus of document images that include at least one layout object in the second set of layout objects as having the feature of the example document image; and

    displaying at the program interface the ones of the document images in the corpus of document images identified by said executing step.
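As a rough illustration of how the claimed steps fit together, the following Python sketch runs a routine of selection operations over pre-segmented pages. It is not taken from the patent: the `LayoutObject` class, the attribute names (`area`, `rel_top`, `n_above`), the helper functions, and the sample corpus are all hypothetical, and the segmentation step is assumed to have already produced the layout objects.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class LayoutObject:
    """One segmented layout object (hypothetical structure, not from the patent)."""
    kind: str      # layout object type, e.g. "text_block" or "graphic"
    x: float       # left edge of the bounding box
    y: float       # top edge of the bounding box
    w: float       # width
    h: float       # height
    attrs: Dict[str, float] = field(default_factory=dict)


# A selection operation maps a set of layout objects to a subset of it.
SelectionOp = Callable[[List[LayoutObject]], List[LayoutObject]]


def compute_attributes(objs: List[LayoutObject], page_height: float) -> List[LayoutObject]:
    """Attach illustrative attributes: one property and two spatial relationships."""
    for o in objs:
        o.attrs["area"] = o.w * o.h
        o.attrs["rel_top"] = o.y / page_height
        # Number of other objects lying entirely above this one.
        o.attrs["n_above"] = sum(1 for p in objs if p is not o and p.y + p.h <= o.y)
    return objs


def run_routine(routine: List[SelectionOp], objs: List[LayoutObject]) -> List[LayoutObject]:
    """Apply each selection operation in sequence; the result is the second set."""
    for op in routine:
        objs = op(objs)
    return objs


def search(corpus: Dict[str, List[LayoutObject]], routine: List[SelectionOp]) -> List[str]:
    """Return ids of documents whose second set of layout objects is non-empty."""
    return [doc_id for doc_id, objs in corpus.items() if run_routine(routine, objs)]


# Assumed example feature: a text block near the top of the page with nothing above it.
routine: List[SelectionOp] = [
    lambda objs: [o for o in objs if o.kind == "text_block"],
    lambda objs: [o for o in objs if o.attrs["rel_top"] < 0.15],
    lambda objs: [o for o in objs if o.attrs["n_above"] == 0],
]

# Made-up corpus of two pre-segmented pages, each 1000 units tall.
corpus = {
    "memo": compute_attributes(
        [LayoutObject("text_block", 100, 20, 600, 60),
         LayoutObject("text_block", 100, 200, 600, 600)],
        page_height=1000,
    ),
    "figure_page": compute_attributes(
        [LayoutObject("graphic", 100, 50, 600, 400),
         LayoutObject("text_block", 100, 500, 600, 300)],
        page_height=1000,
    ),
}

hits = search(corpus, routine)
print(hits)
```

Here each selection operation is a plain function from a list of layout objects to a filtered list, so a routine is simply an ordered list of such functions. A document counts as having the feature when at least one layout object survives the whole sequence, which mirrors the final identifying step of the claim.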
