Image-based document indexing and retrieval

  • US 7,475,061 B2
  • Filed: 01/15/2004
  • Issued: 01/06/2009
  • Est. Priority Date: 01/15/2004
  • Status: Active Grant
First Claim

1. A machine-implemented system for document retrieval and/or indexing comprising:

    a processor;

    a component that receives a captured image of at least a portion of a physical document;

    a search component that locates a match to the physical document, the search is performed over word-level topological properties of generated images, the word-level topological properties comprise at least respective widths of words on the generated images, and the generated images being images of at least a portion of one or more electronic documents; and

    a comparison component that iteratively compares a portion of a signature associated with the captured image based at least in part on word-level topological properties with corresponding portions of signatures respectively associated with the generated images based at least in part on word-level topological properties and excludes each generated image whose portion of the signature does not match the portion of the signature of the captured image to facilitate location of a match to the physical document,

      the portion of the signature associated with the captured image and the portion of the signatures respectively associated with the generated images that are compared become progressively smaller with each iteration, where one or more iterations are performed until a predetermined threshold number of generated images remain,

      wherein each portion of signature respectively associated with a generated image is a hash table that contains a plurality of table locations where a respective value corresponding to a respective segment of the generated image is entered into a respective table location for each segment of the generated image, and

      wherein the portion of the signature associated with the captured image is a hash table that contains a plurality of table locations where a respective value corresponding to a respective segment of the captured image is entered into a respective table location for each segment of the captured image.
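The comparison step recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes each word is reduced to a hypothetical (x, y, width) box, that a signature is a Python dict used as the hash table (table location = grid segment, value = quantized word width), that portions must match exactly, and that each iteration probes a fresh, smaller slice of table locations. The names `make_signature` and `find_matches`, the segment size, and the width bucket are all illustrative choices that do not come from the claim.

```python
from typing import Dict, List, Sequence, Tuple

Location = Tuple[int, int]          # (column, row) of an image segment
Signature = Dict[Location, int]     # hash table: segment location -> value


def make_signature(word_boxes: Sequence[Tuple[int, int, int]],
                   segment_size: int = 32,
                   width_bucket: int = 8) -> Signature:
    """Enter one value per segment, derived from the width of the word in it.

    word_boxes holds hypothetical (x, y, width) triples in pixels.
    """
    table: Signature = {}
    for x, y, width in word_boxes:
        location = (x // segment_size, y // segment_size)
        table[location] = width // width_bucket   # quantized word width
    return table


def find_matches(query: Signature,
                 candidates: Dict[str, Signature],
                 threshold: int = 1) -> List[str]:
    """Iteratively exclude candidates whose signature portion differs.

    Each pass compares a smaller portion than the previous one and stops
    once no more than `threshold` candidates remain.
    """
    locations = sorted(query)
    remaining = dict(candidates)
    start, size = 0, max(1, len(locations) // 2)
    while len(remaining) > threshold and size > 0 and start < len(locations):
        probe = locations[start:start + size]
        # Keep only candidates that agree with the query on every probed location.
        remaining = {
            name: sig for name, sig in remaining.items()
            if all(sig.get(loc) == query[loc] for loc in probe)
        }
        start += size     # move on to table locations not yet compared
        size //= 2        # the compared portion shrinks with each iteration
    return sorted(remaining)


# Captured image of a physical document (made-up word boxes).
query = make_signature([(0, 0, 40), (64, 0, 24), (0, 32, 56), (96, 32, 16)])

# Generated images of electronic documents (made-up word boxes).
candidates = {
    "doc_a": make_signature([(0, 0, 40), (64, 0, 24), (0, 32, 56), (96, 32, 16)]),
    "doc_b": make_signature([(0, 0, 40), (64, 0, 48), (0, 32, 56), (96, 32, 16)]),
    "doc_c": make_signature([(0, 0, 80), (64, 0, 24), (0, 32, 56), (96, 32, 16)]),
}

print(find_matches(query, candidates))               # → ['doc_a']
print(find_matches(query, candidates, threshold=2))  # → ['doc_a', 'doc_b']
```

In this toy run the first pass compares two table locations and excludes doc_c, and the second pass compares one location and excludes doc_b. A real captured image would be noisy, so an exact-equality test is a simplification; a tolerance on the compared values would be a natural substitute.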
