Data organization and access for mixed media document system

  • US 9,171,202 B2
  • Filed: 07/31/2006
  • Issued: 10/27/2015
  • Est. Priority Date: 08/23/2005
  • Status: Active Grant
First Claim

1. A computer-implemented method for accessing information in a mixed media document system, the method comprising:

  • receiving an image patch of a target document;

    determining, with one or more processors, from the received image patch, a query that indicates a two-dimensional geometric relationship between a pair of document features in the target document, the two-dimensional geometric relationship including an indication that the pair of document features in the target document are a horizontally adjacent pair of document features or a vertically adjacent pair of document features;

    comparing, with the one or more processors, the query to an index table of document features from mixed media documents to identify candidate regions in the mixed media documents that comprise the query, the index table comprising locations of the document features in the mixed media documents; and

    responsive to comparing the query to the document features in the index table, identifying one or more of the mixed media documents comprising the identified candidate regions comprising the query by:

    adding a weight to an array of an accumulator for each cell in a zone around each pair of document features based on an inverse document frequency associated with each pair of document features, the inverse document frequency being inversely proportional to a number of document pages that contain the image patch;

    searching the array of the accumulator for a cell with a maximum value; and

    in response to the maximum value exceeding a threshold, reporting coordinates of the cell as a location of the image patch.
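
The lookup and voting steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation. The function names (`build_index`, `locate_patch`), the `"H"`/`"V"` adjacency flags, the grid size, the zone radius, and the threshold are all hypothetical choices made for illustration. The weight is approximated here as the reciprocal of the number of pages on which a feature pair occurs, which is one simple reading of the inverse document frequency described in the claim.

```python
from collections import defaultdict

def build_index(pages):
    """Build a hypothetical index table: feature pair -> list of (page, x, y).

    A feature pair is (feature_a, feature_b, relation), where relation is
    "H" for horizontally adjacent or "V" for vertically adjacent.
    """
    index = defaultdict(list)
    for page_id, entries in pages.items():
        for pair, (x, y) in entries:
            index[pair].append((page_id, x, y))
    return index

def locate_patch(query_pairs, index, grid=(8, 8), zone=1, threshold=1.5):
    """Vote into a per-page accumulator and return (page, x, y) or None."""
    width, height = grid
    acc = defaultdict(lambda: [[0.0] * width for _ in range(height)])
    for pair in query_pairs:
        hits = index.get(pair, [])
        if not hits:
            continue
        # Assumed IDF-style weight: rarer pairs contribute more.
        pages_with_pair = len({page_id for page_id, _, _ in hits})
        weight = 1.0 / pages_with_pair
        for page_id, x, y in hits:
            # Add the weight to every cell in a square zone around the pair.
            for cy in range(max(0, y - zone), min(height, y + zone + 1)):
                for cx in range(max(0, x - zone), min(width, x + zone + 1)):
                    acc[page_id][cy][cx] += weight
    # Search the accumulator for the cell with the maximum value.
    best = None
    for page_id, cells in acc.items():
        for cy, row in enumerate(cells):
            for cx, value in enumerate(row):
                if best is None or value > best[0]:
                    best = (value, page_id, cx, cy)
    # Report coordinates only if the maximum exceeds the threshold.
    if best is not None and best[0] > threshold:
        return best[1], best[2], best[3]
    return None

# Toy data: two pages with grid-cell locations of a few word pairs.
pages = {
    "docA": [
        (("the", "cat", "H"), (2, 3)),
        (("cat", "sat", "H"), (3, 3)),
        (("the", "on", "V"), (2, 4)),
    ],
    "docB": [
        (("the", "cat", "H"), (6, 1)),
    ],
}
index = build_index(pages)
query = [("the", "cat", "H"), ("cat", "sat", "H"), ("the", "on", "V")]
result = locate_patch(query, index)
print(result)  # ('docA', 2, 3)
```

In this toy run the pair that appears on both pages contributes a weight of 0.5, while the two pairs unique to `docA` contribute 1.0 each, so the overlapping zones on `docA` reach 2.5 and clear the assumed threshold of 1.5. When several cells tie for the maximum, this sketch reports the first one found in row order.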
