Compressed document matching

  • US 6,928,435 B2
  • Filed: 01/25/2002
  • Issued: 08/09/2005
  • Est. Priority Date: 11/03/1998
  • Status: Expired due to Fees
First Claim

1. An apparatus for determining if a query document matches one or more documents in a database, the apparatus comprising:

    means for identifying up endpoints and down endpoints in the query document, the up endpoints representing tops of features in the query document and the down endpoints representing bottoms of features in the query document;

    means for generating a set of descriptors for the query document based on locations of the up endpoints and the down endpoints;

    means for comparing the set of descriptors for the query document against respective sets of descriptors associated with the one or more documents in the database to determine if the query document matches at least one of the one or more documents;

    wherein the means for generating a set of descriptors for the query document based on locations of the up endpoints and the down endpoints comprises means for identifying text lines in the query document based on concentrations of up endpoints and down endpoints along scanlines of the query document; and

    means for generating the set of descriptors based on distances between selected up endpoints and selected down endpoints within the text lines in the query document; and

    wherein the means for identifying text lines in the document based on concentrations of up endpoints and down endpoints along scanlines of the document comprises:

    means for determining the number of up endpoints and the number of down endpoints that lie on each of the scanlines; and

    means for identifying respective pairs of scanlines that have a local maximum number of up endpoints and a local maximum number of down endpoints as text lines.
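
The text-line step recited in the last two elements can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the up and down endpoints have already been extracted as (x, y) pixel coordinates, with y as the scanline index increasing downward. The function names, the `min_count` threshold, and the rule that pairs each up-endpoint peak with the first down-endpoint peak below it are all hypothetical choices made for illustration; the claim does not specify them.

```python
from collections import Counter


def count_per_scanline(endpoints, height):
    """Return a list whose entry y is the number of endpoints on scanline y."""
    counts = Counter(y for _, y in endpoints)
    return [counts.get(y, 0) for y in range(height)]


def local_maxima(counts, min_count=1):
    """Return scanlines whose count is a local maximum.

    A scanline qualifies if its count is at least min_count, strictly
    greater than the count above it, and not less than the count below it
    (so a flat plateau yields a single peak). min_count is an assumed
    noise threshold, not something stated in the claim.
    """
    peaks = []
    for y, c in enumerate(counts):
        above = counts[y - 1] if y > 0 else 0
        below = counts[y + 1] if y + 1 < len(counts) else 0
        if c >= min_count and c > above and c >= below:
            peaks.append(y)
    return peaks


def find_text_lines(up_endpoints, down_endpoints, height, min_count=2):
    """Pair up-endpoint peaks with down-endpoint peaks to form text lines.

    Each returned tuple is (top_scanline, bottom_scanline). The pairing
    rule (first down peak below the up peak and above the next up peak)
    is an assumption for this sketch.
    """
    up_peaks = local_maxima(count_per_scanline(up_endpoints, height), min_count)
    down_peaks = local_maxima(count_per_scanline(down_endpoints, height), min_count)

    lines = []
    j = 0
    for i, top in enumerate(up_peaks):
        next_top = up_peaks[i + 1] if i + 1 < len(up_peaks) else height
        # Skip down peaks that lie on or above this top scanline.
        while j < len(down_peaks) and down_peaks[j] <= top:
            j += 1
        if j < len(down_peaks) and down_peaks[j] < next_top:
            lines.append((top, down_peaks[j]))
            j += 1
    return lines


# Made-up endpoint coordinates for two text lines on a 16-scanline page.
up = [(1, 2), (5, 2), (9, 2), (7, 3), (1, 10), (4, 10), (8, 10)]
down = [(1, 6), (5, 6), (9, 6), (2, 14), (4, 14), (8, 14)]
text_lines = find_text_lines(up, down, height=16)
```

With this made-up input, `text_lines` holds one (top, bottom) scanline pair per detected text line. Descriptor generation from distances between selected endpoints within each line, and the comparison against database descriptors, are not shown here because the claim does not say which endpoints are selected.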
