Document fingerprinting with asymmetric selection of anchor points

  • US 8,359,472 B1
  • Filed: 03/25/2010
  • Issued: 01/22/2013
  • Est. Priority Date: 03/25/2010
  • Status: Active Grant
First Claim

1. A computer-implemented process for generating document fingerprints, the method being performed using a computer including at least a processor, data storage, and computer-readable instructions, and the method comprising:

  • normalizing a document to create a normalized text string;

  • applying a first hash function with a sliding hash window to the normalized text string to generate an array of hash values;

  • applying a first filter to the array of hash values to select candidate anchoring points;

  • applying a second filter to the candidate anchoring points to select anchoring points; and

  • applying a second hash function to substrings located at the selected anchoring points to generate hash values for use as fingerprints of the document.
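
The five steps recited in the claim can be illustrated with a minimal Python sketch. This is not the patented implementation: the claim does not specify the normalization rules, the hash functions, or either filter, so every concrete choice below is an assumption made for illustration. The function names, the polynomial rolling hash, the modulus test used as the first filter, the minimum-spacing rule used as the second filter, the use of SHA-256 as the second hash, and the parameters `w`, `p`, `min_gap`, and `span` are all hypothetical.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Step 1 (assumed rule): lowercase and keep only a-z and 0-9."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def window_hashes(s: str, w: int) -> list[int]:
    """Step 2 (assumed hash): polynomial rolling hash of every length-w window."""
    base, mod = 257, (1 << 61) - 1
    if len(s) < w:
        return []
    top = pow(base, w - 1, mod)  # weight of the character leaving the window
    h = 0
    for ch in s[:w]:
        h = (h * base + ord(ch)) % mod
    out = [h]
    for i in range(w, len(s)):
        # Drop s[i - w], shift the window, and append s[i].
        h = ((h - ord(s[i - w]) * top) * base + ord(s[i])) % mod
        out.append(h)
    return out


def fingerprints(doc: str, w: int = 8, p: int = 4,
                 min_gap: int = 8, span: int = 16) -> set[str]:
    """Run the five claimed steps with hypothetical filters and parameters."""
    s = normalize(doc)
    hashes = window_hashes(s, w)
    # Step 3 (assumed first filter): positions whose hash is divisible by p.
    candidates = [i for i, h in enumerate(hashes) if h % p == 0]
    # Step 4 (assumed second filter): keep a candidate only if it lies at
    # least min_gap positions after the previously kept anchor.
    anchors: list[int] = []
    for i in candidates:
        if not anchors or i - anchors[-1] >= min_gap:
            anchors.append(i)
    # Step 5 (assumed second hash): truncated SHA-256 of the substring
    # of up to span characters starting at each anchor.
    return {hashlib.sha256(s[i:i + span].encode()).hexdigest()[:16]
            for i in anchors}
```

Because all hashing runs on the normalized string, two copies of a document that differ only in case, spacing, or punctuation produce the same fingerprint set under these assumptions, for example `fingerprints("Hello,  World!") == fingerprints("hello world")`.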
