Generating signatures over a document

  • US 7,434,058 B2
  • Filed: 06/07/2004
  • Issued: 10/07/2008
  • Est. Priority Date: 06/07/2004
  • Status: Active Grant
First Claim

1. A method for generating a plurality of signatures over a document, the method comprising:

    extracting content from the document;

    normalizing the extracted content;

    generating the plurality of signatures using the normalized content; and

    tokenizing the normalized content prior to generating the plurality of signatures;

    wherein tokenizing the normalized content comprises creating an ordered list of tokens by converting each delimited item in the normalized content into a token;

    wherein generating the plurality of signatures comprises:

    i) selecting M consecutive tokens from the ordered list of tokens, M being a positive integer;

    ii) selecting N special tokens from the M consecutive tokens, N being a positive integer not greater than M;

    iii) generating a signature by calculating a hash over the N special tokens;

    iv) skipping ahead P tokens from the first of the M consecutive tokens, P being a positive integer; and

    v) repeating i), ii), iii), and iv) until the plurality of signatures have been generated.
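
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions because the claim leaves them open: `normalize` lowercases the text and collapses punctuation and whitespace, tokens are whitespace-delimited, the hypothetical `select_special` picks the N longest tokens in each window as the "special" tokens, and SHA-256 stands in for the unspecified hash. The function names are illustrative only.

```python
import hashlib
import re


def normalize(text: str) -> str:
    # Assumption: lowercase, turn runs of non-alphanumerics into one space.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def tokenize(normalized: str) -> list[str]:
    # Ordered list of tokens, one per whitespace-delimited item.
    return normalized.split()


def select_special(window: list[str], n: int) -> list[str]:
    # Hypothetical criterion: the n longest tokens (earlier position wins
    # ties), returned in their original order within the window.
    ranked = sorted(range(len(window)), key=lambda i: (-len(window[i]), i))[:n]
    return [window[i] for i in sorted(ranked)]


def generate_signatures(text: str, m: int, n: int, p: int) -> list[str]:
    if not (1 <= n <= m) or p < 1:
        raise ValueError("require 1 <= N <= M and P >= 1")
    tokens = tokenize(normalize(text))
    signatures = []
    start = 0
    # Stop once fewer than M tokens remain (an assumed stopping rule).
    while start + m <= len(tokens):
        window = tokens[start:start + m]           # step i
        special = select_special(window, n)        # step ii
        data = " ".join(special).encode("utf-8")
        signatures.append(hashlib.sha256(data).hexdigest())  # step iii
        start += p                                 # step iv
    return signatures                              # step v: loop above


if __name__ == "__main__":
    doc = "The quick brown fox jumps over the lazy dog"
    for sig in generate_signatures(doc, m=4, n=2, p=2):
        print(sig)
```

With P smaller than M the windows overlap, so a small local edit to the document changes only the signatures whose windows cover the edited tokens.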

  • 14 Assignments