System for Document De-Duplication and Modification Detection

  • US 20090192978A1
  • Filed: 02/27/2008
  • Published: 07/30/2009
  • Est. Priority Date: 01/29/2008
  • Status: Active Grant
First Claim

1. A method for the organization and collection of documents, comprising:

    collecting a first document in response to a document collection request;

    generating a first hash code corresponding to a non-metadata portion of the first document;

    comparing the first hash code to a plurality of hash codes, each hash code of the plurality of hash codes corresponding to a non-metadata portion of a corresponding document of a plurality of documents, each document collected in response to the document collection request;

    if the first hash code does not match any hash code of the plurality of hash codes, storing the first document, including a metadata portion and the non-metadata portion, on a data storage; and

    if the first hash code matches any hash code of the plurality of hash codes, extracting metadata corresponding to the first document; and

    storing on the data storage the extracted metadata, but not the non-metadata portion of the first document, in conjunction with the particular document corresponding to the hash code that matches the first hash code.
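
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `CollectionStore`, `StoredDocument`, and `collect` are hypothetical, the in-memory dictionary merely stands in for the claimed "data storage", and SHA-256 is an assumed choice because the claim does not name a hash function. The sketch also assumes the caller has already separated each document into its non-metadata content and its metadata.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredDocument:
    """One unique non-metadata portion plus the metadata of every copy seen."""
    content: bytes                                 # stored only once
    metadata: list = field(default_factory=list)   # one entry per collected copy


class CollectionStore:
    """Hypothetical in-memory stand-in for the claim's "data storage"."""

    def __init__(self):
        # Maps hash code of the non-metadata portion -> stored document.
        self._by_hash = {}

    def collect(self, content: bytes, metadata: dict) -> bool:
        """Store one collected document; return True if its content was new."""
        # Hash only the non-metadata portion (SHA-256 is an assumption).
        digest = hashlib.sha256(content).hexdigest()
        existing = self._by_hash.get(digest)
        if existing is None:
            # No match: store both the metadata and the non-metadata portion.
            self._by_hash[digest] = StoredDocument(content, [metadata])
            return True
        # Match: keep only the metadata, attached to the matching document.
        existing.metadata.append(metadata)
        return False


store = CollectionStore()
store.collect(b"Quarterly report", {"path": "/a/report.doc", "owner": "alice"})
store.collect(b"Quarterly report", {"path": "/b/copy.doc", "owner": "bob"})
```

After the two calls above, the store holds a single copy of the content with two metadata entries, so a duplicate costs only the size of its metadata. A production system would presumably persist the hash index and might verify content byte-for-byte on a hash match, neither of which the claim text specifies.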
