
EFFICIENT NEAR-DUPLICATE DATA IDENTIFICATION AND ORDERING VIA ATTRIBUTE WEIGHTING AND LEARNING

  • US 20110069833A1
  • Filed: 09/14/2009
  • Published: 03/24/2011
  • Est. Priority Date: 09/12/2007
  • Status: Abandoned Application
First Claim

1. A method of reducing redundancy and increasing processing throughput of an archiving process, comprising the steps of:

    (a) providing an input data set having a plurality of data elements and/or files;

    (a) detecting exact duplicate and approximately duplicate data elements or files that are either exactly similar or most likely similar; and

    (b) storing references and/or differences to previously archived data;

    wherein step (b) does not include the step of storing the duplicate or matched pairs of data using a standard compression technique.
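
The steps of the claim can be illustrated with a minimal sketch. It is not the applicant's implementation: the claim does not say how similarity is measured, so plain string similarity from Python's `difflib` stands in for the attribute weighting and learning named in the title. The function names `archive` and `restore`, the record layout, and the `threshold` value are all hypothetical. Exact duplicates are stored as references, near duplicates as differences against an earlier item, and everything else in full, with no general-purpose compression applied.

```python
import hashlib
from difflib import SequenceMatcher


def archive(items, threshold=0.8):
    """Return one record per item: ('full', text), ('ref', i) or ('delta', i, ops).

    Hypothetical sketch: `threshold` is an assumed similarity cut-off,
    not a value taken from the patent application.
    """
    by_hash = {}       # content digest -> index of the first item with that content
    full_indices = []  # indices of items stored in full (candidate delta bases)
    records = []
    for idx, text in enumerate(items):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in by_hash:
            # Exact duplicate: store only a reference to the earlier item.
            records.append(("ref", by_hash[digest]))
            continue
        by_hash[digest] = idx

        # Near-duplicate search: most similar fully stored item at or above threshold.
        best_i, best_ratio = None, threshold
        for i in full_indices:
            ratio = SequenceMatcher(None, items[i], text).ratio()
            if ratio >= best_ratio:
                best_i, best_ratio = i, ratio

        if best_i is None:
            full_indices.append(idx)
            records.append(("full", text))
        else:
            # Store only the differing spans relative to the base item.
            matcher = SequenceMatcher(None, items[best_i], text)
            ops = [(i1, i2, text[j1:j2])
                   for tag, i1, i2, j1, j2 in matcher.get_opcodes()
                   if tag != "equal"]
            records.append(("delta", best_i, ops))
    return records


def restore(records, idx):
    """Rebuild the original text of item `idx` from the archive records."""
    record = records[idx]
    if record[0] == "full":
        return record[1]
    if record[0] == "ref":
        return restore(records, record[1])
    _, base_idx, ops = record
    base = restore(records, base_idx)
    out, pos = [], 0
    for i1, i2, replacement in ops:
        out.append(base[pos:i1])  # unchanged span copied from the base
        out.append(replacement)   # span that differs in this item
        pos = i2
    out.append(base[pos:])
    return "".join(out)
```

Calling `archive` on a list of strings and then `restore(records, i)` for each index should reproduce every input. Only fully stored items serve as delta bases here, which keeps restoration to a single delta step; a real archiver would likely replace the linear scan over earlier items with an index.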
