Systems and methods for efficient data searching, storage and reduction

  • US 8,275,755 B2
  • Filed: 03/19/2009
  • Issued: 09/25/2012
  • Est. Priority Date: 09/15/2004
  • Status: Active Grant
First Claim

1. A method enabling lossless data reduction comprising:

  • partitioning version data into:

    a) data corresponding to data already stored in a repository; and

    b) data not already stored in the repository;

    wherein the data already stored in the repository comprise a plurality of repository chunks, wherein the version data comprise a plurality of version chunks, the method further comprising:

    storing in an index a plurality of n repository distinguishing characteristics (RDCs) and a position in the repository of each of the plurality of repository chunks, where n is smaller than size m of the repository chunk, where m is a value representative of a number of bytes of the repository chunk; and

    for each version chunk:

    determining a plurality of k input distinguishing characteristics (IDCs) of the version chunk, where k is greater than or equal to n;

    determining whether a similar repository chunk exists based on a plurality of matching distinguishing characteristics in the version chunk and similar repository chunk, wherein the similarity determination includes searching for each of the k distinguishing characteristics of the version chunk in the index until at most n matches are found;

    determining that one or more similar repository chunks exist where the number of matches satisfies a threshold;

    determining differences between the version chunk and similar repository chunk by comparing full data of the respective chunks; and

    storing the differences in the repository.
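
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not say how distinguishing characteristics are computed, so this sketch assumes they are the largest hashes of fixed-size byte windows. The window size `SEED`, the BLAKE2b hash, the `difflib`-based delta, and every function name below are hypothetical choices made for illustration only.

```python
import difflib
import hashlib
from collections import Counter

SEED = 4  # assumed window size in bytes; not specified by the claim


def characteristics(chunk: bytes, count: int) -> list:
    """Assumed definition: the `count` largest 64-bit hashes of SEED-byte windows."""
    hashes = {
        int.from_bytes(
            hashlib.blake2b(chunk[i:i + SEED], digest_size=8).digest(), "big"
        )
        for i in range(len(chunk) - SEED + 1)
    }
    return sorted(hashes, reverse=True)[:count]


def build_index(repo_chunks: list, n: int) -> dict:
    """Store n RDCs per repository chunk, each mapped to the chunk's position."""
    index = {}
    for pos, chunk in enumerate(repo_chunks):
        for dc in characteristics(chunk, n):
            index.setdefault(dc, pos)
    return index


def find_similar(version_chunk: bytes, index: dict, k: int, n: int, threshold: int) -> list:
    """Look up the k IDCs, stop after n matches, keep chunks meeting the threshold."""
    votes = Counter()
    matches = 0
    for dc in characteristics(version_chunk, k):
        pos = index.get(dc)
        if pos is not None:
            votes[pos] += 1
            matches += 1
            if matches == n:  # search ends once n matches are found
                break
    return [pos for pos, c in votes.most_common() if c >= threshold]


def delta(old: bytes, new: bytes) -> list:
    """Compare the full data of both chunks; emit copy and insert operations."""
    ops = []
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))      # bytes already in the repository
        elif j2 > j1:
            ops.append(("insert", new[j1:j2]))  # bytes not yet stored
    return ops


def apply_delta(old: bytes, ops: list) -> bytes:
    """Rebuild the version chunk from the repository chunk plus the stored delta."""
    out = bytearray()
    for op in ops:
        out += old[op[1]:op[2]] if op[0] == "copy" else op[1]
    return bytes(out)


def reduce_version(version_chunks, repo_chunks, index, k, n, threshold) -> list:
    """Return, per version chunk, (similar chunk position or None, differences)."""
    stored = []
    for vc in version_chunks:
        similar = find_similar(vc, index, k, n, threshold)
        if similar:
            stored.append((similar[0], delta(repo_chunks[similar[0]], vc)))
        else:
            stored.append((None, [("insert", vc)]))  # nothing similar: store whole
    return stored
```

Because each stored record holds only copy references into a repository chunk plus the literal bytes that differ, `apply_delta` reproduces the version chunk exactly, which is what makes the reduction lossless. In this sketch the index holds only n hashes per chunk, far fewer than the m bytes of the chunk itself, matching the claim's requirement that n be smaller than m.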
