Methods and apparatus to compress datasets using proxies

  • US 7,587,401 B2
  • Filed: 03/10/2005
  • Issued: 09/08/2009
  • Est. Priority Date: 03/10/2005
  • Status: Expired due to Fees
First Claim

1. A method to reduce electronic storage space consumption comprising:

  • executing machine readable instructions within a computing device to:

    obtain a first block of a first computer readable file from a buffer in the computing device;

    compute a full proxy of the entire first block, a first half proxy for a first half of the first block, a second half proxy for a second half of the block, and a first quarter proxy, a second quarter proxy, a third quarter proxy, and a fourth quarter proxy for a first quarter of the first block, a second quarter of the first block, a third quarter of the first block, and a fourth quarter of the first block, respectively;

    compare the full proxy to a set of proxies representative of previously stored blocks, at least some of the previously stored blocks being from a second computer readable file, the second computer readable file being different from the first computer readable file;

    when the full proxy of the first block matches a first proxy in the set of proxies, to store a first data structure in the computing device that maps the entire first block to at least a portion of a first previously stored block associated with the first matching proxy without storing the first block;

    when the full proxy of the first block does not match any proxy in the set of proxies, to compare at least one of the first half proxy, the second half proxy, the first quarter proxy, the second quarter proxy, the third quarter proxy or the fourth quarter proxy to the set of proxies representative of previously stored blocks;

    when one of the first half proxy, the second half proxy, the first quarter proxy, the second quarter proxy, the third quarter proxy or the fourth quarter proxy matches a second proxy in the set of proxies, to store a second data structure in the computing device that maps the one of the first half proxy, the second half proxy, the first quarter proxy, the second quarter proxy, the third quarter proxy or the fourth quarter proxy to at least a portion of a second previously stored block associated with the second matching proxy without storing the portion of the first block corresponding to the one of the first half proxy, the second half proxy, the first quarter proxy, the second quarter proxy, the third quarter proxy or the fourth quarter proxy; and

    to obtain a second block of the first computer readable file, the first and second blocks being sequential, wherein when the full proxy for the first block matches a proxy in the set of proxies, the second block does not overlap with the first block, and when none of the full proxy, the first half proxy, the second half proxy, the first quarter proxy, the second quarter proxy, the third quarter proxy or the fourth quarter proxy of the first block matches a proxy in the set of proxies, the second block at least partially overlaps with the first block;

    wherein at least one of the first previously stored block and the second previously stored block is from the second computer readable file, and a first amount of storage space of the computing device used to store the first data structure, the second data structure, the first computer readable file and the second computer readable file by executing the machine readable instructions is less than a second amount of storage space of the computing device required to store the first and second computer readable files without executing the machine readable instructions.
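
The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation, only one possible reading under stated assumptions: SHA-1 stands in for the unspecified proxy function; the names `BLOCK`, `proxy`, `spans`, `build_index`, `compress` and `expand` are hypothetical; the set of proxies is modeled as a dictionary from proxy to the previously stored bytes; matching proxies are assumed to mean matching content (collisions are ignored); and after a partial match the next block is assumed to start just past the matched portion, a detail the claim leaves open.

```python
import hashlib

BLOCK = 8  # hypothetical block size in bytes; assumed divisible by 4


def proxy(data: bytes) -> str:
    """Stand-in proxy: a SHA-1 digest (the claim does not name a function)."""
    return hashlib.sha1(data).hexdigest()


def spans(block: int = BLOCK):
    """Offsets of the full block, its two halves, and its four quarters."""
    h, q = block // 2, block // 4
    return [(0, block), (0, h), (h, block)] + [(i * q, (i + 1) * q) for i in range(4)]


def build_index(stored: bytes, block: int = BLOCK) -> dict:
    """Assumed index: proxies of previously stored blocks and their portions."""
    index = {}
    for base in range(0, len(stored) - block + 1, block):
        chunk = stored[base:base + block]
        for a, b in spans(block):
            index.setdefault(proxy(chunk[a:b]), chunk[a:b])
    return index


def compress(data: bytes, index: dict, block: int = BLOCK, step: int = 1) -> list:
    """Return ('ref', proxy) and ('raw', bytes) records describing data."""
    out, pos, pending = [], 0, 0  # pending = start of bytes not yet emitted
    while pos + block <= len(data):
        chunk = data[pos:pos + block]
        hit = None
        # Full proxy is tried first, then the halves, then the quarters.
        for a, b in spans(block):
            p = proxy(chunk[a:b])
            if p in index:
                hit = (a, b, p)
                break
        if hit is None:
            pos += step  # no match: the next block overlaps this one
            continue
        a, b, p = hit
        if pos + a > pending:
            out.append(("raw", data[pending:pos + a]))  # unmatched bytes
        out.append(("ref", p))  # mapping record instead of the data itself
        pos += b  # a full match (b == block) gives a non-overlapping next block
        pending = pos
    if pending < len(data):
        out.append(("raw", data[pending:]))
    return out


def expand(records: list, index: dict) -> bytes:
    """Rebuild the original bytes from the records and the index."""
    return b"".join(index[v] if kind == "ref" else v for kind, v in records)
```

For example, with `index = build_index(b"ABCDEFGH12345678")`, calling `compress(b"ABCDEFGH", index)` yields a single `('ref', ...)` record, while an input that shares nothing with the stored data comes back as one `('raw', ...)` record. In both cases `expand` restores the input.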
