Method and system for compressing data

  • US 8,463,759 B2
  • Filed: 02/12/2008
  • Issued: 06/11/2013
  • Est. Priority Date: 09/24/2007
  • Status: Active Grant
First Claim

1. A method for compressing data, comprising:

    receiving at least one data string, the at least one data string comprising a plurality of substrings, each substring comprising a plurality of characters;

    identifying a first substring in the at least one data string, the first substring comprising a plurality of characters that are the same as a plurality of characters of a second substring in the at least one data string;

    generating, by one or more processors, a refer-back token associated with the second substring, the refer-back token indicating a position of the first substring within a token string, the token string being a compressed version of at least a portion of the at least one data string, the position indicated by the refer-back token expressed as an offset to a position of the refer-back token in the token string, the refer-back token further indicating a length of the first substring within the token string, the refer-back token including a header that specifies a number of bits used to store the offset expressed by the position indicated by the refer-back token;

    placing the first substring and the refer-back token into the token string, the token string allowing the second substring to be reconstructed by accessing the refer-back token, moving to the position in the token string that is indicated by the refer-back token, and reading an amount of data according to the length indicated by the refer-back token;

    identifying a third substring in the at least one data string, the third substring comprising a plurality of characters that are the same as a plurality of characters of a fourth substring in the at least one data string;

    generating a second refer-back token associated with the third substring, the second refer-back token indicating a position of the fourth substring within the token string, the position indicated by the second refer-back token expressed as an offset to a position of the second refer-back token in the token string, the second refer-back token further indicating a length of the fourth substring within the token string, the second refer-back token including a second header that specifies a second number of bits used to store the offset expressed by the position indicated by the second refer-back token, the second number of bits different from the first number of bits; and

    placing the fourth substring and the second refer-back token into the token string, the token string allowing the third substring to be reconstructed by accessing the second refer-back token, moving to the position in the token string that is indicated by the second refer-back token, and reading an amount of data according to the length indicated by the second refer-back token.
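
The token layout recited in the claim (a header giving the number of bits used for the offset, then the offset, then a length) can be illustrated with a minimal sketch. This is not the patent's actual encoding: the 2-bit header, the table of offset widths, the 4-bit length field, the `RefBack` class, and all function names are assumptions made for illustration. The sketch also assumes that a refer-back token points only at literal characters and that offsets are counted in token-string slots back from the token's own position.

```python
from dataclasses import dataclass

# All field sizes below are illustrative assumptions, not taken from the patent.
HEADER_BITS = 2                    # header selects one of four offset widths
OFFSET_WIDTHS = (4, 8, 12, 16)     # offset field width (bits) per header code
LENGTH_BITS = 4                    # fixed-width length field


@dataclass(frozen=True)
class RefBack:
    """Hypothetical refer-back token, stored as a bit string."""
    bits: str


def encode_refer_back(offset: int, length: int) -> str:
    """Pack header | offset | length, using the narrowest offset width that fits."""
    if not 0 < length < (1 << LENGTH_BITS):
        raise ValueError("length does not fit in the length field")
    for code, width in enumerate(OFFSET_WIDTHS):
        if 0 < offset < (1 << width):
            break
    else:
        raise ValueError("offset does not fit in any supported width")
    return (format(code, f"0{HEADER_BITS}b")
            + format(offset, f"0{width}b")
            + format(length, f"0{LENGTH_BITS}b"))


def decode_refer_back(bits: str):
    """Return (offset, length, total_bits_consumed)."""
    code = int(bits[:HEADER_BITS], 2)
    width = OFFSET_WIDTHS[code]          # the header tells us how wide the offset is
    end_offset = HEADER_BITS + width
    offset = int(bits[HEADER_BITS:end_offset], 2)
    length = int(bits[end_offset:end_offset + LENGTH_BITS], 2)
    return offset, length, end_offset + LENGTH_BITS


def reconstruct(tokens: list) -> str:
    """Expand a token string made of literal characters and RefBack tokens."""
    out = []
    for pos, tok in enumerate(tokens):
        if isinstance(tok, RefBack):
            offset, length, _ = decode_refer_back(tok.bits)
            start = pos - offset         # offset is relative to the token's position
            if start < 0:
                raise ValueError("refer-back token points before the token string")
            # Assumption: the referenced slots hold literal characters only.
            out.extend(tokens[start:start + length])
        else:
            out.append(tok)
    return "".join(out)


# Two tokens whose headers select different offset widths (4 bits vs. 8 bits).
short_token = encode_refer_back(5, 3)
long_token = encode_refer_back(200, 4)
print(short_token, long_token)
print(reconstruct(list("abcde") + [RefBack(short_token)]))
```

In this sketch a nearby repeat costs 10 bits and a more distant one costs 14, which is the practical point of letting each token's header declare its own offset width.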
