Data compression/decompression method and apparatus
First Claim
1. A system for processing digital input data, said input data being divisible into strings of bits representing symbols, comprising:
- a memory for holding at least first and second strings of symbols from said input data, said first string having a prefix substring of a fixed length;
means for receiving said prefix substring and computing a first string hash value therefrom;
a hash table for receiving said first string hash value and providing one pointer to each of a number of locations in said second string having an associated second string substring of said fixed length with a second string hash value which matches said first string hash value;
means for receiving said first and second strings of symbols from said memory;
means for determining the longest second string substring match of a first string substring including said first string prefix substring, said means for determining including means for finding a second string substring match of a certain length pointed to by a first pointer, and means for performing a symbol comparison of said first string substring with a second string substring pointed to by a second pointer, beginning with a symbol being at least one symbol beyond said certain length; and
means for generating compressed output data responsive to said means for determining.
6 Assignments
0 Petitions
Accused Products
Abstract
A method and apparatus for compressing digital data uses data which has been previously compressed as a dictionary of substrings which may be replaced in an input data stream. The method and apparatus uses a hash table to take advantage of principles of locality and probability to solve the maximal matching substring problem inherent in this type of compressing apparatus, most of the time. The hash table consists of first-in, first-out (FIFO) collision chains of fixed, uniform numbers of pointers to substrings of data already compressed which potentially match an input substring. A link list is maintained for linking pointers to corresponding potentially matching strings. A companion decompressing method and apparatus receives compressed data from the compressing apparatus and expand that data back to its original form.
-
Citations
22 Claims
-
1. A system for processing digital input data, said input data being divisible into strings of bits representing symbols, comprising:
-
a memory for holding at least first and second strings of symbols from said input data, said first string having a prefix substring of a fixed length; means for receiving said prefix substring and computing a first string hash value therefrom; a hash table for receiving said first string hash value and providing one pointer to each of a number of locations in said second string having an associated second string substring of said fixed length with a second string hash value which matches said first string hash value; means for receiving said first and second strings of symbols from said memory; means for determining the longest second string substring match of a first string substring including said first string prefix substring, said means for determining including means for finding a second string substring match of a certain length pointed to by a first pointer, and means for performing a symbol comparison of said first string substring with a second string substring pointed to by a second pointer, beginning with a symbol being at least one symbol beyond said certain length; and means for generating compressed output data responsive to said means for determining. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
-
-
10. A system for processing digital input data, said input data being divisible into strings of bits representing symbols, comprising:
-
a memory for holding at least first and second strings of symbols from said input data, said first string having a prefix substring of a fixed length; means for receiving said prefix substring and computing a first string hash value therefrom; a hash table for receiving said first string hash value and providing a pointer to a location in said second string having an associated second string substring of said fixed length with the same hash value; a link list including a link between each pointer which points to a location having an associated second string substring of said fixed length with a same second string hash value; means for receiving said first and second strings of symbols from said memory; means for determining the longest second string substring match of a first string substring including said first string prefix substring, said means for determining including means for comparing said first string substring with a selected limited number of second string substrings, a fixed length of which has a same hash value as that of said first string hash value; and means for generating compressed output data responsive to said means for determining. - View Dependent Claims (11, 12, 13, 14, 15, 16, 17, 18)
-
-
19. A method for processing digital input data, said input data being divisible into symbols, comprising the steps of:
-
receiving input data, including a first string having a prefix substring of a fixed length being less than a length of said first string; computing a hash value of said prefix substring; storing pointers in a hash table to each location in a second string; generating a link list for linking pointers together in said hash table which point to locations having fixed lengths which have a same hash value; comparing said first string with a selected limited number of substrings in said second string having linked pointers and fixed length hash values which match that of said prefix substring hash value; and generating compressed output data based upon said comparison. - View Dependent Claims (20, 21, 22)
-
Specification