Method and apparatus for data compression having an improved matching algorithm which utilizes a parallel hashing technique
First Claim
1. A matching method useful for digital data compression operations which compresses an input data stream into a compressed output data stream based on a predetermined minimum matching length, where the input data stream comprises a plurality of input data subblocks including subblocks of N bytes and N+1 bytes, and the compressed output data stream comprises a plurality of compressed and uncompressed output data subblocks, the matching method comprising the following steps:
- a. initializing a first hash table and a second hash table each having a plurality of entries;
b. obtaining a current source pointer for a current input data subblock;
c. computing an address for said first hash table on an input data subblock of N bytes, where N is an integer;
d. exchanging said current source pointer with an entry of said first hash table at said address computed in step (c);
e. saving said entry of said first hash table;
f. computing an address for said second hash table on an input data subblock of N+1 bytes;
g. exchanging said current source pointer with an entry of said second hash table at said address computed in step (f);
h. determining a matching length for said entry of said second hash table obtained in step (g);
i. if said matching length determined in step (h) is not less than said predetermined minimum matching length, then encoding said current input data subblock as an compressed output data subblock;
j. if said matching length determined in step (h) is less than said predetermined minimum matching length, then determining a matching length for said entry of said first hash table saved in step (e);
k. if said matching length determined in step (j) is not less than said predetermined minimum matching length, then encoding said current input data subblock as a compressed output data subblock;
l. if said matching length determined in step (j) is less than said predetermined minimum matching length, then encoding said current input data subblock as an incompressible output data subblock; and
m. repeating the steps (b) through (l) until all of said plurality of input data subblocks have been processed.
4 Assignments
0 Petitions
Accused Products
Abstract
A method and apparatus for digital data compression having an improved matching algorithm which utilizes a parallel hashing technique. The matching algorithm of the present invention data compression method can (a) perform a first hash computation on data string subblocks of N bytes and save the hash table value; (b) perform a second hash computation on a data string subblock of N+1 bytes by using the hash result from (a) which is hashed on the data string subblocks of N bytes; (c) perform a first hash matching on data string subblocks of N+1 bytes; and (d) when the first hash matching on the data string subblock of N+1 bytes found no match, perform a second subsequent parallel hash matching on data string subblocks of N bytes.
-
Citations
20 Claims
-
1. A matching method useful for digital data compression operations which compresses an input data stream into a compressed output data stream based on a predetermined minimum matching length, where the input data stream comprises a plurality of input data subblocks including subblocks of N bytes and N+1 bytes, and the compressed output data stream comprises a plurality of compressed and uncompressed output data subblocks, the matching method comprising the following steps:
-
a. initializing a first hash table and a second hash table each having a plurality of entries; b. obtaining a current source pointer for a current input data subblock; c. computing an address for said first hash table on an input data subblock of N bytes, where N is an integer; d. exchanging said current source pointer with an entry of said first hash table at said address computed in step (c); e. saving said entry of said first hash table; f. computing an address for said second hash table on an input data subblock of N+1 bytes; g. exchanging said current source pointer with an entry of said second hash table at said address computed in step (f); h. determining a matching length for said entry of said second hash table obtained in step (g); i. if said matching length determined in step (h) is not less than said predetermined minimum matching length, then encoding said current input data subblock as an compressed output data subblock; j. if said matching length determined in step (h) is less than said predetermined minimum matching length, then determining a matching length for said entry of said first hash table saved in step (e); k. if said matching length determined in step (j) is not less than said predetermined minimum matching length, then encoding said current input data subblock as a compressed output data subblock; l. if said matching length determined in step (j) is less than said predetermined minimum matching length, then encoding said current input data subblock as an incompressible output data subblock; and m. repeating the steps (b) through (l) until all of said plurality of input data subblocks have been processed. - View Dependent Claims (2)
-
-
3. A matching method useful for digital data compression operations which compresses an input data stream into a compressed output data stream, comprising the following steps:
-
a. obtaining a current source pointer for a current input data subblock and computing an address for a first hash table on an input data subblock of N bytes, where N is an integer; b. exchanging said current source pointer with an entry of said first hash table at said address computed in step (a) and saving said entry of said first hash table; c. computing an address for a second hash table on an input data subblock of N+1 bytes; d. exchanging said current source pointer with an entry of said second hash table at said address computed in step (c); e. determining a matching length for said entry of said second hash table obtained in step (d); f. if said matching length determined in step (e) is less than a predetermined minimum matching length, then determining a subsequent matching length for said entry of said first hash table saved in step (b); and g. encoding said current input data subblock as a compressed output data subblock if said subsequent matching length is greater than or equal to said predetermined minimum matching length. - View Dependent Claims (4, 5, 6, 7)
-
-
8. A matching method useful for digital data compression operations which compresses an input data stream into a compressed output data stream based on a predetermined minimum matching length, where the input data stream comprises a plurality of input data subblocks including subblocks of 3 bytes and 4 bytes, and the compressed output data stream comprises a plurality of compressed and incompressed output data subblocks, the matching method comprising the following steps:
-
a. initializing a first hash table and a second hash table each having a plurality of entries and initializing a source pointer; b. obtaining a current source pointer for a current input data subblock; c. computing an address for said first hash table on an input data subblock of 3 bytes; d. exchanging said current source pointer with an entry of said first hash table at said address computed in step (c); e. saving said entry of said first hash table; f. computing an address for said second hash table on an input data subblock of 4 bytes; g. exchanging said current source pointer with an entry of said second hash table at said address computed in step (f); h. determining a matching length for said entry of said second hash table obtained in step (g); i. if said matching length determined in step (h) is not less than said predetermined minimum matching length, then encoding said current input data subblock as an compressed output data subblock; j. if said matching length determined in step (h) is less than said predetermined minimum matching length, then determining a matching length for said entry of said first hash table saved in step (e); k. if said matching length determined in step (j) is not less than said predetermined minimum matching length, then encoding said current input data subblock as a compressed output data subblock; l. if said matching length determined in step (j) is less than said predetermined minimum matching length, then encoding said current input data subblock as an incompressible output data subblock; and m. repeating the steps (b) through (l) until all of said plurality of input data subblocks have been processed. - View Dependent Claims (9)
-
-
10. A matching method useful for digital data compression operations which compresses an input data stream into a compressed output data stream, where the input data stream comprises a plurality of input data subblocks including subblocks of N bytes and subblocks of N+1 bytes, the matching method comprising the following steps:
-
a. initializing a first hash table and a second hash table each having a plurality of entries; b. obtaining a current source pointer for a current input data subblock; c. hashing on the input data subblocks of N bytes and saving hashing entries, where N is an integer; d. hashing on the input data subblocks of N+1 bytes based on said hashing results of the input data subblocks of N bytes computed in step (c) and saving hashing entries; e. exchanging the current source pointer with the contents of a first source pointer at the hash entry of step (c) and a second source pointer at the hash entry of step (d), and saving the respective first and second hash table source pointers; f. comparing a current input data subblock with data at said first saved source pointer obtained in step (e); g. if matching is found then compressing said current input data subblock, and if no matching is found then comparing said current input data subblock with data at said second saved source pointer in step (e); and h. compressing said current input data subblock if matching is found during said comparing in step (g). - View Dependent Claims (11, 12, 13)
-
-
14. An apparatus useful for digital data compression operations which compresses an input data stream into a compressed output data stream based on a predetermined minimum matching length, where the input data stream comprises a plurality of input data subblocks including subblocks of N bytes and N+1 bytes, and the compressed output data stream comprises a plurality of compressed and incompressed output data subblocks, the apparatus comprising:
-
a. means for initializing a first hash table and a second hash table each having a plurality of entries; b. means for obtaining a current source pointer for a current input data subblock; c. means for computing an address for said first hash table on an input data subblock of N bytes, where N is an integer; d. means for exchanging said current source pointer with an entry of said first hash table at said address computed in step (c); e. means for saving said entry of said first hash table; f. means for computing an address for said second hash table on an input data subblock of N+1 bytes; g. means for exchanging said current source pointer with an entry of said second hash table at said address computed in step (f); h. means for determining a matching length for said entry of said second hash table obtained in step (g); i. means for encoding said current input data subblock as an compressed output data subblock when said matching length determined in step (h) is not less than said predetermined minimum matching length; j. means for determining a matching length for said entry of said first hash table saved in step (e) when said matching length determined in step (h) is less than said predetermined minimum matching length; k. means for encoding said current input data subblock as a compressed output data subblock when said matching length determined in step (j) is not less than said predetermined minimum matching length; l. means for encoding said current input data subblock as an incompressible output data subblock when said matching length determined in step (j) is less than said predetermined minimum matching length; and m. means for repeating the steps (b) through (l) until all of said plurality of input data subblocks have been processed. - View Dependent Claims (15)
-
-
16. An apparatus useful for digital data compression operations which compresses an input data stream into a compressed output data stream, where the input data stream comprises a plurality of input data subblocks including subblocks of N bytes and subblocks of N+1 bytes, the apparatus comprising:
-
a. means for initializing a first hash table and means for initializing a second hash table, each having a plurality of entries; b. means for obtaining a current source pointer for a current input data subblock; c. means for hashing on the input data subblocks of N bytes and means for saving hashing entries, where N is an integer; d. means for hashing on the input data subblocks of N+1 bytes based on said hashing results of the input data subblocks of N bytes computed in step (a) and means for saving hashing entries; e. means for exchanging the current source pointer with the contents of a first source pointer at the hash entry of (c) and a second source pointer at the hash entry of (d), and means for saving the first and second source pointers; f. means for comparing a current input data subblock with data at first saved source pointer obtained in (e); g. means for compressing said current input data subblock when matching is found, and means for comparing said current input data subblock with data at second source pointer saved in step (e) when no matching is found; and h. means for compressing said current input data subblock when matching is found during said comparing in step (g). - View Dependent Claims (17, 18)
-
-
19. A data compression method for compressing an input data stream into a compressed output data stream, where the input data stream comprises a plurality of input data subblocks including subblocks of N bytes and subblocks of N+1 bytes, the method comprising the following steps:
-
a. initializing a first hash table and a second hash table each having a plurality of entries; b. obtaining a current source pointer for a current input data subblock; c. hashing on the input data subblocks of N bytes, hashing on the input data subblocks of N+1 bytes based on the hashing on the input data subblocks of N bytes, saving respective hashing entries, exchanging the current source pointer with the contents of a source pointer at each of the respective first and second hash tables, saving the hash table source pointers, and comparing a current input data subblock with data at said saved source pointer obtained from hashing the input data subblocks of N+1 bytes; d. if no matching is found in step (c) then comparing said current input data subblock with data at saved source pointer obtained from hashing the input data subblocks of N bytes; and e. encoding said current input data subblock as incompressible output data subblock if no matching being found in step (d), and encoding said current input data subblock as compressed output data subblock if matching being found in either step (c) or (b) (d).
-
-
20. A data compression apparatus for compressing an input data stream into a compressed output data stream, where the input data stream comprises a plurality of input data subblocks including subblocks of N bytes and subblocks of N+1 bytes, the apparatus comprising:
-
a. means for initializing a first hash table and means for initializing a second hash table, each having a plurality of entries; b. means for obtaining a current source pointer for a current input data subblock; c. means for hashing on the input data subblocks of N bytes, hashing on the input data subblocks of N+1 bytes based on the hashing on the input data subblocks of N bytes, saving respective hashing entries, means for exchanging the current source pointer with the contents of a source pointer at each of the respective hash tables, means for saving the hash table source pointers and means for comparing a current input data subblock with data at said saved source pointer obtained from hashing the input data subblocks of N+1 bytes; d. means for comparing said current input data subblock with data at said saved source pointer obtained from hashing the input data subblocks of N bytes when no matching is found in step (c); and e. means for encoding said current input data subblock as incompressible output data subblock if no matching is found in step (d), and means for encoding said current input data subblock as compressed output data subblock if matching is found in either steps (c) or (d).
-
Specification