Method and apparatus for enhanced decompressor parsing
First Claim
Patent Images
1. A data compression/decompression system comprising:
- a compression engine for receiving uncompressed information units comprising raw characters and working strings and compressing them into compressed information units, said compressor engine having an associated dictionary table including entries comprising raw characters and reference strings, each reference string representing matched sequence of two or more bytes of data from an input working string; and
, a decompression engine for receiving compressed information units and generating said uncompressed information units in decompression engine cycles, said decompression engine including a parser device for extracting consecutive data phrases of compressed information units, each data phrase comprising one of a predetermined set of characters, compressed strings or combinations thereof, wherein a data phrase includes a variable bit-length compression code for representing a matched sequence of bytes defining a compressed string, each variable bit-length compression code including a primary length field and a secondary length field for defining a string'"'"'s length, and a position length field for encoding a starting address of the reference string in said dictionary to which a starting position of a working string refers, said parser device parsing said primary length fields and secondary length fields in different decompression engine cycles to enhance parser device throughput.
1 Assignment
0 Petitions
Accused Products
Abstract
In a high-speed computer system using multiple compression and decompression engines, a method and apparatus for coding and parsing compressed data in the decompressor in order to avoid bottlenecks within the decompressor that prevent it from achieving optimum latency and throughput acceptable to the system processor.
-
Citations
46 Claims
-
1. A data compression/decompression system comprising:
-
a compression engine for receiving uncompressed information units comprising raw characters and working strings and compressing them into compressed information units, said compressor engine having an associated dictionary table including entries comprising raw characters and reference strings, each reference string representing matched sequence of two or more bytes of data from an input working string; and
,a decompression engine for receiving compressed information units and generating said uncompressed information units in decompression engine cycles, said decompression engine including a parser device for extracting consecutive data phrases of compressed information units, each data phrase comprising one of a predetermined set of characters, compressed strings or combinations thereof, wherein a data phrase includes a variable bit-length compression code for representing a matched sequence of bytes defining a compressed string, each variable bit-length compression code including a primary length field and a secondary length field for defining a string'"'"'s length, and a position length field for encoding a starting address of the reference string in said dictionary to which a starting position of a working string refers, said parser device parsing said primary length fields and secondary length fields in different decompression engine cycles to enhance parser device throughput. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
-
-
18. A data decompression system for decompressing information units received in compressed form, said data decompression system comprising:
-
a parser device for extracting consecutive data phrases of compressed information units at a parser device engine rate, each data phrase comprising one of a predetermined set of characters, compressed strings or combinations thereof;
a tokenizer device associated with said parser device for receiving said data phrases and converting each data phrase into fixed bit-length tokens of shortened bit-length, each token including an indicator for identifying its corresponding phrase as comprising one of a raw character or a string; and
,a decompression engine for receiving said tokens and generating corresponding uncompressed information units for output thereof at a predetermined data bus rate, whereby said parser device, tokenizer device and decompression engine operate at engine rates equal to or greater than said predetermined data bus rate for maximizing decompression throughput and reducing decompression time. - View Dependent Claims (19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33)
-
-
34. A method for decompressing compressed data information units generated by a data compression engine, said method comprising the steps of:
-
a) parsing said compressed information units and extracting consecutive data phrases at a parsing engine rate, each data phrase comprising one of a predetermined set of characters, compressed strings or combinations thereof;
b) converting each extracted data phrase into fixed bit-length tokens of shortened bit-length, each said token including an indicator for identifying its corresponding phrase as comprising one of a compressed raw character or a compressed string;
c) implementing a decompression engine for receiving said fixed bit-length tokens; and
,d) processing said tokens at a decompression engine rate for outputting corresponding uncompressed information units at a predetermined data bus rate, wherein said parsing engine rate, tokenizing rate and decompression engine rate are equal to or greater than said predetermined data bus rate for maximizing decompression throughput and reducing decompression time. - View Dependent Claims (35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46)
implementing a cooperative dictionary table associated with said compression engine; and
,forming entries in said dictionary table comprising raw characters and reference strings, each reference string representing matched sequence of two or more bytes of data from an incoming working string of uncompressed information units to be compressed.
-
-
36. The method as claimed in claim 35, wherein a data phrase comprises either one of a compressed code type data of fixed bit-length for representing a compressed character, and a compressed code type data of variable bit-length type for representing a sequence of bytes defining a compressed string, said method including representing a string'"'"'s length in each variable bit-length type as a primary length field and a secondary length field and a position length field for encoding a starting address of the reference string in said dictionary to which a starting position of a working string refers.
-
37. The method as claimed in claim 36, wherein said representing step comprises the steps of:
-
determining statistically probable matching strings comprising three bytes of less, and forming a corresponding data phrase for said strings to include a primary length field comprising data of minimum bit length and a position length field, whereby said parsing step for such statistically probable matching strings is performed in a single decompression engine cycle.
-
-
38. The method as claimed in claim 36, wherein for matching strings comprising more than three bytes, the step of forming a corresponding data phrase for said strings to include a primary length field, secondary length fields and a position length field, whereby said parsing step includes parsing said primary length fields and secondary length fields of a data phrase in different decompression engine cycles.
-
39. The method as claimed in claim 37, wherein said minimum bit length comprises a total of two bits for identifying said string length code.
-
40. The method as claimed in claim 37, further said parsing step includes:
- differentiating a data phrase of compressed code type data or compressed string code type according to a single flag bit.
-
41. The method as claimed in claim 36, wherein said converting step b) includes converting a data phrase representing a string of two or more bytes into consecutive position tokens, said position tokens based on a string position and length fields.
-
42. The method as claimed in claim 41, wherein said converting step b) includes providing an identifier in each token for enabling a decompression engine to distinguish between consecutive position tokens belonging to a single stream, and consecutive position tokens belonging to different streams.
-
43. The method as claimed in claim 42, wherein prior to step d) the step of temporarily storing generated tokens before transferring them to said decompressor engine, whereby said continued parsing steps are enabled during occurrence of a decompressor stall cycles.
-
44. The method as claimed in claim 43, wherein each data phrase of predetermined set of characters, compressed strings or combinations thereof optimally meet and sustain predetermined data bus rate requirements.
-
45. The method as claimed in claim 44, wherein a first optimum set of data phrases includes phrases that are able to be converted into two or more tokens per decompression engine cycle.
-
46. The method as claimed in claim 45, wherein a second optimal set of data phrases includes phrases representing two raw characters, a single raw character, a string'"'"'s primary length field, and a string'"'"'s secondary length field.
Specification