Textual substitution data compression with finite length search windows
First Claim
Patent Images
1. In a textual substitution data compressor, the improvement comprisinga finite, substantially fixed length search window for storing recently compressed data symbols;
- a search tree data structure having a predetermined cut-off depth for linking said recently compressed symbols to each other in accordance with the order of their occurrence and for tracking said recently compressed symbols as they advance across said search window, the cut-off depth of said search tree being selected based on a maximum permissible copy length;
means for inserting uncompressed symbols into said search tree for identifying prior occurrences of reoccurring symbol strings and for determining the string length and position within said search window of the longest of said prior occurrences for each of said reoccurring symbol strings; and
means for providing copy codewords and literal codewords with a variable number of appended literal symbols for encoding said uncompressed symbols depending on whether said uncompressed symbols define or fail to define, respectively, reoccurring symbol strings of minimum length;
each of said copy codewords identifying the length and the search window position of the prior occurrence of the reoccurring symbol string it represents; and
each of said literal codewords identifying the number of literal symbols that are appended to it.
4 Assignments
0 Petitions
Accused Products
Abstract
In accordance with the present invention source data is encoded by literal codewords of varying length value, with or without the encoding of copy codewords of varying length and displacement value. Copy codeword encoding is central to textual substitution-style data compression, but the encoding of variable length literals may be employed for other types of data compression.
179 Citations
7 Claims
-
1. In a textual substitution data compressor, the improvement comprising
a finite, substantially fixed length search window for storing recently compressed data symbols; -
a search tree data structure having a predetermined cut-off depth for linking said recently compressed symbols to each other in accordance with the order of their occurrence and for tracking said recently compressed symbols as they advance across said search window, the cut-off depth of said search tree being selected based on a maximum permissible copy length; means for inserting uncompressed symbols into said search tree for identifying prior occurrences of reoccurring symbol strings and for determining the string length and position within said search window of the longest of said prior occurrences for each of said reoccurring symbol strings; and means for providing copy codewords and literal codewords with a variable number of appended literal symbols for encoding said uncompressed symbols depending on whether said uncompressed symbols define or fail to define, respectively, reoccurring symbol strings of minimum length;
each of said copy codewords identifying the length and the search window position of the prior occurrence of the reoccurring symbol string it represents; and
each of said literal codewords identifying the number of literal symbols that are appended to it. - View Dependent Claims (2, 3, 4, 5, 6, 7)
-
Specification