Data compression method
First Claim
Patent Images
1. A method for data compression of individual sequences or strings of symbols arranged in a data stream, comprising the steps of:
- initializing a dictionary consisting of a set of strings with an index for each of said strings and including all possible strings of length l;
setting a current input position at the beginning of said data stream and repeating the following steps until the data stream to be compressed is exhausted;
determining a longest string S in said dictionary which matches a current string in the data stream starting from the current input position;
generating an identifier I for S consisting of an encoding of the index associated with said longest matched string S;
advancing the current input position to immediately after said current string in the data stream;
modifying said dictionary based on the preceding longest matched string S, the immediately succeeding symbols in the next string in the data stream, and the sequence of previously matched strings;
transmitting I to a utilization device; and
decoding I at said utilization device to recover said string S.
0 Assignments
0 Petitions
Accused Products
Abstract
Communications between a Host Computing System and a number of remote terminals is enhanced by a data compression method which modifies the data compression method of Lempel and Ziv by addition of new character and new string extensions to improve the compression ratio, and deletion of a least recently used routine to limit the encoding tables to a fixed size to significantly improve data transmission efficiency.
263 Citations
18 Claims
-
1. A method for data compression of individual sequences or strings of symbols arranged in a data stream, comprising the steps of:
-
initializing a dictionary consisting of a set of strings with an index for each of said strings and including all possible strings of length l; setting a current input position at the beginning of said data stream and repeating the following steps until the data stream to be compressed is exhausted; determining a longest string S in said dictionary which matches a current string in the data stream starting from the current input position; generating an identifier I for S consisting of an encoding of the index associated with said longest matched string S; advancing the current input position to immediately after said current string in the data stream; modifying said dictionary based on the preceding longest matched string S, the immediately succeeding symbols in the next string in the data stream, and the sequence of previously matched strings; transmitting I to a utilization device; and decoding I at said utilization device to recover said string S. - View Dependent Claims (6, 7, 8, 9)
-
-
2. A method for data compression of individual sequences or strings of characters in a data stream comprising the steps of:
-
initializing a set of strings into a dictionary consisting of n strings each with an identifier and including all possible strings of length l; setting a current input position at the beginning of said data stream; determining the longest string S in the dictionary which matches the current string of characters of the data stream starting at the current input position; finding the identifier I for S; transmitting I to a utilization device; decoding I at said utilization device to recover said string S; adding a new string S'"'"' to said dictionary where S'"'"' comprises a concatenation of a previous string match and said current string match; generating and assigning an identifier I'"'"' to string S'"'"'; advancing the current input position to a next character in said stream following the current matched string; and repeating the above determining, finding, transmitting, decoding, adding, generating and assigning, and advancing steps for a next string until the data stream to be compressed is exhausted. - View Dependent Claims (3)
-
-
4. A method for creating a dynamic dictionary of fixed size to be used in achieving data compression of individual sequences or strings of symbols in a data stream, comprising the steps of:
-
initializing a set of strings to consist of n sequences of symbols including all possible strings of length l; providing a dictionary of fixed size in storage containing said initialized set of strings each with an identifier; determining a longest string S of the set which matches a current string of the data stream to be compressed; testing said dictionary of fixed size in storage containing said set of strings for an empty slot to store a new matched string; deleting a least recently used string from said dictionary, if no empty slot is found, to create an empty slot; and assigning a slot identifier j to said empty slot found or created from the above steps of testing and deleting to identify a new matched string stored therein.
-
-
5. A method for data compression of individual sequences in a data stream, comprising the steps of:
- initializing a set of strings to consist of n sequences;
determining a longest string S of the set which matches a current string;
generating an identifier I for S;
transmitting I to a utilization device;
testing a dictionary in storage containing said set of strings for an empty slot;
deleting a least recently used string from said dictionary if no empty slot is found to create an empty slot;
assigning slot identifier j to said empty slot found or created from the above steps of testing and deleting;
adding a new string S'"'"' to said set where S'"'"' comprises a concatenation of a previous string match and said current string match;
assigning an identifier k to string S'"'"';
advancing the input position to a next character in said stream;
outputting an identifier m to indicate a match; and
repeating the above steps for a next string.
- initializing a set of strings to consist of n sequences;
-
10. A system for data compression of individual sequences or strings of symbols arranged in a data stream, comprising:
-
means for initializing a dictionary consisting of a set of strings with an index for each of said strings and including all possible strings of length l; means for setting a current input position at the beginning of said data stream; means for determining the longest string S in said dictionary which matches a current string in the data stream starting from the current input position; means for generating an identifier I for S consisting of an encoding of the index associated with said longest matched string S; means for advancing the current input position to immediately after said current string in the data stream; means for modifying said dictionary based on the preceeding longest matched string S, the immediately succeeding symbols in the next string in the data stream, and the sequence of previously matched strings; means for transmitting I to a utilization device; means for decoding I at said utilization device to recover said string S; and means for repeatedly activating said determining, generating, advancing, modifying, transmitting, and decoding means until the data stream to be compressed is exhausted. - View Dependent Claims (11, 12, 13, 14)
-
-
15. A system for data compression of individual sequences or strings of characters in a data stream, comprising:
-
means for initializing a set of strings into a dictionary consisting of n strings each with an identifier and including all possible strings of length l; means for setting a current input position at the beginning of said data stream; means for determining the longest string S in the dictionary which matches the current string of characters of the data stream starting at the current input position; means for finding the identifier I for S; means for transmitting I to a utilization device; means for decoding I at said utilization device to recover said string S; means for adding a new string S'"'"' to said dictionary, where S'"'"' comprises a concatenation of said current string match and at least one of a previous string match and an immediately succeeding character in said data stream; means for generating and assigning an identifier I'"'"' to string S'"'"'; means for advancing the current input position to a next character in said stream following the current matched string; means for repeatedly actuating said determining, finding, transmitting, decoding, adding, generating and assigning, and advancing means to operate on a next string until the data stream to be compressed is exhausted. - View Dependent Claims (16)
-
-
17. A system for creating a dynamic dictionary of fixed size to be used in achieving data compression of individual sequences or strings of symbols in a data stream, comprising:
-
means for initializing a set of strings to consist of n sequences of symbols including all possible strings of length l; means for determining a longest string S of the set which matches a current string of the data stream to be compressed; means for storing a dictionary of fixed size containing said intialized set of strings each with an identifier; means for testing said dictionary for an empty slot to store a new matched string; means for deleting a least recently used string from said dictionary, if no empty slot is found, to create an empty slot; and means for assigning a slot identifier j to said empty slot found or created following the operation of said testing and deleting means to identify a new matched string stored therein.
-
-
18. A system for data compression of individual sequences or strings in a data stream, comprising:
-
means for initializing a set of strings to consist of n sequences; means for determining a longest string S of the set which matches a current string to be compressed; means for generating an identifier I for S; means for transmitting I to a utilization device; storage means comprising a dictionary containing said set of strings; means for testing said dictionary in storage for an empty slot; means for deleting a least recently used string from said dictionary, if no empty slot is found, to create an empty slot; means for assigning slot identifier j to said empty slot found or created by said testing and deleting means; means for adding a new string S'"'"' to said set where S'"'"' comprises a concatenation of a previous string match and said current string match; means for assigning an identifier k to said string S'"'"'; means for advancing the input position to a next character in said stream; means for outputting an identifier m to indicate a match; and means for repeatedly actuating said foregoing means for a next string.
-
Specification