Method and apparatus for data compression and decompression
First Claim
1. A method for processing a stream of data symbols into a variable-length compressed code stream comprising:
- first parsing said symbol stream and storing in a table strings of symbols encountered in said stream, said stored strings having respective codes associated therewith, each string of symbols containing more than one symbol including a prefix string of symbols and an extension symbol, said prefix string corresponding to a string stored in said table;
second parsing said symbol stream, said second parsing successively partitioning said symbol stream for identifying codes in the table associated with table strings that correspond to the successive partitioning;
said second parsing step determining said successive partitioning by determining the symbol sequence length of a current symbol sequence in the stream when sequences of at least two different lengths correspond to entries in the table for said current sequence based on determining last encountered symbol positions of corresponding next sequences in the stream that correspond to strings in the table and which start at next symbol positions in the stream after said sequences of at least two different lengths; and
providing the identified codes in said second parsing step for said compressed code stream.
5 Assignments
0 Petitions
Accused Products
Abstract
A data compression technique for encoding a data stream of symbols processes a current sequence of symbols in the stream using an adaptive look-up table containing strings of different lengths by identifying whether a match of a longest string from the table or a shorter string would produce a lower compression ratio by "looking ahead" and identifying a match between a next sequence in the data stream and strings in the table. The compression technique identifies those situations where it is advantageous to use a compressed code identifier for a current sequence that is shorter than the longest matching sequence. Such situations exist when the compression ratio achieved for compressing the current sequence in combination with the next sequence in the stream is greater when other than the longest match is employed for the first sequence. A corresponding decompression technique to decode data encoded using a similar "look ahead" technique is also disclosed.
49 Citations
21 Claims
-
1. A method for processing a stream of data symbols into a variable-length compressed code stream comprising:
-
first parsing said symbol stream and storing in a table strings of symbols encountered in said stream, said stored strings having respective codes associated therewith, each string of symbols containing more than one symbol including a prefix string of symbols and an extension symbol, said prefix string corresponding to a string stored in said table; second parsing said symbol stream, said second parsing successively partitioning said symbol stream for identifying codes in the table associated with table strings that correspond to the successive partitioning; said second parsing step determining said successive partitioning by determining the symbol sequence length of a current symbol sequence in the stream when sequences of at least two different lengths correspond to entries in the table for said current sequence based on determining last encountered symbol positions of corresponding next sequences in the stream that correspond to strings in the table and which start at next symbol positions in the stream after said sequences of at least two different lengths; and providing the identified codes in said second parsing step for said compressed code stream. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
-
-
10. A method for processing a stream of data symbols into a variable-length compressed code stream comprising:
-
first parsing said symbol stream and storing in a table strings of symbols encountered in said stream, said stored strings having respective codes associated therewith, each string of symbols containing more than one symbol including a prefix string of symbols and an extension symbol, said prefix string corresponding to a string stored in said table; and second parsing said symbol stream, said second parsing encountering an i-th symbol of said stream after or simultaneously with said first parsing encountering a symbol after a j-th symbol of said stream, wherein i<
j, said second parsing comprising,identifying a symbol sequence beginning with a k-th symbol between the i-th and the j-th symbol which corresponds to a previously stored string in the table and which extends to a relatively later encountered symbol in the stream than at least one other symbol sequence corresponding to a previously stored string in the table that starts from a symbol between the i-th to thej-th encountered symbol other than the k-th symbol, identifying the code in the table corresponding to the symbol sequence from the i-th symbol to the symbol immediately prior to the identified k-th symbol, and providing the identified code for part of said compressed code stream. - View Dependent Claims (11, 12, 13, 14, 15, 16)
-
-
17. A storage medium manufactured in accordance with a process comprising the steps of:
-
first parsing a stream of data symbol and storing in a table strings of symbols encountered in said stream, said stored strings having respective codes associated therewith, each said string of symbols including a prefix string of symbols and an extension symbol, said prefix string corresponding to a string stored in said table; second parsing said symbol stream, said second parsing encountering an i-th symbol of said stream after or simultaneously with said first parsing encountering a symbol after a j-th symbol of said stream, wherein i<
j, said second parsing comprising,identifying a symbol sequence beginning with a k-th symbol encountered in the stream from the i-th to thej-th symbol which corresponds to a previously stored string in the table and which extends to a relatively later encountered symbol in the stream than at least one other symbol sequence corresponding to a previously stored string in the table that starts from a symbol between the i-th to thej-th encountered symbol other than the k-th symbol, identifying the code in the table corresponding to the symbol sequence from the i-th symbol to the symbol immediately prior to the identified k-th symbol, and providing the identified code for part of a corresponding compressed code stream; applying a recording signal to the storage medium, the recording signal representing said codes of said compressed code stream; and recording the recording signal onto the storage medium. - View Dependent Claims (18, 19)
-
-
20. A method for recovering a stream of data symbols by processing a compressed code stream comprising:
sequentially processing codes encountered in said compressed code stream, said sequentially processing of each code comprising, identifying whether a string entry is present in a table of strings for said code, said table having initial entries of strings of a set of symbols used to generate said stream of symbols, said stored strings having respective codes associated therewith, if said entry is present in said table, a) retrieving the string entry from said table corresponding to said code, b) concatenating the retrieved string to said stream of symbols and c) storing a new string entry in said table with a respective code associated therewith, said new entry corresponding to a shortest sequence in said generated symbol stream not in said table, and if said entry is not present in said table, a) concatenating said immediate previous retrieved string to the end of the stream of symbols at least once to produce an expanded stream of symbols, b) identifying a shortest symbol sequence in said generated expanded symbol stream not in said table, c) storing a new string entry in said table with a respective code associated therewith corresponding to said identified shortest symbol sequence, and d) amending said expanded stream of symbols so as to contain said symbol stream prior to expansion concatenated with said new string entered in said table. - View Dependent Claims (21)
Specification