Data compression for network transport
First Claim
1. A data compression method, comprising:
- receiving an input stream of data bytes wherein the input stream is divided into packets having no more than m data bytes per packet;
inserting one or more packets in sequential order in a circular history buffer of size n, where n is greater than m;
inserting a packet of size x in the first x bytes of the history buffer when there are less than x bytes remaining at the end of the history buffer after the immediately previous packet is inserted in the history buffer;
searching for a matching string of already processed data bytes that is identical to a current string of bytes;
appending to an encoded data stream each data byte not forming part of such a matching string, such data byte being a literal; and
appending to the encoded data stream a match code that identifies both the location in the input stream of the matching string and the number of data bytes in the matching string.
2 Assignments
0 Petitions
Accused Products
Abstract
Disclosed is a method and system for data compression. In a preferred embodiment, an input stream of data bytes are compressed into an encoded stream using an LZ77-based scheme. The preferred method searches for a matching sequence of already processed data bytes that is identical to a current sequence of bytes. Sequences of literals (bytes not forming part of a matching sequence) or match codes (encoded matching sequences) are identified by count values indicating the number of literals or match codes in the sequence. Preferably, the encoded stream is transmitted from a first computer to a second computer, where the encoded stream is decompressed. The method uses matching circular history buffers for compression and decompression, the history buffers being synchronized using a coherency byte included with each frame of encoded data transmitted. If an encoded frame is not received by the decompression device, the decompression device transmits a flush request to the compression device. The compression device flushes its history buffer in response to the flush request, the flushing step making the previously processed bytes stored in the history buffer incapable of becoming part of a matching sequence. The compression device includes in the coherency code of the next encoded frame a control code indicating whether a flush request has been received by the compression device.
-
Citations
37 Claims
-
1. A data compression method, comprising:
-
receiving an input stream of data bytes wherein the input stream is divided into packets having no more than m data bytes per packet; inserting one or more packets in sequential order in a circular history buffer of size n, where n is greater than m; inserting a packet of size x in the first x bytes of the history buffer when there are less than x bytes remaining at the end of the history buffer after the immediately previous packet is inserted in the history buffer; searching for a matching string of already processed data bytes that is identical to a current string of bytes; appending to an encoded data stream each data byte not forming part of such a matching string, such data byte being a literal; and appending to the encoded data stream a match code that identifies both the location in the input stream of the matching string and the number of data bytes in the matching string.
-
-
2. A data compression method, comprising:
-
receiving an input stream of data bytes wherein the input stream is divided into packets having no more than m data bytes per packet; inserting one or more packets in sequential order in a circular history buffer of size n, where n is greater than m; providing m-l bytes of overflow storage space immediately after the end of the history buffer; inserting y bytes of a packet of x bytes into the history buffer when there are y bytes of space remaining in the history buffer after the immediatel y previous packet is inserted in the history buffer; inserting x-y bytes of the packet of x bytes into the overflow storage space; and copying the x-y bytes of the packet of x bytes into the first x-y bytes of the history buffer; searching for a matching string of already processed data bytes that is identical to a current string of bytes; appending to an encoded data stream each data byte not forming part of such a matching string, such data byte being a literal; and appending to the encoded data stream a match code that identifies both the location in the history buffer of the matching string and the number of data bytes in the matching string.
-
-
3. A data compression method for compressing a sequence of input strings, comprising:
-
receiving the sequence of input strings; for each of a plurality of input strings of the sequence of input strings; storing the input string in a history buffer of a compression device; encoding the input string based on one or more previously processed matching strings stored in the history buffer; and transmitting the encoded string to a remote decompression device; receiving a flush request from the decompression device if the encoded string is not received correctly by the decompression device; flushing the history buffer in response to the flush request, the flushing step making the input strings previously stored in the history buffer incapable of becoming part of a matching string; encoding input strings after receiving the flush request without reference to any input strings encoded before receiving the flush request; and
;transmitting the encoded input strings to the decompression device. - View Dependent Claims (4, 5, 6, 7, 8, 9, 10, 11, 12)
-
-
13. A data compression method, comprising:
-
receiving an input stream of data bytes; searching for a matching string of already processed data bytes that is identical to a current string of data bytes; appending to an encoded data stream each data byte not forming part of such a matching string, such data byte being a literal; appending to the encoded data stream a match code that identifies both the location in the input stream of the matching string and the number of data bytes in the matching string; and identifying a sequence of match codes in the encoded data stream and appending to the identified match code sequence a count value indicating the number of match codes in the identified match code sequence. - View Dependent Claims (14, 15, 16, 17, 18)
-
-
20. A data compression method, comprising:
-
storing an input stream of data bytes in a circular history buffer of a compression device; searching the history buffer for a matching string of already processed data bytes that is identical to a current string of bytes; appending to an encoded data stream each data byte not forming part of an indentical string, such data byte being a literal; appending to the encoded data stream a match code that identifies both the location in the history buffer of the matching string and the number of data bytes in the matching string; dividing the encoded data stream into a plurality of frames; appending a coherency code to each frame, the coherency code including an encoding code that indicates whether the frame is encoded and if the frame is encoded the coherency code includes a sequence number that indicates a position of the frame within a sequence of encoded frames; transmitting each encoded frame to a remote decompression device having a circular history buffer for decompressing each encoded frame; and synchronizing the history buffer for the decompression device with the history buffer of the compression device using the coherency codes. - View Dependent Claims (19, 21, 22, 23, 24, 25, 26, 27)
-
-
28. A method in a computer system for storing an encoded stream of data, the encoded stream of data including a plurality of literals and matches, the method comprising the steps of:
-
for each literal in the encoded stream, storing the literal; for each match that does not immediately follow a predefined minimum number of matches; storing the match; and storing an indication that the match is stored; and
after a predefined number of matches in sequence have been stored,storing an indication of the number of matches that follow in sequence; and storing each of the matches that follows in sequence without storing an indication that the match is stored.
-
-
29. A method in a computer system for storing an encoded stream of data, the encoded stream of data including a plurality of literals and matches, the method comprising the steps of:
-
for each literal in the encoded stream, storing the literal; for each match that is not part of a sequence of matches, storing an indication that the match is not part of a sequence of matches, and storing the match; and
each sequence of matches;storing an indication of the sequence; storing an indication of the number of matches in the sequence; and storing each match of the sequence. - View Dependent Claims (30)
-
-
31. A method in a computer system for compressing an input data stream of bytes, the computer system having an input buffer for storing the bytes of the input data stream, the input buffer having a beginning and an end, the method comprising the steps of:
-
receiving a plurality of bytes of the input data stream; determining whether there is enough space in the input buffer between a most recently stored byte in the input buffer and the end of the input buffer to store the received plurality of bytes; when there is enough space in the input buffer, storing the received plurality of bytes between the most recently stored byte in the input buffer and the end of the input buffer; when there is not enough space in the input buffer, storing the received plurality of bytes starting at the beginning of the input buffer; searching the input buffer for a string of bytes that matches a current string of bytes; and encoding the current string of bytes as a reference to the string of bytes that matches the current string of bytes.
-
-
32. A method in a computer system for maintaining synchronization of a compression system with a decompression system, the compression system for encoding data with reference to previously encoded data, the method comprising the steps of:
-
under control of the compression system, generating a packet of encoded data; and
transmitting the packet to the decompression system;under the control of the decompression system, when the transmitted packet is not correctly received, transmitting a flush request to the compression system; under control of the compression system, receiving the transmitted flush request; and in response to receiving the transmitted flush request, generating a packet of encoded data without reference to any previously processed packet, the generated packet including a synchronization indicator; and transmitting the generated packet to the decompression system; and under control of the decompression system, when the transmitted packet with the synchronization indicator is received, decompressing the encoded data in the received packet without reference to any previously processed packet; whereby the compression and decompression system maintain synchronization even though a packet is not correctly received by the decompression system.
-
-
33. A method in a computer system for transmitting compressed data from a compression device to a decompression device, comprising:
-
for each of a plurality of input strings, receiving the input string; encoding the received input string based on previously received input strings that match the received input string; and transmitting the encoded input string to the decompression device; receiving an indication that one of the transmitted input strings was not correctly received by the decompression device; and in response to receiving the indication, encoding subsequently received input strings only based on input strings received after receiving the indication so that the decompression device can decompress without reference to input strings that were incorrectly received.
-
-
34. A method in a computer system for decompressing an encoded stream of data, the encoded stream of data including a plurality of literals and matches, the method comprising the steps of:
-
parsing the encoded stream of data into separate portions; determining whether a current portion of the parsed separate portions includes a match code count that indicates the number of matches in a sequence of matches; and if the current portion includes a match code count, then for each match of the sequence of matches, decoding the match into a sequence of values. - View Dependent Claims (35, 36)
-
-
37. A method in a computer system for decompressing encoded data received at a decompression device from a compression device, comprising:
-
receiving a plurality of encoded strings from the compression device, each of the plurality of encoded strings being encoded based on input strings corresponding to previously received encoded strings; transmitting to the compression device an indication that one of the encoded strings was not correctly received by the decompression device; subsequently receiving encoded strings that were encoded by the compression device only based on input strings received after receiving the indication from the decompression device; and decompressing the subsequently received encoded strings without reference to input strings corresponding to encoded strings that were incorrectly received.
-
Specification