Method and apparatus for the compression and decompression of data using Lempel-Ziv based techniques
First Claim
1. A method for parsing and coding a string of a variable number of uncoded data symbols to provide a fixed length codeword, comprising the steps of:
- providing a symbol storage device having predetermined storage locations, said storage locations arranged into a first half and a second half,initializing the storage locations in the first half of said symbol storage device with predetermined data symbols,inserting an uncoded string of data symbols into the storage locations in the second half of said symbol storage device,comparing the data symbols in the storage locations in the second half of said symbol storage device with the predetermined symbols in the storage locations in the first half of said storage device using a plurality of processors, said processors being connected in a systolic array, and each processor in said systolic array of processors comparing selected pairs of data symbols and selectively passing the data symbols to an adjacent processor,providing fixed length outputs from the systolic array of processors indicating;
i) the length of the longest string of data symbols in the first half of the storage device that matches a string of data symbols from the second half of said symbol storage, and ii) the storage location in the first half of said symbol storage device of the starting point for said string,shifting the data symbols of said longest matched string and the data symbol immediately following said longest string from the storage locations in the second half of said symbol storage device into the storage locations in the first half of said symbol storage device in a predetermined sequence and shifting subsequent uncoded data symbols into the storage locations in the second half of said symbol storage device,providing a fixed length output from said symbol storage device indicating the data symbol that immediately follows said longest string,arranging the fixed length outputs from the systolic array of processors and the fixed length output from said symbol storage device in a codeword storage device to provide a codeword having a fixed length.
1 Assignment
0 Petitions
Accused Products
Abstract
A method and apparatus for compressing and decompressing text and image data by forming fixed-length codewords from a variable number of data symbols. Data symbols are shifted into registers in the first half of a buffer, while an uncompressed string of data symbols are shifted into registers in the second half of the buffer. A systolic array of processors compares each data symbol in the second half of the buffer with each data symbol in the first half of the buffer. Each processor compares pairs of data symbols, and selectively passes the data symbols to an adjacent processor. A fixed-length output is provided indicating the length and the starting point of the longest substring in the first half of the buffer that matches a substring from the second half of the buffer. The matched data symbols in the second half of the buffer and the data symbol immediately following the matched data symbols are then shifted into the first half of the buffer, and uncompressed data symbols are then shifted into the second half of the buffer. A preselected shift register in the first half of the buffer provides a fixed-length output indicating the symbol that immediately follows the last matched data symbol. The length and the starting point information and the last symbol information are assembled to form a codeword having a predetermined length. The codeword is stored in memory and can be later retrieved and decompressed to provide the original string of data symbols.
-
Citations
30 Claims
-
1. A method for parsing and coding a string of a variable number of uncoded data symbols to provide a fixed length codeword, comprising the steps of:
-
providing a symbol storage device having predetermined storage locations, said storage locations arranged into a first half and a second half, initializing the storage locations in the first half of said symbol storage device with predetermined data symbols, inserting an uncoded string of data symbols into the storage locations in the second half of said symbol storage device, comparing the data symbols in the storage locations in the second half of said symbol storage device with the predetermined symbols in the storage locations in the first half of said storage device using a plurality of processors, said processors being connected in a systolic array, and each processor in said systolic array of processors comparing selected pairs of data symbols and selectively passing the data symbols to an adjacent processor, providing fixed length outputs from the systolic array of processors indicating;
i) the length of the longest string of data symbols in the first half of the storage device that matches a string of data symbols from the second half of said symbol storage, and ii) the storage location in the first half of said symbol storage device of the starting point for said string,shifting the data symbols of said longest matched string and the data symbol immediately following said longest string from the storage locations in the second half of said symbol storage device into the storage locations in the first half of said symbol storage device in a predetermined sequence and shifting subsequent uncoded data symbols into the storage locations in the second half of said symbol storage device, providing a fixed length output from said symbol storage device indicating the data symbol that immediately follows said longest string, arranging the fixed length outputs from the systolic array of processors and the fixed length output from said symbol storage device in a codeword storage device to provide a codeword having a fixed length.
-
-
2. A system for parsing and coding a string of a variable number of uncoded data symbols into a codeword having a predetermined maximum length, comprising:
-
a symbol storage device having predetermined storage locations, said storage locations arranged into a first half and a second half, said storage locations in said first half having means to initially receive a string of predetermined data symbols, and said storage locations in said second half having means to receive a string of uncoded data symbols, a plurality of processors connected in a systolic array adapted to selectively receive data symbols from predetermined storage locations in each half of said symbol storage device, each of said processors having means to compare selected pairs of data symbols and means to selectively pass the data symbols to an adjacent processor, said systolic array of processors having means for providing fixed length outputs indicating;
i) the length of the longest string of data symbols from the storage locations in the first half of said symbol storage device that matches a string of data symbols in the storage locations from the second half of said symbol storage device and ii) the location in the first half of said symbol storage device of the starting point for said string,said symbol storage device having means for shifting the data symbols of said longest string and the data symbol immediately following said longest string from the storage locations in the second half of said symbol storage device in a predetermined sequence into the storage locations in the first half of said symbol storage device and means for shifting subsequent uncoded data symbols into the storage locations in the second half of said symbol storage device, said symbol storage device having means for providing a fixed length output indicating the data symbol in the storage location in the first half of the symbol storage device immediately following said longest string, and a codeword storage device receiving said fixed length outputs from said systolic array of processors and said fixed length output from said symbol storage device, said codeword storage device providing a codeword having a predetermined maximum length indicating the length of said longest string, the starting point of said longest string and the data symbol immediately following said longest string. - View Dependent Claims (3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22)
-
-
23. A system for decompressing a coded string of data symbols forming codewords, comprising:
-
a decompression codeword storage device having predetermined storage locations, said storage locations in said decompression codeword storage device having means to initially receive a string of predetermined data symbols and means to thereafter receive a codeword, said codeword having a length of a predetermined number of bytes with each byte having a predetermined number of bits, wherein one of said bytes of said codeword represents a pointer value, a second of said bytes of said codeword represents a length value, and a third of said bytes of said codeword represents a lastsymbol value, means to check a preselected bit in each byte of said codeword and means to shift a preselected number of locations in said decompression codeword storage device determined by the pointer value of said codeword if said preselected bit indicates that said longest matched string is equal to or greater than three data symbols in length, shifting into said storage locations preselected data symbols represented by the length value of said codeword, and shifting a data symbol into said decompression codeword storage device represented by the lastsymbol of said codeword, providing an output from said decompression codeword storage device indicating the original uncoded string of data symbols.
-
-
24. A method for parsing and coding a string of a variable number of uncoded data symbols to provide a codeword having a predetermined maximum length, comprising the steps of:
-
providing a symbol storage device having predetermined storage locations, said storage locations arranged into a first half and a second half, initializing the storage locations in the first half of the storage device with predetermined data symbols, inserting a string of uncoded data symbols into the storage locations in the second half of the storage device, comparing the data symbols in the storage locations in the second half of said symbol storage device with the data symbols in the storage locations in the first half of said symbol storage device using a systolic array of adjacent processors, providing an output from said systolic array of processors indicating;
i) the length of the longest string of data symbols from the storage locations in the first half of said symbol storage device that matches a string of data symbols in the storage locations from the second half of said symbol storage device and ii) the location in the first half of said symbol storage device of the starting point for said string,shifting the data symbols of said longest string and the data symbol immediately following said longest string from the storage locations in the second half of said symbol storage device into the storage locations in the first half of said symbol storage device in a predetermined sequence and shifting subsequent uncoded data symbols into the storage locations in the second half of the storage device, providing an output from the first half of said symbol storage device indicating the data symbol that immediately follows said longest string, arranging the outputs from the systolic array of processors and the output from said symbol storage device in a codeword storage device, each of said outputs having a length of a predetermined number of bytes, wherein each of said bytes includes a predetermined number of bits, and setting one of said bits of said output from said symbol storage device to a predetermined first value if said longest string is less than three data symbols in length and storing said output from said symbol storage device in a temporary storage device, and setting said one of said bits in each byte of all of the outputs to a predetermined second value if said longest string is equal to or greater than three data symbols in length and storing the outputs from said systolic array of processors and said output from said symbol storage device in said temporary storage buffer.
-
-
25. A method for compressing and decompressing a string of a variable number of uncoded data symbols, comprising the steps of:
-
providing a symbol storage device having predetermined storage locations, said storage locations arranged into a first half and a second half, initializing the storage locations in the first half of the storage device with predetermined data symbols, inserting an uncoded string of data symbols into the storage locations in the second half of the storage device, comparing the data symbols in the storage locations in the second half of said symbol storage device with the data symbols in the storage locations in the first half of said symbol storage device using a systolic array of adjacent processors, said systolic array of processors providing an output indicating;
i) the length of the longest string of data symbols in the storage locations in the first half of said symbol storage device that matches a string of data symbols in the storage locations from the second half of said symbol storage device and ii) the location in the first half of said symbol storage device of the starting point for said string,each of said outputs from said systolic array of processors and said symbol storage device having a length of a predetermined number of bytes, wherein each byte includes a predetermined number of bits, shifting the data symbols of said longest string and the data symbol immediately following said longest string from the storage locations in the second half of said symbol storage device into the storage locations in the first half of said symbol storage device in a predetermined sequence and shifting subsequent uncoded data symbols into the storage locations in the second half of the storage device, providing an output from the first half of said symbol storage device indicating the data symbol immediately following said longest string, arranging the outputs from the systolic array of processors and the output from said symbol storage device in a codeword and storing said codeword in a codeword storage device, each of said outputs having a length of a predetermined number of bytes, wherein each of said bytes includes a predetermined number of bits, setting one of said bits in each byte to a predetermined first value if said longest string is equal to or greater than three data symbols in length, moving said codeword in said codeword storage device to a temporary storage device, selectively retrieving said codeword from said temporary storage device, checking the value of said one bit in each byte of said codeword, shifting a preselected number of locations in a buffer determined by one of said bytes of said codeword if the value of said one bit indicates that said longest string is equal to or greater than three data symbols in length, shifting into said locations preselected data symbols represented by a second one of said bytes of said codeword, and shifting a data symbol into said buffer represented by a third one of said bytes of said codeword, said buffer providing an output indicating the original uncoded data symbols.
-
-
26. A method for decompressing a coded string of data symbols forming fixed-length codewords stored in memory, comprising the steps of:
-
inserting predetermined data symbols into a storage device, selectively retrieving a codeword from the memory, said codeword having a length of a predetermined number of bytes with each byte having a predetermined number of bits, wherein one of said bytes of said codeword represents a pointer value, a second of said bytes of said codeword represents a length value, and a third of said bytes of said codeword represents a lastsymbol value, checking a preselected bit in each byte of said codeword, shifting a preselected number of locations in said storage device determined by the pointer value of said codeword if said preselected bit indicates that said longest string is equal to or greater than three data symbols in length, shifting into said storage locations preselected data symbols represented by the length value of said codeword, and shifting a data symbol into said buffer represented by the lastsymbol of said codeword, providing an output from said symbol storage device indicating the original uncoded string of data symbols. - View Dependent Claims (27)
-
- 28. A method as in claim 28, wherein the value of said preselected bit in each byte is checked by a code checker, said preselected bit being initially set to a first value if said longest string is equal to or greater than three data symbols in length, and initially set to a second value if said longest string is less than three data symbols in length.
-
30. An apparatus for parsing and coding a string of a variable number of uncoded data symbols into a codeword having a predetermined maximum length in a data compression system, comprising:
-
a symbol storage means having predetermined storage locations, said storage locations arranged into a first half and a second half, said storage locations in said first half having means to initially receive a string of predetermined data symbols, and said storage locations in said second half having means to receive a string of uncoded data symbols, a systolic array of processors which selectively receive data symbols from predetermined storage locations in each half of said symbol storage means, each of said processors having a first comparator which compares selected pairs of data symbols to determine the length of a string of data symbols in the data locations in the first half of the symbol storage means which match a string of data symbols in the storage locations in the second half of the symbol storage means, and a second comparator which compares the length of the matched string with the length of the matched string from an adjacent processor to determine the longest matched string, each processor passing the length of the longest matched string and the location in the first half of said symbol storage means of the starting point for the string, to an adjacent processor, said systolic array of processors providing fixed length outputs indicating;
i) the length of the longest string of data symbols from the storage locations in the first half of said symbol storage means that matches a string of data symbols in the storage locations from the second half of said symbol storage means and ii) the location in the first half of said symbol storage means of the starting point for said string,said symbol storage means having means for shifting the data symbols of said longest string and the data symbol immediately following said longest string from the storage locations in the second half of said symbol storage means in a predetermined sequence into the storage locations in the first half of said symbol storage means and means for shifting subsequent uncoded data symbols into the storage locations in the second half of said symbol storage means, said symbol storage means having means for providing a fixed length output indicating the data symbol in the storage location in the first half of the symbol storage means immediately following said longest string, and a codeword storage means having means to receive said fixed length outputs from said systolic array of processing means and said fixed length output from said symbol storage means, said codeword storage means having means to provide a codeword having a predetermined maximum length indicating the length of said longest string, the starting point of said longest string and the data symbol immediately following said longest string.
-
Specification