Data compression and decompression system with immediate dictionary updating interleaved with string search
First Claim
1. Data compression apparatus for compressing a stream of data character signals into a stream of compressed code signals, comprisingstorage means for storing strings of data character signals, each said string having a code signal associated therewith,means for searching said stream of data character signals by comparing said stream to said stored strings to perform a character-by-character match therewith until a predetermined match is determined,means for providing the code signal associated with said predetermined match so as to provide said stream of compressed code signals,means for entering extended strings into said storage means, the entry of said extended strings being interleaved with the matching of the data character signals of said character-by-character match, an extended string PWC comprising a previously matched string P corresponding to a last provided code signal extended by a string WC, where W is a partial string in the process of being matched during said character-by-character match and C is a data character signal continuing the match of W during said character-by-character match, andmeans for assigning respective code signals to said extended strings.
9 Assignments
0 Petitions
Accused Products
Abstract
A dictionary based data compression and decompression system where, in the compressor, when a partial string W and a character C are matched in the dictionary, a new string is entered into the dictionary with C as an extension character on the string PW where P is the string corresponding to the last output compressed code signal. An update string is entered into the compression dictionary for each input character that is read and matched. The updating is immediate and interleaved with the character-by-character matching of the current string. The update process continues until the longest match is found in the dictionary. The code of the longest matched string is output in a string matching cycle. If a single character or multi-character string "A" exists in the dictionary, the string AAA . . . A is encoded in two compressed code signals regardless of the string length. This encoding results in an unrecognized code signal at the decompressor. The decompressor, in response to an unrecognized code signal, enters update strings into the decompressor dictionary in accordance with the recovered string corresponding to the previously received code signal, the unrecognized code signal, the extant code of the decompressor and the number of characters in the previously recovered string.
52 Citations
50 Claims
-
1. Data compression apparatus for compressing a stream of data character signals into a stream of compressed code signals, comprising
storage means for storing strings of data character signals, each said string having a code signal associated therewith, means for searching said stream of data character signals by comparing said stream to said stored strings to perform a character-by-character match therewith until a predetermined match is determined, means for providing the code signal associated with said predetermined match so as to provide said stream of compressed code signals, means for entering extended strings into said storage means, the entry of said extended strings being interleaved with the matching of the data character signals of said character-by-character match, an extended string PWC comprising a previously matched string P corresponding to a last provided code signal extended by a string WC, where W is a partial string in the process of being matched during said character-by-character match and C is a data character signal continuing the match of W during said character-by-character match, and means for assigning respective code signals to said extended strings.
-
5. Data compression apparatus for compressing a stream of data character signals into a stream of compressed code signals, comprising
storage means for storing strings of data character signals, each said string having a code signal associated therewith, means for searching said stream of data character signals by comparing said stream to said stored strings to perform a character-by-character match therewith until a predetermined match is determined, means for providing the code signal associated with said predetermined match so as to provide said stream of compressed code signals, means for entering into said storage means, interleaved with said character-by-character match, extended strings comprising a previously matched string corresponding to a last provided code signal extended by each data character signal, in turn, as each data character signal is matched, and means for assigning respective code signals to said extended strings, said apparatus operating in successive string matching cycles, a current cycle following a previous cycle, said means for searching and said means for entering being operative so that when a partial string W and a data character signal C are matched, one of said extended strings is entered into said storage means with said data character signal C as an extension character of a string PW where P is said previously matched string and W is in the process of being matched in said current cycle.
-
10. Data compression apparatus for compressing a stream of data character signals into a stream of compressed code signals, comprising
storage means for storing strings of data character signals, each said string having a code signal associated therewith, means for searching said stream of data character signals by comparing said stream to said stored strings to perform a character-by-character match therewith until a predetermined match is determined, means for providing the code signal associated with said predetermined match so as to provide said stream of compressed code signals, means for entering into said storage means, interleaved with said character-by-character match, extended strings comprising a previously matched string corresponding to a last provided code signal extended by each data character signal, in turn, as each data character signal is matched, and means for assigning respective code signals to said extended strings, said strings being stored in said storage means in a linked tree structure and said stream of data character signals including a repeating character string comprised of a repeating character, said apparatus being operative for compressing said repeating character string in two compressed code signals irrespective of the length thereof.
-
12. Data compression apparatus for compressing a stream of data character signals into a stream of compressed code signals, comprising
storage means for storing strings of data character signals, each said string having a code signal associated therewith, means for searching said stream of data character signals by comparing said stream to said stored strings to perform a character-by-character match therewith until a predetermined match is determined, means for providing the code signal associated with said predetermined match so as to provide said stream of compressed code signals, means for entering into said storage means, interleaved with said character-by-character match, extended strings comprising a previously matched string corresponding to a last provided code signal extended by each data character signal, in turn, as each data character signal is matched, and means for assigning respective code signals to said extended strings, said strings being stored in said storage means, in a linked tree structure and said stream of data character signals including a repeating character group string comprised of a repeating character group, said apparatus being operative for compressing said repeating character group string in two compressed code signals irrespective of the length thereof.
-
14. Data decompression apparatus for receiving a stream of compressed code signals provided by a data compressor responsive to a stream of data character signals, said apparatus operative for recovering said stream of data character signals from said stream of compressed code signals, comprising
storage means for storing strings of data character signals, each said string having a code signal associated therewith, means for accessing said storage means with a current received compressed code signal for recovering a string from said storage means corresponding to said current received compressed code signal so as to recover the data character signals thereof, means for entering into said storage means, extended strings comprising a previously recovered string corresponding to a previously received compressed code signal extended by each data character signal, in turn, of said recovered string corresponding to said current received compressed code signal, means for assigning respective code signals to said extended strings, further means for entering into said storage means, further extended strings in response to receiving an unrecognized compressed code signal, said further extended strings comprising said previously recovered string corresponding to said previously received compressed code signal extended by each data character signal, in turn, of said previously recovered string, sequentially and repeatedly, until a string corresponding to said unrecognized compressed code signal is entered, means for assigning respective code signals to said further extended strings, and means for providing the data character signals of said string corresponding to said unrecognized compressed code signal so as to recover said data character signals.
-
21. Data decompression apparatus for receiving a stream of compressed code signals provided by a data compressor responsive to a stream of data character signals, said stream of data character signals including a repeating character string comprised of a repeating character, said data compressor being operative for compressing said repeating character string into two consecutive compressed code signals, said apparatus operative for recovering said stream of data character signals from said stream of compressed code signals, comprising
storage means for storing strings of data character signals, each said string having a code signal associated therewith said strings being stored in said storage means in a linked tree structure, means for accessing said storage means with a current received compressed code signal for recovering a string from said storage means corresponding to said current received compressed code signal so as to recover the data character signals thereof, means for entering into said storage means, extended strings comprising a previously recovered string corresponding to a previously received compressed code signal extended by each data character signal, in turn, of said recovered string corresponding to said current received compressed code signal, means for assigning respective code signals to said extended strings, further means for entering into said storage means, further extended strings in response to receiving an unrecognized compressed code signal, said further extended strings comprising said previously recovered string corresponding to said previously received compressed code signal extended by each data character signal, in turn, of said previously recovered string, sequentially and repeatedly, until a string corresponding to said unrecognized compressed code signal is entered, means for assigning respective code signals to said further extended strings, means for providing the data character signals of said string corresponding to said unrecognized compressed code signal so as to recover said data character signals, said further means for entering being operative for entering into said storage means, additional extended strings comprising said string corresponding to said unrecognized compressed code signal further extended by the data character signals of said previously recovered string, in turn, until a number of said additional extended strings are entered into said storage means, said number being equal to the number of data character signals comprising said previously recovered string, and means for assigning respective code signals to said additional extended strings.
-
23. Data decompression apparatus for receiving a stream of compressed code signals provided by a data compressor responsive to a stream of data character signals, said stream of data character signals including a repeating character group string comprised of a repeating character group, said data compressor being operative for compressing said repeating character group string into two consecutive compressed code signals, said apparatus operative for recovering said stream of data character signals from said stream of compressed code signals, comprising
storage means for storing strings of data character signals, each said string having a code signal associated therewith, said strings being stored in said storage means in a linked tree structure, means for accessing said storage means with a current received compressed code signal for recovering a string from said storage means corresponding to said current received compressed code signal so as to recover the data character signals thereof, means for entering into said storage means, extended strings comprising a previously recovered string corresponding to a previously received compressed code signal extended by each data character signal, in turn, of said recovered string corresponding to said current received compressed code signal, means for assigning respective code signals to said extended strings, further means for entering into said storage means, further extended strings in response to receiving an unrecognized compressed code signal, said further extended strings comprising said previously recovered string corresponding to said previously received compressed code signal extended by each data character signal, in turn, of said previously recovered string, sequentially and repeatedly, until a string corresponding to said unrecognized compressed code signal is entered, means for assigning respective code signals to said further extended strings, means for providing the data character signals of sail string corresponding to said unrecognized compressed code signal so as to recover said data character signals, said further means for entering being operative for entering into said storage means, additional extended strings comprising said string corresponding to said unrecognized compressed code signal further extended by the data character signals of said previously recovered string, in turn, until a number of said additional extended strings are entered into said storage means, said number being equal to the number of data character signals comprising said previously recovered string, and means for assigning respective code signals to said additional extended strings.
-
25. Data decompression apparatus for recovering a stream of data character signals from a stream of compressed code signals, comprising
means for receiving said stream of compressed code signals, storage means for storing strings of data character signals, each said string having a code signal associated therewith, means for accessing said storage means with a current received compressed code signal for recovering a string from said storage means corresponding to said current received compressed code signal so as to recover the data character signals thereof, means for entering into said storage means, extended strings comprising a previously recovered string corresponding to a previously received compressed code signal extended by each data character signal, in turn, of said recovered string corresponding to said current received compressed code signal, further means for entering into said storage means, further extended strings in response to receiving an unrecognized compressed code signal, means for assigning respective code signals to said extended strings and to said further extended strings, said further means for entering being operative for generating said further extended strings in accordance with said previously recovered string, the number of data character signals comprising said previously recovered string, said unrecognized compressed code signal and said code signals being assigned to said further extended strings, one of said further extended strings being a string corresponding to said unrecognized compressed code signal, and means for providing the data character signals of said string corresponding to said unrecognized compressed code signal so as to recover said data character signals.
-
26. A data compression method for compressing a stream of data character signals into a stream of compressed code signals, comprising
storing strings of data character signals in storage means, each said string having a code signal associated therewith, searching said stream of data character signals by comparing said stream to said stored strings to perform a character-by-character match therewith until a predetermined match is determined, providing the code signal associated with said predetermined match so as to provide said stream of compressed code signals, entering extended strings into said storage means, the entry of said extended strings being interleaved with the matching of the data character signals of said character-by-character match, an extended string PWC comprising a previously matched string P corresponding to a last provided code signal extended by a string WC, where W is a partial string in the process of being matched during said character-by-character match and C is a data character signal continuing the match of W during said character-by-character match, and assigning respective code signals to said extended strings.
-
30. A data compression method for compressing a stream of data character signals into a stream of compressed code signals, comprising
storing strings of data character signals in storage means, each said string having a code signal associated therewith, searching said stream of data character signals by comparing said stream to said stored strings to perform a character-by-character match therewith until a predetermined match is determined, providing the code signal associated with said predetermined match so as to provide said stream of compressed code signals, entering into said storage means, interleaved with said character-by-character match, extended strings comprising a previously matched string corresponding to a last provided code signal extended by each data character signal, in turn, as each data character signal is matched, and assigning respective code signals to said extended strings, said method operating in successive string matching cycles, a current cycle following a previous cycle, said searching and said entering steps being performed so that when a partial string W and a data character signal C are matched, one of said extended strings is entered into said storage means with said data character signal C as an extension character of a string PW where P is said previously matched string and W is in the process of being matched in said current cycle.
-
35. A data compression method for compressing a stream of data character signals into a stream of compressed code signals, comprising
storing strings of data character signals in storage means, each said string having a code signal associated therewith, searching said stream of data character signals by comparing said stream to said stored strings to perform a character-by-character match therewith until a predetermined match is determined, providing the code signal associated with said predetermined match so as to provide said stream of compressed code signals, entering into said storage means, interleaved with said character-by-character match, extended strings comprising a previously matched string corresponding to a last provided code signal extended by each data character signal, in turn, as each data character signal is matched, and assigning respective code signals to said extended strings, said storing step comprising storing said strings in said storage means in a linked tree structure, said stream of data character signals including a repeating character string comprised of a repeating character, said method compressing said repeating character string in two compressed code signals irrespective of the length thereof.
-
37. A data compression method for compressing a stream of data character signals into a stream of compressed code signals, comprising
storing strings of data character signals in storage means, each said string having a code signal associated therewith, searching said stream of data character signals by comparing said stream to said stored strings to perform a character-by-character match therewith until a predetermined match is determined, providing the code signal associated with said predetermined match so as to provide said stream of compressed code signals, entering into said storage means, interleaved with said character-by-character match, extended strings comprising a previously matched string corresponding to a last provided code signal extended by each data character signal, in turn, as each data character signal is matched, and assigning respective code signals to said extended strings, said storing step comprising storing said strings in said storage means in a linked tree structure, said stream of data character signals including a repeating character group string comprised of a repeating character group, said method compressing said repeating character group string in two compressed code signals irrespective of the length thereof.
-
39. A data decompression method for receiving a stream of compressed code signals provided by a data compressor responsive to a stream of data character signals, said method recovering said stream of data character signals from said stream of compressed code signals, comprising
storing strings of data character signals in storage means, each said string having a code signal associated therewith, accessing said storage means with a current received compressed code signal for recovering a string from said storage means corresponding to said current received compressed code signal so as to recover the data character signals thereof, entering into said storage means, extended strings comprising a previously recovered string corresponding to a previously received compressed code signal extended by each data character signal, in turn, of said recovered string corresponding to said current received compressed code signal, assigning respective code signals to said extended strings, further entering into said storage means, further extended strings in response to receiving an unrecognized compressed code signal, said further extended strings comprising said previously recovered string corresponding to said previously received compressed code signal extended by each data character signal, in turn, of said previously recovered string, sequentially and repeatedly, until a string corresponding to said unrecognized compressed code signal is entered, assigning respective code signals to said further extended strings, and providing the data character signals of said string corresponding to said unrecognized compressed code signal so as to recover said data character signals.
-
46. A data decompression method for receiving a stream of compressed code signals provided by a data compressor responsive to a stream of data character signals, said stream of data character signals including a repeating character string comprised of a repeating character, said data compressor compressing said repeating character string into two consecutive compressed code signals, said method recovering said stream of data character signals from said stream of compressed code signals, comprising
storing strings of data character signals in storage means, each said string having a code signal associated therewith, said storing step comprising storing said strings in said storage means in a linked tree structure, accessing said storage means with a current received compressed code signal for recovering a string from said storage means corresponding to said current received compressed code signal so as to recover the data character signals thereof, entering into said storage means, extended strings comprising a previously recovered string corresponding to a previously received compressed code signal extended by each data character signal, in turn, of said recovered string corresponding to said current received compressed code signal, assigning respective code signals to said extended strings, further entering into said storage means, further extended strings in response to receiving an unrecognized compressed code signal, said further extended strings comprising said previously recovered string corresponding to said previously received compressed code signal extended by each data character signal, in turn, of said previously recovered string, sequentially and repeatedly, until a string corresponding to said unrecognized compressed code signal is entered, assigning respective code signals to said further extended strings, providing the data character signals of said string corresponding to said unrecognized compressed code signal so as to recover said data character signals, said further entering step including entering into said storage means, additional extended strings comprising said string corresponding to said unrecognized compressed code signal further extended by the data character signals of said previously recovered string, in turn, until a number of said additional extended strings are entered into said storage means, said number being equal to the number of data character signals comprising said previously recovered string, and assigning respective code signals to said additional extended strings.
-
48. A data decompression method for receiving a stream of compressed code signals provided by a data compressor responsive to a stream of data character signals, said stream of data character signals including a repeating character group string comprised of a repeating character group, said data compressor compressing said repeating character group string into two consecutive compressed code signals, said method recovering said stream of data character signals from said stream of compressed code signals, comprising
storing strings of data character signals in storage means, each said string having a code signal associated therewith, said storing step comprising storing said strings in said storage means in a linked tree structure. accessing said storage means with a current received compressed code signal for recovering a string from said storage means corresponding to said current received compressed code signal so as to recover the data character signals thereof, entering into said storage means, extended strings comprising a previously recovered string corresponding to a previously received compressed code signal extended by each data character signal, in turn, of said recovered string corresponding to said current received compressed code signal, assigning respective code signals to said extended strings, further entering into said storage means, further extended strings in response to receiving an unrecognized compressed code signal, said further extended strings comprising said previously recovered string corresponding to said previously received compressed code signal extended by each data character signal, in turn, of said previously recovered string, sequentially and repeatedly, until a string corresponding to said unrecognized compressed code signal is entered, assigning respective code signals to said further extended strings, providing the data character signals of said string corresponding to said unrecognized compressed code signal so as to recover said data character signals, said further entering step including entering into said storage means, additional extended strings comprising said string corresponding to said unrecognized compressed code signal further extended by the data character signals of said previously recovered string, in turn, until a number of said additional extended strings are entered into said storage means, said number being equal to the number of data character signals comprising said previously recovered string, and assigning respective code signals to said additional extended strings.
-
50. A data decompression method for recovering a stream of data character signals from a stream of compressed code signals, comprising
receiving said stream of compressed code signals, storing strings of data character signals in storage means, each said string having a code signal associated therewith, accessing said storage means with a current received compressed code signal for recovering a string from said storage means corresponding to said current received compressed code signal so as to recover the data character signals thereof, entering into said storage means, extended strings comprising a previously recovered string corresponding to a previously received compressed code signal extended by each data character signal, in turn, of said recovered string corresponding to said current received compressed code signal, further entering into said storage means, further extended strings in response to receiving an unrecognized compressed code signal, assigning respective code signals to said extended strings and to said further extended strings, said further entering step includes generating said further extended strings in accordance with said previously recovered string, the number of data character signals comprising said previously recovered string, said unrecognized compressed code signal and said code signals being assigned to said further extended strings, one of said further extended strings being a string corresponding to said unrecognized compressed code signal, and providing the data character signals of said string corresponding to said unrecognized compressed code signal so as to recover said data character signals.
Specification