Fast approximation to optimal compression of digital data
First Claim
1. In a computing system environment, a method of compressing new data from a new file, comprising:
- extracting key compression/decompression information from an earlier compressed file having original data with bit patterns determined to be similar to bit patterns of the new data of the new file; and
encoding the new data based on the extracted key information, wherein the extracting key compression/decompression information from the earlier compressed file includes extracting a dictionary, the dictionary having defined symbols representing the original data.
Abstract
A “fast approximation” of compression of current data involves using information obtained from an earlier compression of similar data. It overcomes the iterative process of discovering a unique set of optimal symbols. Representatively, a dictionary of symbols corresponding to original data from an earlier compressed file is extracted. Original bits are then obtained from the symbols. Sequences of the original bits are identified in the current data of a current file under consideration. A new bit stream for the current file is created from the original bits and according to the symbols they represent. Every occurrence of the symbols is counted in the new bit stream and a path-weighted Huffman tree is created from the counted occurrences. A coding from the Huffman tree ensues, along with an end-of-file marker. The latter is stored in a new compression file, including the dictionary earlier extracted from the earlier compressed file.
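The final steps of the abstract (counting symbol occurrences, building a Huffman tree from the counts, and emitting a coding that ends with an end-of-file marker) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `EOF`, `huffman_code`, `encode`, and `decode` are hypothetical, the symbol stream is assumed to have already been produced from the extracted dictionary, and the output is a string of "0" and "1" characters rather than packed bits.

```python
import heapq
from collections import Counter

EOF = "<EOF>"  # hypothetical end-of-file marker; the source does not name one


def huffman_code(counts):
    """Build a prefix-free code {symbol: bitstring} from symbol counts."""
    # Heap entries are (weight, order, node). The unique 'order' field breaks
    # ties, so nodes themselves are never compared.
    # A leaf is a 1-tuple (symbol,); an internal node is a 2-tuple (left, right).
    heap = [(w, i, (sym,)) for i, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2][0]: "0"}
    order = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, order, (left, right)))
        order += 1
    codes = {}
    stack = [(heap[0][2], "")]
    while stack:
        node, prefix = stack.pop()
        if len(node) == 1:
            codes[node[0]] = prefix
        else:
            stack.append((node[0], prefix + "0"))
            stack.append((node[1], prefix + "1"))
    return codes


def encode(symbol_stream):
    """Append the EOF marker, count every symbol, and Huffman-encode the stream."""
    stream = list(symbol_stream) + [EOF]
    codes = huffman_code(Counter(stream))
    return codes, "".join(codes[s] for s in stream)


def decode(codes, bits):
    """Invert encode(): read codewords until the EOF marker is reached."""
    by_code = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in by_code:
            sym = by_code[current]
            if sym == EOF:
                break
            out.append(sym)
            current = ""
    return out
```

In this sketch the dictionary would be stored alongside `bits` in the new compressed file, as the abstract describes; the file layout itself is not specified here.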
2 Claims
1. In a computing system environment, a method of compressing new data from a new file, comprising:
- extracting key compression/decompression information from an earlier compressed file having original data with bit patterns determined to be similar to bit patterns of the new data of the new file; and
encoding the new data based on the extracted key information, wherein the extracting key compression/decompression information from the earlier compressed file includes extracting a dictionary, the dictionary having defined symbols representing the original data.
2. In a computing system environment, a method of compressing new data of a new file, comprising:
- obtaining a dictionary defining all symbols representing compressed original bits of data from an earlier compressed file having original data with bit patterns determined to be similar to bit patterns of the new data of the new file; and encoding the new data based thereon, wherein the encoding further includes determining original bits corresponding to all symbols in the dictionary, the earlier compressed file having the original data arranged as a plurality of symbols, identifying sequences of the original bits in the new data of the new file, wherein the identifying sequences further includes identifying longest possible sequences of the original bits in the new data of the new file, creating a new bit stream of symbols representing the identified sequences of the original bits, counting occurrences of every symbol in the new bit stream, creating a Huffman encoding in bits from the counted occurrences, storing the bits in a compressed current file, the compressed current file also including the dictionary from the earlier compressed file having original data similar to the new data of the new file.
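The "identifying longest possible sequences" step of claim 2 can be illustrated with a greedy left-to-right matcher. This is a sketch under assumptions rather than the claimed method itself: the dictionary is modeled as a hypothetical mapping from symbol to its original bit pattern (a string of "0" and "1"), the function name `tokenize_longest_match` is invented for illustration, and "longest possible" is interpreted as taking the longest dictionary pattern that matches at the current position.

```python
from collections import Counter


def tokenize_longest_match(bits, dictionary):
    """Rewrite a bit string as dictionary symbols, longest match first.

    'dictionary' maps symbol -> original bit pattern (assumed layout).
    """
    by_pattern = {pattern: symbol for symbol, pattern in dictionary.items()}
    max_len = max(len(pattern) for pattern in by_pattern)
    symbols = []
    pos = 0
    while pos < len(bits):
        # Try the longest candidate first, then progressively shorter ones.
        for length in range(min(max_len, len(bits) - pos), 0, -1):
            symbol = by_pattern.get(bits[pos:pos + length])
            if symbol is not None:
                symbols.append(symbol)
                pos += length
                break
        else:
            # No pattern matches here; this sketch does not extend the dictionary.
            raise ValueError(f"no dictionary entry matches input at bit {pos}")
    return symbols


# Hypothetical dictionary taken from an earlier compressed file.
earlier_dictionary = {0: "0", 1: "1", 2: "01", 3: "0101"}
new_stream = tokenize_longest_match("0101011", earlier_dictionary)
occurrences = Counter(new_stream)  # input to the Huffman step
```

Because the dictionary is reused as-is, no search for new optimal symbols takes place; the counts in `occurrences` feed directly into the Huffman encoding recited in the claim. Including single-bit entries for "0" and "1" in the example dictionary is an assumption that guarantees every input can be tokenized.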
Specification