Compressing and decompressing text files
First Claim
1. A method of compressing a text file stored in a computer memory in digital form, comprising:
- generating a full text file having characters formed into phrases, said characters being digitally represented by bytes;
- generating a first level compressed text file from said full text file by replacing runs of identical characters with a run flag, the character and a repetition count;
- generating a second level compressed text file from said first level compressed text file by replacing frequently occurring phrases in said first level compressed text file with a key phrase flag byte and an index byte; and
- generating a third level compressed text file from said second level compressed text file by replacing frequently occurring bytes in said second level compressed text file with a unique string of bits.
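The first two levels recited in the claim can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the flag values `RUN_FLAG` and `PHRASE_FLAG`, the `MIN_RUN` threshold, and the caller-supplied phrase table are hypothetical, and the input is assumed to be plain ASCII text so that the two flag bytes never occur as literal characters.

```python
RUN_FLAG = 0x01     # hypothetical flag byte that introduces a run
PHRASE_FLAG = 0x02  # hypothetical flag byte that introduces a key phrase
MIN_RUN = 4         # assumed threshold: a run costs 3 bytes, so shorter runs stay literal


def run_length_encode(data):
    """Level 1: replace runs with (run flag, character, repetition count)."""
    out = bytearray()
    i = 0
    while i < len(data):
        j = i
        # The count must fit in one byte, so a run is capped at 255.
        while j < len(data) and data[j] == data[i] and j - i < 255:
            j += 1
        if j - i >= MIN_RUN:
            out += bytes([RUN_FLAG, data[i], j - i])
        else:
            out += data[i:j]
        i = j
    return bytes(out)


def key_phrase_encode(data, phrases):
    """Level 2: replace each listed phrase with (phrase flag, index byte).

    `phrases` is assumed to hold at most 256 non-empty byte strings.
    """
    order = sorted(range(len(phrases)), key=lambda k: -len(phrases[k]))
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == RUN_FLAG:
            # Copy a level-1 run triple untouched so its count byte is
            # never mistaken for the start of a phrase.
            out += data[i:i + 3]
            i += 3
            continue
        for k in order:  # try longer phrases first
            if data.startswith(phrases[k], i):
                out += bytes([PHRASE_FLAG, k])
                i += len(phrases[k])
                break
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def key_phrase_decode(data, phrases):
    """Undo level 2, leaving level-1 run triples in place."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == RUN_FLAG:
            out += data[i:i + 3]
            i += 3
        elif data[i] == PHRASE_FLAG:
            out += phrases[data[i + 1]]
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def run_length_decode(data):
    """Undo level 1 by expanding each run triple."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == RUN_FLAG:
            out += bytes([data[i + 1]]) * data[i + 2]
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)
```

Because each level consumes the output of the previous one, decoding applies the inverse steps in reverse order: key phrases are expanded first, then runs.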
Abstract
A method of compressing a text file in digital form is disclosed. A full text file having characters formed into phrases is provided by an author. The characters are digitally represented by bytes. A first pass compression is sequentially followed by a second pass compression of the text which has previously been compressed. A third or fourth level compression is serially performed on the previously compressed text. For example, in a first pass, the text is run-length compressed. In a second pass, the compressed text is further compressed with key phrase compression. In a third pass, the compressed text is further compressed with Huffman compression. The compressed text is stored in a text file having a Huffman decode tree, a key phrase table, and a topic index. The data is decompressed in a single pass and provided one line at a time as an output. Sequential compressing of the text minimizes the storage space required for the file. Decompressing of the text is performed in a single pass. As a complete line is decompressed, it is output rapidly, providing full text to a user.
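The Huffman pass and the line-at-a-time output described in the abstract can be illustrated with a short sketch. It rests on assumptions of its own: the bit stream is kept as a Python string of `"0"` and `"1"` characters for readability rather than packed bits, the decode tree lives in memory rather than in the stored file, and the function names are hypothetical.

```python
import heapq
from collections import Counter


def build_tree(data):
    """Build a Huffman decode tree: a leaf is a byte value, a branch is (left, right)."""
    counts = sorted(Counter(data).items())
    if not counts:
        raise ValueError("cannot build a tree for empty input")
    # The middle element is a unique tie-breaker so nodes are never compared.
    heap = [(n, i, sym) for i, (sym, n) in enumerate(counts)]
    heapq.heapify(heap)
    tick = len(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (n1 + n2, tick, (left, right)))
        tick += 1
    tree = heap[0][2]
    # A lone symbol still needs a one-bit code, so wrap it in a branch.
    return tree if isinstance(tree, tuple) else (tree, tree)


def make_codes(tree, prefix="", table=None):
    """Map each byte value to its bit string; frequent bytes get shorter strings."""
    table = {} if table is None else table
    if isinstance(tree, tuple):
        make_codes(tree[0], prefix + "0", table)
        make_codes(tree[1], prefix + "1", table)
    else:
        table[tree] = prefix
    return table


def huffman_encode(data, codes):
    """Replace every byte with its bit string."""
    return "".join(codes[b] for b in data)


def decode_lines(bits, tree):
    """Walk the tree one bit at a time, yielding each line as soon as it is complete."""
    node = tree
    line = bytearray()
    for bit in bits:
        node = node[1] if bit == "1" else node[0]
        if not isinstance(node, tuple):  # reached a leaf
            line.append(node)
            if node == 0x0A:  # newline ends the current line
                yield bytes(line)
                line.clear()
            node = tree
    if line:
        yield bytes(line)
```

Writing the decoder as a generator mirrors the single-pass behaviour the abstract describes: a caller can display each line as it arrives without waiting for the rest of the text to be decoded.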
12 Claims
1. A method of compressing a text file stored in a computer memory in digital form, as set out in full under First Claim above.
Dependent claims: 2, 3, 4, 5
6. The method of locating and decompressing text stored in the memory of a computer comprising:
- comparing a sample string to a plurality of reference strings until a match is found;
- stepping to a topic address in memory based on which reference string matches said sample string;
- stepping to an address containing compressed text stored in memory based on an address provided by said topic address;
- retrieving compressed text from memory;
- decompressing said text from compressed form to a standard full format form ready for display on a computer monitor; and
- outputting said full format text.
Dependent claims: 7, 8, 9, 10, 11, 12
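The lookup steps of claim 6 can be sketched as below. The memory layout is entirely hypothetical: a table of 4-byte little-endian addresses (one per topic) followed by length-prefixed blocks of compressed text. For brevity the sketch uses only run-length decoding as a stand-in for the full decompression chain, with an assumed run flag of `0x01`.

```python
import struct

RUN_FLAG = 0x01  # hypothetical flag byte that introduces a run


def rle_decode(data):
    """Stand-in decompressor: expand (flag, character, count) triples."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == RUN_FLAG:
            out += bytes([data[i + 1]]) * data[i + 2]
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def build_memory(topics):
    """Lay out a hypothetical image: topic table first, then text blocks.

    `topics` maps a reference string to its already-compressed text.
    Returns (reference strings, memory image, address of the topic table).
    """
    names = list(topics)
    topic_base = 0
    memory = bytearray(4 * len(names))  # one 4-byte address slot per topic
    for i, name in enumerate(names):
        text_addr = len(memory)
        struct.pack_into("<I", memory, topic_base + 4 * i, text_addr)
        blob = topics[name]
        memory += struct.pack("<H", len(blob)) + blob
    return names, bytes(memory), topic_base


def locate_and_decompress(sample, names, memory, topic_base):
    """Follow the claimed steps; returns None when no reference string matches."""
    # Compare the sample to each reference string until a match is found.
    for i, ref in enumerate(names):
        if ref == sample:
            break
    else:
        return None
    # Step to the topic address selected by the matching string.
    topic_addr = topic_base + 4 * i
    # Step to the compressed text address stored at the topic address.
    (text_addr,) = struct.unpack_from("<I", memory, topic_addr)
    # Retrieve the compressed text.
    (length,) = struct.unpack_from("<H", memory, text_addr)
    compressed = memory[text_addr + 2:text_addr + 2 + length]
    # Decompress to full text and hand it back for output.
    return rle_decode(compressed).decode("ascii")
```

A caller would build the image once with `build_memory` and then pass a topic name to `locate_and_decompress` to obtain displayable text.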
Specification