Method and apparatus for improving a compression ratio of multiple documents by using templates
First Claim
1. A computer-implemented method comprising:
concatenating a data object in a set of data objects with a template for the set of data objects;
compressing the concatenated data object/template pair; and
determining a difference between the compressed concatenated data object/template pair and a compressed version of the template for the set of data objects.
Abstract
Example embodiments of the present invention manage a large set of records so that each record can be accessed quickly while reducing the storage capacity the records consume, by taking the specifics of the record structure into account. A template document is constructed for a large set of similar documents, such that it represents the maximum common portion of content in the document set. The template is compressed and stored. Every document in the set is then individually concatenated with the uncompressed template, and the concatenated result is compressed. The compressed template is then subtracted from the combined compressed result. The result of this subtraction is stored in the data store for each document. Effectively, only the compressed difference between each document and the template is stored, which significantly reduces the capacity needed to store the document set (e.g., by a factor of 5 or 10).
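The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes DEFLATE via the standard `zlib` module, and it assumes the "subtraction" can be realized as stripping a byte prefix, which works here only because the compressor is sync-flushed right after the template. The function names and the sample template are hypothetical.

```python
import zlib


def compress_template(template: bytes) -> bytes:
    """Compress the template alone, ending with a sync flush.

    Assumption: the sync flush byte-aligns the output so that it is an
    exact prefix of any stream that starts with the same template.
    The result is not a complete zlib stream on its own.
    """
    c = zlib.compressobj()
    return c.compress(template) + c.flush(zlib.Z_SYNC_FLUSH)


def compress_pair(template: bytes, document: bytes) -> bytes:
    """Compress the template followed by the document as one stream."""
    c = zlib.compressobj()
    head = c.compress(template) + c.flush(zlib.Z_SYNC_FLUSH)
    tail = c.compress(document) + c.flush()  # default flush finishes the stream
    return head + tail


def compressed_difference(template: bytes, document: bytes) -> bytes:
    """Return the per-document bytes left after removing the template part."""
    full = compress_pair(template, document)
    head = compress_template(template)
    if not full.startswith(head):
        raise ValueError("compressed template is not a prefix of the pair")
    return full[len(head):]


def restore(template: bytes, difference: bytes) -> bytes:
    """Rebuild the document from the stored template and difference."""
    full = compress_template(template) + difference
    return zlib.decompress(full)[len(template):]


# Hypothetical template and document, for illustration only.
TEMPLATE = (
    b"Dear customer, your monthly statement is now available. "
    b"Account number: . Billing period: . Amount due: . "
    b"Please pay before the due date to avoid late fees. "
    b"Questions? Contact support at the number printed on your card.\n"
)
DOCUMENT = (
    TEMPLATE.replace(b"Account number: .", b"Account number: 12345.")
    .replace(b"Billing period: .", b"Billing period: March.")
    .replace(b"Amount due: .", b"Amount due: 42.50.")
)

DIFFERENCE = compressed_difference(TEMPLATE, DOCUMENT)
```

In this sketch the compressed template is stored once and only `DIFFERENCE` is stored per document. Because DEFLATE looks back at most 32 KiB, a template larger than that window would likely yield less benefit under these assumptions.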
20 Claims
1. A computer-implemented method comprising:
concatenating a data object in a set of data objects with a template for the set of data objects;
compressing the concatenated data object/template pair; and
determining a difference between the compressed concatenated data object/template pair and a compressed version of the template for the set of data objects.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8)
9. A system comprising:
a data store; and
a computer including memory storing computer-executable logic that, when executed by the computer, causes the computer to perform the operations of:
concatenating a data object in a set of data objects with a template for the set of data objects;
compressing the concatenated data object/template pair; and
determining a difference between the compressed concatenated data object/template pair and a compressed version of the template for the set of data objects.
(Dependent claims: 10, 11, 12, 13, 14, 15, 16)
17. A computer program product including a non-transitory computer-readable storage medium encoded with computer program code that, when executed on a processor of a computer, causes the computer to perform template based compression, the computer program product comprising:
computer program code for concatenating a data object in a set of data objects with a template for the set of data objects;
computer program code for compressing the concatenated data object/template pair; and
computer program code for determining a difference between the compressed concatenated data object/template pair and a compressed version of the template for the set of data objects.
(Dependent claims: 18, 19, 20)
Specification