Dynamic model selection during data compression
First Claim
1. A computerized method of maximizing the compression of a stream of input symbols by dynamically selecting the best of a plurality of data compression models for respectively encoding each of successive portions of said stream, wherein each model, in response to input symbols, generates encoded data, based on model characteristics, in the form of respective strings of data, comprising the steps of:
- beginning with the same input symbol in said stream, using each model to generate respective strings of data from the stream of input symbols;
compression coding said strings of data to produce blocks of compressed data, each of said blocks being at least equal to a given size;
selecting the respective block of compressed data for which the most input symbols have been compressed;
adding the selected block of compressed data to an output data stream; and
beginning with the input symbol following the last symbol input to said selected model, reiterating the foregoing steps on the next succeeding portion of said input symbol stream.
1 Assignment
0 Petitions
Accused Products
Abstract
A system and method for maximizing data compression by optimizing model selection during coding of an input stream of data symbols, wherein at least two models are run and compared, and the model with the best coding performance for a given-size segment or block of compressed data is selected such that only its block is used in an output data stream. The best performance is determined by 1) respectively producing comparable-size blocks of compressed data from the input stream with the use of the two, or more, models and 2) selecting the model which compresses the most input data. In the preferred embodiment, respective strings of data are produced with each model from the symbol data and are coded with an adaptive arithmetic coder into the compressed data. Each block of compressed data is started by coding the decision to use the model currently being run and all models start with the arithmetic coder parameters established at the end of the preceding block. Only the compressed code stream of the best model is used in the output and that code stream has in it the overhead for selection of that model. Since the decision as to which model to run is made in the compressed data domain, i.e., the best model is chosen on the basis of which model coded the most input symbols for a given-size compressed block, rather than after coding a given number of input symbols, the model selection decision overhead scales with the compressed data. Successively selected compressed blocks are combined as an output code stream to produce an optimum output of compressed data, from input symbols, for storage or transmission.
-
Citations
20 Claims
-
1. A computerized method of maximizing the compression of a stream of input symbols by dynamically selecting the best of a plurality of data compression models for respectively encoding each of successive portions of said stream, wherein each model, in response to input symbols, generates encoded data, based on model characteristics, in the form of respective strings of data, comprising the steps of:
-
beginning with the same input symbol in said stream, using each model to generate respective strings of data from the stream of input symbols; compression coding said strings of data to produce blocks of compressed data, each of said blocks being at least equal to a given size; selecting the respective block of compressed data for which the most input symbols have been compressed; adding the selected block of compressed data to an output data stream; and beginning with the input symbol following the last symbol input to said selected model, reiterating the foregoing steps on the next succeeding portion of said input symbol stream. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
-
-
11. A computerized method of selecting one of a plurality of data compression models for compressing data symbols in a stream of data symbols to maximize compression of the stream, comprising the steps of:
-
using each of said plurality of models to form a respective block of compressed data from data symbols input from the stream, said forming beginning with the same data symbol in said stream and ending with said respective blocks of compressed data being of comparable size; and selecting the model which produces the block, from among said respective blocks, for which the most input symbols have been compressed. - View Dependent Claims (12, 13)
-
-
14. A computerized method of dynamically selecting the best of a plurality of data compression models for use in arithmetic coding and decoding wherein each model generates a string of data in response to a stream of input symbols, comprising the steps of:
-
beginning with the same symbol in said input symbol stream, using each of said models to produce respective strings of data from said input symbols including a model selection decision for the respective model; coding the model selection decision and the string of data for each model with an adaptive arithmetic coder to generate respective blocks of compressed data for each model, with each of said blocks being of at least a given size and starting the coding of each model for a given block with the same code stream and probability interval; and selecting the model with the best coding performance from among the blocks of compressed data generated during the arithmetic coding on the basis of the amount of input symbol data compressed.
-
-
15. A computerized system for maximizing the compression of a stream of input symbols by dynamically selecting the best of a plurality of data compression models for respectively encoding each of successive portions of said stream, wherein each model, in response to input symbols, generates encoded data, based on model characteristics, in the form of respective strings of data, comprising:
-
means, beginning with the same input symbol in said stream, for generating respective strings of data from the stream of input symbols using each of said data compression models; means for compression coding said strings of data to produce respective blocks of compressed data, each of said blocks being at least equal to a given size; means for selecting the respective block of compressed data for which the most input symbols have been compressed; means for adding the selected block of compressed data to an output data stream; and means for actuating said foregoing means to act on the next succeeding portion of said input symbol stream, beginning with the input symbol following the last symbol input to said selected model, until all of the portions of said input stream have been compressed and added to said output data stream. - View Dependent Claims (16, 17, 18)
-
-
19. A computerized system for selecting one of a plurality of data compression models for compressing data symbols in a stream of data symbols to maximize compression of the stream, comprising:
-
means, using each of said plurality of models, for forming a respective block of compressed data from data symbols input from the stream, said forming beginning with the same data symbol in said stream and ending with said respective blocks of compressed data being of comparable size; and means for selecting the block, from among said respective blocks, for which the most input symbols have been compressed.
-
-
20. A computerized system for dynamically selecting one of a plurality of data compression models for use in arithmetic coding and decoding wherein each model generates a string of data in response to a stream of input symbols, comprising the steps of:
-
means, beginning with the same symbol in said input symbol stream, for producing respective strings of data from said input symbols using each of said models and including a model selection decision for the respective model; adaptive arithmetic coder means for coding the model selection decision and the string of data for each model to generate respective blocks of compressed data for each model, with each of said blocks being of at least a given size and starting the coding of each model for a given block with the same code stream and probability interval; and means for selecting the model with the optimum coding performance from among the blocks of compressed data generated during the arithmetic coding on the basis of the amount of input symbol data compressed.
-
Specification