Data compression system based on tree models
First Claim
1. A method for encoding a sequence into a concatenated string, comprising:
- building a suffix tree of the sequence in reverse order;
pruning the suffix tree to form a generalized context tree (GCT) having a plurality of states;
obtaining a binary representation of a full tree derived from the GCT;
encoding the sequence into a binary string using a dynamic tree model based on statistics collected at the plurality of states of the GCT; and
concatenating the binary representation of the full tree with the binary string to form the concatenated string.
1 Assignment
0 Petitions
Accused Products
Abstract
A method for encoding and decoding a sequence is provided. The method comprises searching a set of candidate trees varying in size for a tree T having a plurality of states. Tree T provides a structure that relatively minimizes code length of the sequence from among all the candidate trees. The method further comprises encoding data conditioned on the tree T, which may be a generalized context tree (GCT), using a sequential probability assignment conditioned on the states of the tree T. This encoding may use finite state machine (FSM) closure of the tree. Also provided are methods for decoding an encoded binary string when the encoded string includes a full tree or generalized context tree, as well as decoding an encoded string using incomplete FSM closure, incremental FSM, and suffix tree construction concepts.
-
Citations
30 Claims
-
1. A method for encoding a sequence into a concatenated string, comprising:
-
building a suffix tree of the sequence in reverse order;
pruning the suffix tree to form a generalized context tree (GCT) having a plurality of states;
obtaining a binary representation of a full tree derived from the GCT;
encoding the sequence into a binary string using a dynamic tree model based on statistics collected at the plurality of states of the GCT; and
concatenating the binary representation of the full tree with the binary string to form the concatenated string. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
-
-
9. A method for encoding a sequence into a concatenated string, comprising:
-
building a suffix tree of the sequence in reverse order;
pruning the suffix tree to form a generalized context tree (GCT) having a plurality of states;
building a finite state machine (FSM) closure of the GCT to form an FSM closed GCT;
obtaining a binary representation of a full tree derived from the GCT;
encoding the sequence into a binary string using a dynamic tree model based on statistics collected at the plurality of states of the GCT;
transitioning to a next state of the GCT with the FSM closed GCT; and
concatenating the binary representation of the full representation of the full tree with the binary string to form the concatenated string. - View Dependent Claims (10, 11, 12, 13, 14, 15)
-
-
16. A method for decoding a binary string comprising a binary representation of a full tree having a plurality of states associated therewith and an encoded string produced by a corresponding encoder using a dynamic tree model based on the full tree, comprising:
-
building a finite state machine (FSM) closure of the full tree;
iteratively decoding at least one symbol using the dynamic tree model of the corresponding encoder based on statistics collected at the plurality of states of the full tree; and
transitioning to a next state of the full tree using the FSM closed full tree. - View Dependent Claims (17, 18)
-
-
19. A method for decoding a binary string, the binary string comprising a binary representation of a generalized context tree (GCT) and an encoded string produced by a corresponding encoder using a dynamic tree model based on the GCT, the GCT having a plurality of states associated therewith, comprising:
-
building a decoding GCT based on the binary representation of the GCT;
building a finite state machine (FSM) closure of the decoding GCT;
iteratively decoding at least one symbol using the dynamic tree model of the corresponding encoder based on statistics collected at the plurality of states of the decoding GCT; and
transitioning to a next state of the decoding GCT using the FSM closed decoding GCT. - View Dependent Claims (20)
-
-
21. A method for decoding a binary string, said binary string comprising a binary representation of a full tree and an encoded string produced by a corresponding encoder using a dynamic tree model based on the full tree, the full tree having a plurality of states associated therewith, comprising:
-
building a decoding full tree based on the binary representation of the full tree;
creating a reduced generalized context tree (GCT) and mapping the reduced GCT to the decoding full tree;
building a finite state machine (FSM) closure of the reduced GCT;
iteratively decoding at least one symbol using the dynamic tree model of the corresponding encoder based on statistics collected at the plurality of states of the decoding full tree; and
transitioning to a next state of the decoding full tree using the FSM closed reduced GCT. - View Dependent Claims (22, 23, 24)
-
-
25. A method for decoding a binary string, said binary string comprising a binary representation of a full tree and an encoded string produced by a corresponding encoder using a dynamic tree model based on the full tree, the full tree having a plurality of states associated therewith, comprising:
-
building a decoding full tree based on the binary representation of the full tree;
creating a reduced generalized context tree (GCT);
building a finite state machine (FSM) closure of the reduced GCT;
iteratively decoding at least one symbol using the dynamic tree model of the corresponding encoder based on statistics collected at the plurality of states of the decoding full tree;
transitioning to a next state of the decoding full tree using the FSM closed GCT; and
adding encountered states from the decoding full tree and suffixes thereof to the FSM closure of the reduced GCT. - View Dependent Claims (26, 27, 28, 29)
-
-
30. A method for decoding a binary string ty, formed by concatenating binary strings t and y into a resultant string x, said binary string ty comprising a binary representation t of a tree T and an encoded string y produced by a corresponding encoder using a dynamic tree model based on the tree T, comprising:
-
building tree T based on the binary representation t;
setting the resultant string x to an empty string;
iteratively decoding at least one symbol using the dynamic tree model of the corresponding encoder based on statistics collected at a state given by a longest ancestor of reversed resultant string x originally in T;
filling the resultant string x with decoded symbols; and
inserting the reversed resultant string x in the tree T.
-
Specification