Rollover strategies in a N-bit dictionary compressed column store
First Claim
1. A method, comprising:
receiving a new value for addition to a compressed column store, the compressed column store including a plurality of tokens, each token corresponding to a value in a data dictionary, and being associated with a row identifier (RID) in the compressed column store;
determining an insertion block of the compressed column store, the insertion block being a physical or virtual memory block where new tokens are inserted;
determining that a current memory block of a most recently added token to the compressed column store is the insertion block, the compressed column store including one or more memory blocks each with a maximum token value that indicates a storage capacity for tokens within a respective memory block based on an encoding of the respective memory block;
determining that the maximum token value has been reached for the current memory block based on type of encoding of the current memory block;
creating a new virtual memory block using the current memory block, wherein the new virtual memory block has an encoding greater than the encoding of the current memory block, wherein the new virtual memory is designated as the insertion block, and wherein the tokens of the current memory block remain in the current memory block while new tokens are stored in the virtual memory block; and
storing a token corresponding to the new value in the new virtual memory block, wherein the token corresponding to the new value in the new virtual memory block is accessed in a same manner as existing tokens in the current memory block.
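The rollover steps recited above can be illustrated with a minimal sketch. It is not taken from the specification: the class names `Block` and `CompressedColumn`, the `widths` encoding ladder, and the use of plain Python lists in place of bit-packed storage are all assumptions made for illustration. The sketch also assumes that "maximum token value reached" means the next token no longer fits in the insertion block's bit width.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical memory block holding tokens at one fixed bit width."""
    bits: int        # encoding width of this block
    first_rid: int   # RID of the first token stored in this block
    tokens: list = field(default_factory=list)

    @property
    def max_token(self) -> int:
        # Largest token value representable with this block's encoding.
        return (1 << self.bits) - 1


class CompressedColumn:
    """Illustrative n-bit dictionary column; not the patented implementation."""

    def __init__(self, widths=(8, 16, 32)):
        self.widths = widths     # assumed ladder of available encodings
        self.dictionary = {}     # value -> token
        self.values = []         # token -> value
        self.blocks = [Block(bits=widths[0], first_rid=0)]
        self.row_count = 0

    @property
    def insertion_block(self) -> Block:
        # New tokens always go to the most recently created block.
        return self.blocks[-1]

    def _token_for(self, value) -> int:
        token = self.dictionary.get(value)
        if token is None:
            token = len(self.values)
            self.dictionary[value] = token
            self.values.append(value)
        return token

    def append(self, value) -> int:
        token = self._token_for(value)
        block = self.insertion_block
        if token > block.max_token:
            # Rollover: open a block with a wider encoding. Existing tokens
            # stay where they are; only new tokens use the wider block.
            wider = next((w for w in self.widths if (1 << w) - 1 >= token), None)
            if wider is None:
                raise OverflowError("no encoding wide enough for this token")
            block = Block(bits=wider, first_rid=self.row_count)
            self.blocks.append(block)
        block.tokens.append(token)
        rid = self.row_count
        self.row_count += 1
        return rid

    def get(self, rid: int):
        # Lookup by RID works the same way whichever block holds the token.
        if not 0 <= rid < self.row_count:
            raise IndexError(rid)
        for block in reversed(self.blocks):
            if rid >= block.first_rid:
                return self.values[block.tokens[rid - block.first_rid]]
```

With a deliberately tiny ladder such as `widths=(1, 2, 4)`, appending the values a, b, c, a, d, e would place the first two tokens in a 1-bit block, roll over to a 2-bit block when the third distinct value arrives, and roll over again to a 4-bit block at the fifth distinct value, without rewriting any earlier block.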
Abstract
Disclosed herein are system, method, and computer program product embodiments for rollover strategies in an n-bit dictionary compressed column store. An embodiment operates by receiving a new value for addition to a compressed column store and determining that a current memory block of a most recently added token to the compressed column store is the insertion block. It is then determined that the maximum token value has been reached for the current memory block. A new virtual memory block is created using the current insertion block, and a token corresponding to the new value is stored in the new virtual memory block. In another embodiment, when it is determined that a maximum number of token values that may be stored in a compressed column store has been reached for a data dictionary, the compressed column store is converted into a composite store that includes a flat store, in which the new value is stored.
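The second embodiment in the abstract, conversion to a composite store, can be sketched as follows. This is a rough illustration under stated assumptions rather than the disclosed design: `CompositeColumn` and its `max_tokens` parameter are hypothetical names, and the sketch assumes that once the conversion happens every later row goes to the flat (uncompressed) store, which the abstract does not specify.

```python
class CompositeColumn:
    """Illustrative composite store: a token-encoded part plus a flat part."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens   # assumed dictionary capacity
        self.dictionary = {}           # value -> token
        self.values = []               # token -> value
        self.tokens = []               # compressed part, one token per row
        self.flat = None               # flat store, created on conversion

    @property
    def is_composite(self) -> bool:
        return self.flat is not None

    def append(self, value) -> int:
        if self.flat is None:
            token = self.dictionary.get(value)
            if token is None and len(self.values) >= self.max_tokens:
                # Dictionary is full: convert to a composite store and
                # fall through to store the raw value in the flat part.
                self.flat = []
            else:
                if token is None:
                    token = len(self.values)
                    self.dictionary[value] = token
                    self.values.append(value)
                self.tokens.append(token)
                return len(self.tokens) - 1
        # Assumption: after conversion, all new rows are stored uncompressed.
        self.flat.append(value)
        return len(self.tokens) + len(self.flat) - 1

    def get(self, rid: int):
        if 0 <= rid < len(self.tokens):
            return self.values[self.tokens[rid]]
        offset = rid - len(self.tokens)
        if self.flat is None or not 0 <= offset < len(self.flat):
            raise IndexError(rid)
        return self.flat[offset]
```

Rows written before the conversion keep their tokens, so reads by RID behave the same on both sides of the boundary; only the storage format of later rows changes.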
34 Citations
17 Claims
1. A method, comprising:
receiving a new value for addition to a compressed column store, the compressed column store including a plurality of tokens, each token corresponding to a value in a data dictionary, and being associated with a row identifier (RID) in the compressed column store;
determining an insertion block of the compressed column store, the insertion block being a physical or virtual memory block where new tokens are inserted;
determining that a current memory block of a most recently added token to the compressed column store is the insertion block, the compressed column store including one or more memory blocks each with a maximum token value that indicates a storage capacity for tokens within a respective memory block based on an encoding of the respective memory block;
determining that the maximum token value has been reached for the current memory block based on type of encoding of the current memory block;
creating a new virtual memory block using the current memory block, wherein the new virtual memory block has an encoding greater than the encoding of the current memory block, wherein the new virtual memory is designated as the insertion block, and wherein the tokens of the current memory block remain in the current memory block while new tokens are stored in the virtual memory block; and
storing a token corresponding to the new value in the new virtual memory block, wherein the token corresponding to the new value in the new virtual memory block is accessed in a same manner as existing tokens in the current memory block.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A system comprising:
a processor; and
a tangible memory communicatively coupled to the processor including instructions thereon that when executed by the processor cause the processor to:
receive a new value for addition to a compressed column store, the compressed column store including a plurality of tokens, each token corresponding to a value in a data dictionary, and being associated with a row identifier (RID) in the compressed column store,
determine an insertion block of the compressed column store, the insertion block being a physical or virtual memory block where new tokens are inserted,
determine that a current memory block of a most recently added token to the compressed column store is the insertion block, the compressed column store including one or more memory blocks each with a maximum token value that indicates a storage capacity for tokens within a respective memory block based on an encoding of the respective memory block,
determine that the maximum token value has been reached for the current memory block based on type of encoding of the current memory block,
create a new virtual memory block using the current memory block, wherein the new virtual memory block has an encoding greater than the encoding of the current memory block, wherein the new virtual memory is designated as the insertion block, and wherein the tokens of the current memory block remain in the current memory block while new tokens are stored in the virtual memory block, and
store a token corresponding to the new value in the new virtual memory block, wherein the token corresponding to the new value in the new virtual memory block is accessed in a same manner as existing tokens in the current memory block.
Dependent claims: 12, 13, 14, 15, 16, 17
Specification