Method and system for database storage management
First Claim
1. A computer-implemented method for performing database storage management, comprising:
encoding a first sequence of information comprising a compression of at least one successive repetition of values in the first sequence of information;
calculating a maximum number of child nodes based on a number of groups to group the encoded first sequence of information;
grouping the encoded first sequence of information into sets of leaf nodes, each set having a number of leaf nodes that are no more than the calculated maximum number of child nodes, wherein each leaf node includes an identification of a repeated value and a count of the repetition of the repeated value; and
generating at least one hierarchical node from the sets of leaf nodes wherein each of the hierarchical node at each level of hierarchy has a number of child nodes that are no more than the calculated maximum number of child nodes and maintains a sum of the count of the repetition of the repeated value of each descendant leaf node of the hierarchical node; and
generating a count index for the encoded first sequence of information based on the sets of leaf nodes and the hierarchical nodes by;
for each piece of information in the encoded sequence of information;
determining a first index location based on the number of groups, where the index location corresponds to a first hierarchical node;
setting the first hierarchical node as the parent node;
creating a pointer defining a relationship between the parent node and at least one leaf node, where the number of leaf nodes corresponds to the maximum number of child nodes for the first hierarchical node, thereby creating a subgroup of related leaf nodes within the sets of leaf nodes; and
updating the first index location to correspond to the next hierarchical node in the at least one hierarchical node in the set of leaf nodes; and
recursively generating the count index for each subgroup of related leaf nodes.
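As a rough illustration of the encoding, grouping, and hierarchy-building steps recited above, the following Python sketch run-length encodes a sequence, turns each run into a leaf holding a value and a count, and builds parent nodes that each keep the sum of their descendants' counts. The names `Node`, `rle_encode`, and `build_count_index` are hypothetical, and `max_children` is simply passed in as a parameter; how the claimed method calculates that maximum from the number of groups is not reproduced here.

```python
from dataclasses import dataclass, field
from itertools import groupby
from typing import List, Optional, Tuple


@dataclass
class Node:
    count: int                       # total repetitions covered by this node
    value: Optional[object] = None   # the repeated value (leaf nodes only)
    children: List["Node"] = field(default_factory=list)


def rle_encode(seq) -> List[Tuple[object, int]]:
    """Collapse each run of equal adjacent values into (value, run_length)."""
    return [(v, sum(1 for _ in group)) for v, group in groupby(seq)]


def build_count_index(runs, max_children: int) -> Node:
    """Build the hierarchy bottom-up with at most max_children per node.

    max_children is an assumed input; the sketch does not derive it.
    """
    if max_children < 2:
        raise ValueError("max_children must be at least 2")
    if not runs:
        raise ValueError("cannot index an empty sequence")
    # One leaf per run: the repeated value and its repetition count.
    level = [Node(count=c, value=v) for v, c in runs]
    # Group nodes into parents until a single root remains; each parent
    # stores the sum of the counts of the nodes grouped beneath it.
    while len(level) > 1:
        level = [
            Node(count=sum(n.count for n in level[i:i + max_children]),
                 children=level[i:i + max_children])
            for i in range(0, len(level), max_children)
        ]
    return level[0]


runs = rle_encode("aaabbbbccd")          # four runs: a x3, b x4, c x2, d x1
root = build_count_index(runs, max_children=2)
```

With `max_children=2`, the four leaves are paired under two parents whose counts are 7 and 3, and the root's count equals the length of the original sequence.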
Abstract
Embodiments of the present invention relate to run-length encoded sequences and to supporting efficient offset-based updates of values while allowing fast lookups. In an embodiment of the present invention, an indexing scheme is disclosed, herein called count indexes, that supports O(log n) offset-based updates and lookups on a run-length sequence with n runs. In an embodiment, count indexes of the present invention support O(log n) updates on bitmapped sequences of size n. Embodiments of the present invention can be generalized to block-oriented storage systems.
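To make the logarithmic bound concrete, here is a minimal Python sketch of an offset-based lookup and one simple kind of update on a tree of run counts. All names (`Node`, `build`, `lookup`, `extend_run`) are hypothetical and the fixed `fanout` is an assumption. The update shown only lengthens an existing run; splitting runs, merging runs, and rebalancing, which a full scheme would need, are deliberately left out.

```python
class Node:
    def __init__(self, count, value=None, children=None):
        self.count = count              # positions covered by this subtree
        self.value = value              # run value (leaf nodes only)
        self.children = children or []  # empty for leaf nodes


def build(runs, fanout=2):
    """Build a tree over (value, run_length) pairs, bottom-up."""
    level = [Node(c, value=v) for v, c in runs]
    while len(level) > 1:
        level = [Node(sum(n.count for n in level[i:i + fanout]),
                      children=level[i:i + fanout])
                 for i in range(0, len(level), fanout)]
    return level[0]


def lookup(root, offset):
    """Return the value stored at a zero-based offset of the decoded sequence."""
    if not 0 <= offset < root.count:
        raise IndexError("offset out of range")
    node = root
    while node.children:
        for child in node.children:
            if offset < child.count:    # target lies inside this child
                node = child
                break
            offset -= child.count       # skip the positions this child covers
    return node.value


def extend_run(root, offset, extra):
    """Lengthen the run containing `offset` by `extra` positions.

    Only the counts on the root-to-leaf path change, so the work is
    proportional to the tree height times the fanout.
    """
    if not 0 <= offset < root.count:
        raise IndexError("offset out of range")
    node = root
    node.count += extra
    while node.children:
        for child in node.children:
            if offset < child.count:
                node = child
                break
            offset -= child.count
        node.count += extra             # update the child just descended into


root = build([("a", 3), ("b", 4), ("c", 2), ("d", 1)])
value_before = lookup(root, 7)          # position 7 falls in the "c" run
extend_run(root, 4, 2)                  # the "b" run grows from 4 to 6
value_after = lookup(root, 8)           # position 8 now falls in the "b" run
```

Each step of the descent subtracts the counts of the siblings it skips, so a lookup or update touches one node per level rather than scanning every run.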
20 Claims
11. A non-transitory computer-readable medium including instructions that, when executed by a processing unit, cause the processing unit to perform database storage management, by performing the steps of:
encoding a first sequence of information comprising a compression of at least one successive repetition of values in the first sequence of information;
calculating a maximum number of child nodes based on a number of groups to group the encoded first sequence of information;
grouping the encoded first sequence of information into sets of leaf nodes, each set having a number of leaf nodes that are no more than the calculated maximum number of child nodes, wherein each leaf node includes an identification of a repeated value and a count of the repetition of the repeated value; and
generating at least one hierarchical node from the sets of leaf nodes wherein each of the hierarchical node at each level of hierarchy has a number of child nodes that are no more than the calculated maximum number of child nodes and maintains a sum of the count of the repetition of the repeated value of each descendant leaf node of the hierarchical node; and
generating a count index for the encoded first sequence of information based on the sets of leaf nodes and the hierarchical nodes by;
for each piece of information in the encoded sequence of information;
determining a first index location based on the number of groups, where the index location corresponds to a first hierarchical node;
setting the first hierarchical node as the parent node;
creating a pointer defining a relationship between the parent node and at least one leaf node, where the number of leaf nodes corresponds to the maximum number of child nodes for the first hierarchical node, thereby creating a subgroup of related leaf nodes within the sets of leaf nodes; and
updating the first index location to correspond to the next hierarchical node in the at least one hierarchical node in the set of leaf nodes; and
recursively generating the count index for each subgroup of related leaf nodes.
20. A computing device comprising:
a data bus;
a memory unit coupled to the data bus;
a processing unit coupled to the data bus and directed to;
encode a first sequence of information comprising a compression of at least one successive repetition of values in the first sequence of information;
calculate a maximum number of child nodes based on a number of groups to group the encoded first sequence of information;
group the encoded first sequence of information into sets of leaf nodes, each set having a number of leaf nodes that are no more than the calculated maximum number of child nodes, wherein each leaf node includes an identification of a repeated value and a count of the repetition of the repeated value; and
generate at least one hierarchical node from the sets of leaf nodes wherein each of the hierarchical node at each level of hierarchy has a number of child nodes that are no more than the calculated maximum number of child nodes and maintains a sum of the count of the repetition of the repeated value of each descendant leaf node of the hierarchical node; and
generate a count index for the encoded first sequence of information based on the sets of leaf nodes and the hierarchical nodes by;
for each piece of information in the encoded sequence of information;
determining a first index location based on the number of groups, where the index location corresponds to a first hierarchical node;
setting the first hierarchical node as the parent node;
creating a pointer defining a relationship between the parent node and at least one leaf node, where the number of leaf nodes corresponds to the maximum number of child nodes for the first hierarchical node, thereby creating a subgroup of related leaf nodes within the sets of leaf nodes; and
updating the first index location to correspond to the next hierarchical node in the at least one hierarchical node in the set of leaf nodes; and
recursively generating the count index for each subgroup of related leaf nodes.
Specification