Value-instance-connectivity computer-implemented database
First Claim
1. A system for storing a plurality of tuples, each tuple comprising at least a first attribute having a first attribute value and a second attribute having a second attribute value, the system comprising:
- a. a value store storing, for each of the plurality of tuples, a derived value derived from the tuple'"'"'s first attribute value and the tuple'"'"'s second attribute value;
b. an instance store identifying instances of the derived values in the value store associated with each tuple;
c. a cardinality store storing information representing frequencies of occurrence of instances of equal value, wherein a particular derived value in the value store associated with a particular instance in the instance store can be derived using the cardinality store.
1 Assignment
0 Petitions
Accused Products
Abstract
A computer-implemented database and method providing an efficient, ordered reduced space representation of multi-dimensional data. The data values for each attribute are stored in a manner that provides an advantage in, for example, space usage and/or speed of access, such as in condensed form and/or sort order. Instances of each data value for an attribute are identified by instance elements, each of which is associated with one data value. Connectivity information is provided for each instance element that uniquely associates each instance element with a specific instance of a data value for another attribute. Low cardinality fields may be combined into a single field having values representing the various combinations of the original fields. In one embodiment, the “combined field” contains only instantiated combinations. In another embodiment, the combined field contains all values in the Cartesian product of the original fields, preferably in nested sort order. In yet another embodiment, the original fields are padded with dummy values so that their cardinalities are a power of two, causing each subfield in the combined field to fall on a bit boundary. In still another embodiment, containerization techniques are used to reduce the space required for representing the complete set of all possible values in the Cartesian product. of the original fields.
85 Citations
32 Claims
-
1. A system for storing a plurality of tuples, each tuple comprising at least a first attribute having a first attribute value and a second attribute having a second attribute value, the system comprising:
-
a. a value store storing, for each of the plurality of tuples, a derived value derived from the tuple'"'"'s first attribute value and the tuple'"'"'s second attribute value;
b. an instance store identifying instances of the derived values in the value store associated with each tuple;
c. a cardinality store storing information representing frequencies of occurrence of instances of equal value, wherein a particular derived value in the value store associated with a particular instance in the instance store can be derived using the cardinality store. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)
one or more containers each storing information representing frequencies of occurrence of instances of particular derived values; - and
information regarding whether there are any instances of any of the derived values in at least one of the containers.
-
-
14. The system of claim 7 wherein the value store is implicit in and can be derived from the cardinality store.
-
15. The system of claim 7 wherein two or more of the derived values that are not derived from one or more dummy values form a regular sequence and further comprising a map mapping the derived values to an index in the regular sequence.
-
16. The system of claim 15 wherein the derived values comprise S bits of storage comprising Y bits of year information, M bits of month information, and D bits of day of the month information;
- and further comprising a map mapping the derived values in S bit storage to the number of days from a specific day in a time line.
-
17. A method for storing a plurality of tuples, each tuple comprising at least a first attribute having a first attribute value and a second attribute having a second attribute value, the method comprising the steps of:
-
a. for each of the plurality of tuples, deriving a derived value from the tuple'"'"'s first attribute value and the tuple'"'"'s second attribute value and storing the derived value in a value store;
b. storing in an instance store information identifying instances of the derived values in the value store associated with each tuple;
c. storing in a cardinality store information representing frequencies of occurrence of instances of equal value, wherein a particular derived value in the value store associated with a particular instance in the instance store can be derived using the cardinality store. - View Dependent Claims (18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32)
storing the values of the first and second attributes of the plurality of tuples in the value store, associating an index with each of the values of the first and second attributes, and wherein, for one or more derived values, the step of deriving comprises concatenating a first index associated with the derived value'"'"'s first attribute and a second index associated with the derived value'"'"'s second attribute. -
21. The method of claim 20 wherein the index associated with each of the values is derived from the position of the value in the value store.
-
22. The method of claim 20 wherein the index associated with the each of the values is a hash value.
-
23. The method of claim 18 wherein (i) the first attribute has a cardinality equal to the number of unique first attribute values and (ii) the second attribute has a cardinality equal to the number of unique second attribute values and further comprising the steps of
storing zero or more first dummy values associated with the first attribute such that the total cardinality of the first attribute and dummy values is a power of two, storing zero or more second dummy values associated with the second attribute such that the total cardinality of the second attribute is a power of two, and storing in the value store a derived value, for each combination of a first dummy value and every second attribute value, including second dummy values, derived from the first dummy value and the second attribute value. -
24. The method of claim 23 wherein, for one or more derived values, the step of deriving comprises concatenating the first attribute value and the second attribute value.
-
25. The method of claim 23 further comprising the steps of
storing the values, including dummy values, of the first and second attributes of the plurality of tuples in the value store, associating an index with each of the values of the first and second attributes, and wherein, for one or more derived values, the step of deriving comprises concatenating a first index associated with the derived value'"'"'s first attribute and a second index associated with the derived value'"'"'s second attribute. -
26. The method of claim 25 wherein the index associated with each of the values is derived from the position of the value in the value store.
-
27. The method of claim 25 wherein the index associated with each of the values is a hash value.
-
28. The method of claim 26 wherein the first attribute values have positions in the value store that reflect a sort ordering of the first attribute values and the second attribute values have positions in the value store that reflect a sort ordering of the second attribute values.
-
29. The method of claim 23 wherein the step of storing information in a cardinality store further comprises the steps of
storing information representing frequencies of occurrence of instances of particular derived values in one or more containers; - and
storing information regarding whether there are any instances of any of the derived values in at least one of the containers.
- and
-
30. The method of claim 23 wherein the value store is implicit in and can be derived from the cardinality store and the step of storing information in the value store and the step of storing information in the cardinality store are the same step.
-
31. The method of claim 23 wherein two or more of the derived values that are not derived from one or more dummy values form a regular sequence and further comprising the step of mapping the derived values to an index in the regular sequence.
-
32. The method of claim 31 wherein the derived values comprise S bits of storage comprising Y bits of year information, M bits of month information, and D bits of day of the month information;
- and wherein the step of mapping further comprises mapping the derived values in S bit storage to the number of days from a specific day in a time line.
-
Specification