EFFICIENT AND SCALABLE DATA EVOLUTION WITH COLUMN ORIENTED DATABASES
Abstract
A method, system, and program product for data evolution in column-oriented databases are disclosed. For an input evolution operation, reusable and non-reusable attributes are identified. For attributes in a target schema that cannot be reused from the source schema, data and bitmap indexes are generated from the source data and bitmap indexes. A decompose operation is disclosed for decomposing a table into two tables. A first merge operation is disclosed in which only one input table can be reused in the merged table. A second merge operation is disclosed in which neither input table can be reused.
22 Claims

1. A method comprising:
decomposing a first table into a second table and a third table, the step of decomposing including:
re-using attributes of the first table in the third table;
generating data and bitmap indexes for the second table, the step of generating including:
locating a first tuple position in the first table for each of a plurality of unique values of a join attribute;
forming a filtering vector from a plurality of located positions of the unique values of the join attribute;
generating at least one of the data and bitmap indexes for the second table for a first attribute in the second table corresponding to a first attribute in the first table, and performing a target bitmap index generation step, the target bitmap index generation step including:
computing a target bitmap index for each unique value of the first attribute in the second table as a corresponding source bitmap index filtered by the filter vector; and
generating target data for the first attribute in the second table; and
outputting the first table, the second table, and the third table, after decomposing the first table into the second table and the third table, in a visual form including at least one of displaying said tables on a display unit or printing said tables on a printer.
Dependent claims: 2, 3, 4, 5, 6
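
The decompose steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the bitmap representation (plain lists of 0/1), the function names, and the sample employee/address data are all assumptions made for illustration, and the sketch assumes the join attribute functionally determines the generated attribute.

```python
# Illustrative sketch only: names, data, and the list-of-bits bitmap
# representation are assumptions, not taken from the patent.

def build_bitmap_index(column):
    """Map each unique value to a 0/1 bitmap over tuple positions."""
    index = {}
    for pos, value in enumerate(column):
        index.setdefault(value, [0] * len(column))[pos] = 1
    return index

def first_positions(join_index):
    """Locate the first tuple position of each unique join-attribute value.
    The sorted positions act as the filtering vector."""
    return sorted(bitmap.index(1) for bitmap in join_index.values())

def filter_index(source_index, positions):
    """Target bitmap per unique value = source bitmap filtered by the vector."""
    target = {}
    for value, bitmap in source_index.items():
        filtered = [bitmap[p] for p in positions]
        if any(filtered):  # drop values that do not survive the filter
            target[value] = filtered
    return target

def data_from_index(index, n_rows):
    """Regenerate the column data from its bitmap index."""
    column = [None] * n_rows
    for value, bitmap in index.items():
        for pos, bit in enumerate(bitmap):
            if bit:
                column[pos] = value
    return column

# First table (hypothetical): employee is the join attribute.
employee = ["ann", "ann", "bob", "cat", "bob"]
address = ["x st", "x st", "y ave", "x st", "y ave"]

positions = first_positions(build_bitmap_index(employee))     # [0, 2, 3]
target_address_index = filter_index(build_bitmap_index(address), positions)
target_address = data_from_index(target_address_index, len(positions))
target_employee = [employee[p] for p in positions]
print(target_employee, target_address)
# ['ann', 'bob', 'cat'] ['x st', 'y ave', 'x st']
```

In this sketch the table that keeps every row of the first table reuses the existing columns unchanged; only the deduplicated table needs the filtered bitmaps.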

7. A system comprising:
a computer including a computer processor for processing data evolution;
a computer user interface for entering data for a first table and a second table;
a data evolution engine for:
merging the first table and the second table into a third table;
re-using attributes from the first table in the third table;
generating data and bitmap indexes for attributes in the third table not reused from the first table, and updating a target bitmap vector, the step of updating the target bitmap vector including:
updating the target bitmap vector and target data of each non-key attribute in a first tuple of a plurality of tuples of the first table, according to a bitmap vector of a key attribute in the first tuple of the first table, and performing the updating of the target bitmap vector for the first tuple of the plurality of tuples in the first table and for at least one of the non-key attributes in the first tuple in the first table; and
a computer display for outputting the first, second, and third tables after decomposing the first table into the second table and the third table in a visual form including displaying said tables on a display unit.
Dependent claims: 8, 9, 10, 11, 12
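
The merge of claim 7, in which one input table is reused, can be sketched as follows. This is one possible reading, not the patented implementation: which table is scanned and which supplies the key bitmaps is an assumption here, as are the function names, the list-of-bits bitmaps, and the sample data. The sketch assumes the key is unique in the scanned table, so the merged table has the same rows as the reused table.

```python
# Illustrative sketch only: roles of the tables, names, and data are
# assumptions, not taken from the patent.

def build_bitmap_index(column):
    """Map each unique value to a 0/1 bitmap over tuple positions."""
    index = {}
    for pos, value in enumerate(column):
        index.setdefault(value, [0] * len(column))[pos] = 1
    return index

def merge_generated_attribute(reused_key_index, other_keys, other_values, n_rows):
    """Build the bitmap index of a non-key attribute in the merged table.
    For each tuple of the scanned table, OR the key's bitmap from the reused
    table into the target bitmap of that tuple's non-key value."""
    target = {}
    for key, value in zip(other_keys, other_values):
        key_bitmap = reused_key_index.get(key)
        if key_bitmap is None:  # key absent from the reused table
            continue
        current = target.setdefault(value, [0] * n_rows)
        target[value] = [a | b for a, b in zip(current, key_bitmap)]
    return target

def data_from_index(index, n_rows):
    """Regenerate the column data from its bitmap index."""
    column = [None] * n_rows
    for value, bitmap in index.items():
        for pos, bit in enumerate(bitmap):
            if bit:
                column[pos] = value
    return column

# Reused table (hypothetical): its columns carry over unchanged.
employee = ["ann", "ann", "bob", "cat", "bob"]
# Scanned table (hypothetical): one tuple per key value.
t_employee = ["ann", "bob", "cat"]
t_address = ["x st", "y ave", "x st"]

merged_address_index = merge_generated_attribute(
    build_bitmap_index(employee), t_employee, t_address, len(employee)
)
merged_address = data_from_index(merged_address_index, len(employee))
print(merged_address)
# ['x st', 'x st', 'y ave', 'x st', 'y ave']
```

The only work done per scanned tuple is a bitwise OR, which is why reusing one table's columns avoids materializing the join row by row.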

13. A computer program product for data evolution, the computer program product comprising:
a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code including:
computer program code configured to merge a first table and a second table into a third table, wherein attributes of the first table and attributes of the second table are not reused in the third table;
computer program code configured to perform a step to generate data and bitmap indexes for attributes in the third table, the step to generate data and bitmap indexes further including:
computer program code configured to compute a number of occurrences of a first unique value of a key attribute in the first table and a first unique value of a key attribute in the second table;
computer program code configured to generate a corresponding data and bitmap index for each tuple in the first table and for each non-key attribute of a first tuple in the first table, in which the first unique value of the key attribute in the first table occurs in the first table, for each of a plurality of unique values of the key attribute of the first table; and
computer program code configured to generate a corresponding data and bitmap index for each tuple in the second table and for each non-key attribute of a first tuple in the second table, in which the first unique value of the key attribute occurs in the second table, for each of a plurality of unique values of the key attribute of the second table; and
computer program code configured to output the first, second, and third tables after decomposing the first table into the second table and the third table in a visual form including on a display unit.
Dependent claims: 14, 15, 16, 17, 18
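
The general merge of claim 13, in which neither input table is reused, relies on counting key occurrences in both tables. The sketch below shows one way such counts could drive the generation of merged data. It is an assumption-laden illustration, not the patented implementation: the row layout of the merged table, the function name `general_merge`, and the sample data are all hypothetical.

```python
# Illustrative sketch only: the output row layout, names, and data are
# assumptions, not taken from the patent.
from collections import Counter

def build_bitmap_index(column):
    """Map each unique value to a 0/1 bitmap over tuple positions."""
    index = {}
    for pos, value in enumerate(column):
        index.setdefault(value, [0] * len(column))[pos] = 1
    return index

def general_merge(key1, cols1, key2, cols2):
    """Join two tables on a key that may repeat in both.
    A key value with n1 and n2 occurrences yields a block of n1 * n2 rows."""
    count1, count2 = Counter(key1), Counter(key2)
    offsets, total = {}, 0
    for k in count1:  # first-occurrence order in table 1
        if k in count2:
            offsets[k] = total
            total += count1[k] * count2[k]
    out = {name: [None] * total for name in ("key", *cols1, *cols2)}

    seen1 = Counter()
    for pos, k in enumerate(key1):  # each table-1 tuple fills n2 adjacent rows
        if k not in offsets:
            continue
        start = offsets[k] + seen1[k] * count2[k]
        seen1[k] += 1
        for j in range(count2[k]):
            out["key"][start + j] = k
            for name, col in cols1.items():
                out[name][start + j] = col[pos]

    seen2 = Counter()
    for pos, k in enumerate(key2):  # each table-2 tuple fills n1 strided rows
        if k not in offsets:
            continue
        j = seen2[k]
        seen2[k] += 1
        for i in range(count1[k]):
            row = offsets[k] + i * count2[k] + j
            for name, col in cols2.items():
                out[name][row] = col[pos]
    return out

merged = general_merge(["a", "b", "a"], {"x": [1, 2, 3]},
                       ["a", "a", "b"], {"y": [10, 20, 30]})
merged_indexes = {name: build_bitmap_index(col) for name, col in merged.items()}
print(merged["key"], merged["x"], merged["y"])
# ['a', 'a', 'a', 'a', 'b'] [1, 1, 3, 3, 2] [10, 20, 10, 20, 30]
```

Because the occurrence counts fix every output position in advance, each input table is scanned once and no row-level join is materialized first.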

19. A method for use with a column-oriented database, comprising:
transforming first data from a first schema into a second schema, wherein:
portions of both the first data and a first index of the first schema are identified that can be re-used in the second schema, and
portions of both the first data and the first index of the first schema not included in the re-usable portions of the first data and the first index of the first schema are transformed directly into the second schema.
Dependent claims: 20, 21, 22
Specification