Test data generation and scale up for database testing using unique common factor sequencing
Abstract
Embodiments of the present invention provide a method, system and computer program product for test data generation using unique common factor sequencing. In an embodiment of the invention, a method for test data generation using unique common factor sequencing is provided. The method includes loading a table for population with test data in a test data generation tool executing in a memory of a computer. A column set of multiple columns in the table associated with a key to the table is selected for processing and different cardinality sequence values are assigned to the columns in the set such that the cardinality sequence values do not share a common factor except for unity as in the case of prime numbers.
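The sequencing idea in the abstract can be illustrated with a minimal sketch. It assumes that "do not share a common factor except for unity" means the cardinalities are pairwise coprime, and that each key column simply cycles through 1..cardinality; the function names and the cardinalities 3, 4 and 5 are hypothetical and do not come from the patent.

```python
from math import gcd, prod


def pairwise_coprime(cardinalities):
    """Return True when no two cardinalities share a factor other than 1."""
    return all(
        gcd(a, b) == 1
        for i, a in enumerate(cardinalities)
        for b in cardinalities[i + 1:]
    )


def generate_key_rows(cardinalities, num_rows):
    """Yield key tuples; column j cycles through 1..cardinalities[j]."""
    if not pairwise_coprime(cardinalities):
        raise ValueError("cardinalities must be pairwise coprime")
    for row in range(num_rows):
        yield tuple(row % c + 1 for c in cardinalities)


# Hypothetical cardinality sequence values for a three-column key.
cards = [3, 4, 5]
rows = list(generate_key_rows(cards, prod(cards)))

# With pairwise coprime cycle lengths, no key tuple repeats until
# 3 * 4 * 5 = 60 rows have been generated.
assert len(set(rows)) == prod(cards)
```

Under these assumptions the composite key stays unique for as many rows as the product of the cardinalities, which is why coprime cycle lengths suit key columns.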
18 Claims
-
1. A method for test data generation using unique common factor sequencing, the method comprising:
-
loading a first table for population with test data in a test data generation tool executing in a memory of a computer;
selecting a column set of multiple columns in the first table associated with a key to the first table;
assigning different cardinality sequence values to each column in the column set of multiple columns, wherein the cardinality sequence values do not share a common factor except for unity and each cardinality sequence value indicates a number of values in a sequence before the sequence repeats;
generating data for a specified number of rows of each column in the column set of multiple columns according to a corresponding one of the cardinality sequence values;
additionally generating random data for other columns of the first table without regard to any particular cardinality sequence value;
scaling up the first table to a new table of an original upper portion and an added lower portion by continuing into the added lower portion of the new table a sequence for each column in the column set of multiple columns based upon corresponding ones of the cardinality sequence values and based upon computing values of rows for each column in the column set of multiple columns in the added lower portion of the new table according to a formula;
modifiedRowValue = (L + V − 1 % R) + 1,
where L is a last value in the first table for the column, V is an existing value for a row of the column in the new table, and R is a range for the column in the new table, while duplicating data in the other columns of the first table in the added lower portion of the new table; and
persisting the first table for use in database testing.
where L is a last value in the first table for the column, V is an existing value for a row of the column in the new table, M is a maximum value for the column in the new table, and R is a range for the column in the new table, while duplicating data in the other columns of the first table in the added lower portion of the new table.
-
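The scale-up step of claim 1 can be sketched as follows. This sketch assumes the formula is meant to be grouped as ((L + V − 1) % R) + 1, so that results wrap within 1..R; the claim text as printed does not show that grouping. It also assumes that the key columns come first in each row and that the added lower portion starts as a copy of the original rows. The function name and sample data are hypothetical.

```python
def scale_up(rows, ranges):
    """Double a table: keep the original rows as the upper portion and
    append a lower portion whose key columns continue each cycle.

    rows   -- list of tuples; the first len(ranges) entries are key columns
    ranges -- R for each key column (its cardinality sequence value)
    """
    n_keys = len(ranges)
    last = rows[-1][:n_keys]          # L: last value of each key column
    lower = []
    for row in rows:
        # V is the value copied from the original row; the assumed grouping
        # ((L + V - 1) % R) + 1 shifts it so the sequence continues after L.
        keys = tuple(
            ((l + v - 1) % r) + 1
            for l, v, r in zip(last, row[:n_keys], ranges)
        )
        # Non-key columns are duplicated unchanged.
        lower.append(keys + tuple(row[n_keys:]))
    return rows + lower


# Hypothetical first table: two key columns with ranges 2 and 3, one data column.
first = [(1, 1, "a"), (2, 2, "b"), (1, 3, "c")]
scaled = scale_up(first, ranges=(2, 3))
# Key columns of the result: (1,1) (2,2) (1,3) (2,1) (1,2) (2,3)
```

In this example the six key pairs are all distinct, while the data column repeats "a", "b", "c" in the lower portion.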
-
6. The method of claim 2, further comprising:
-
identifying at least one column of the first table corresponding to a hash distribution key; and duplicating data from all columns of the first table that corresponds to the hash distribution key into the new table, but creating both the original upper portion and the added lower portion of the new table, each with a number of new rows as a prime number less than or equal to a product of ranges of the columns corresponding to the hash distribution key.
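The row-count constraint in claim 6 can be sketched as below. The claim only requires a prime number less than or equal to the product of the ranges; choosing the largest such prime is an assumption made here for illustration, and the helper names are hypothetical.

```python
from math import isqrt, prod


def is_prime(n):
    """Trial-division primality check, adequate for small row counts."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def largest_prime_at_most(limit):
    """Return the largest prime p with p <= limit."""
    for candidate in range(limit, 1, -1):
        if is_prime(candidate):
            return candidate
    raise ValueError("no prime less than or equal to limit")


# Hypothetical ranges of the columns that form the hash distribution key.
key_ranges = [3, 4, 5]
rows_per_portion = largest_prime_at_most(prod(key_ranges))  # product is 60
assert rows_per_portion == 59
```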
-
-
7. A test data generation data processing system comprising:
-
a host computer with at least one processor and a memory;
a test data generator configured to generate test data executing in the memory of the host computer; and
a unique common factor sequencing module coupled to the test data generator, the unique common factor sequencing module comprising program code configured to:
select in a first table loaded for test data generation in the test data generator, a column set of multiple columns in the first table associated with a key to the first table;
assign different cardinality sequence values to each column in the column set of multiple columns, wherein the cardinality sequence values do not share a common factor except for unity and each cardinality sequence value indicates a number of values in a sequence before the sequence repeats;
generate data for a specified number of rows of each column in the column set of multiple columns according to a corresponding one of the cardinality sequence values;
additionally generate random data for other columns of the first table without regard to any particular cardinality sequence value;
scale up the first table to a new table of an original upper portion and an added lower portion by continuing into the added lower portion of the new table a sequence for each column in the column set of multiple columns based upon corresponding ones of the cardinality sequence values and based upon computing values of rows for each column in the column set of multiple columns in the added lower portion of the new table according to a formula;
modifiedRowValue = (L + V − 1 % R) + 1,
where L is a last value in the first table for the column, V is an existing value for a row of the column in the new table, and R is a range for the column in the new table, while duplicating data in the other columns of the first table in the added lower portion of the new table; and
persist the first table for use in database testing.
where L is a last value in the first table for the column, V is an existing value for a row of the column in the new table, M is a maximum value for the column in the new table, and R is a range for the column in the new table, while duplicating data in the other columns of the first table in the added lower portion of the new table.
-
-
12. The system of claim 8, wherein the scaling up logic identifies at least one column of the first table corresponding to a hash distribution key, and duplicates data from all columns of the first table that corresponds to the hash distribution key into the new table, but creates both the original upper portion and the added lower portion of the new table, each with a number of new rows as a prime number less than or equal to a product of ranges of the columns corresponding to the hash distribution key.
-
13. A computer program product for test data generation using unique common factor sequencing, the computer program product comprising a non-transitory computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising:
-
computer readable program code for loading a first table for population with test data in a test data generation tool executing in a memory of a computer;
computer readable program code for selecting a column set of multiple columns in the first table associated with a key to the first table;
computer readable program code for assigning different cardinality sequence values to the columns in the column set of multiple columns, wherein the cardinality sequence values do not share a common factor except for unity and each cardinality sequence value indicates a number of values in a sequence before the sequence repeats;
computer readable program code for generating data for the specified number of rows of each column in the column set of multiple columns according to a corresponding one of the cardinality sequence values;
computer readable program code for additionally generating random data for other columns of the first table without regard to any particular cardinality sequence value;
computer readable program code for scaling up the first table to a new table of an original upper portion and an added lower portion by continuing into the added lower portion of the new table a sequence for each column in the column set of multiple columns based upon corresponding ones of the cardinality sequence values and based upon computing values of rows for each column in the column set of multiple columns in the added lower portion of the new table according to a formula;
modifiedRowValue = (L + V − 1 % R) + 1,
where L is a last value in the first table for the column, V is an existing value for a row of the column in the new table, and R is a range for the column in the new table, while duplicating data in the other columns of the first table in the added lower portion of the new table; and
computer readable program code for persisting the table for use in database testing.
where L is a last value in the first table for the column, V is an existing value for a row of the column in the new table, M is a maximum value for the column in the new table, and R is a range for the column in the new table, while duplicating data in the other columns of the first table in the added lower portion of the new table.
-
-
18. The computer program product of claim 14, further comprising:
-
computer readable program code for identifying at least one column of the first table corresponding to a hash distribution key; and computer readable program code for duplicating data from all columns of the first table that corresponds to the hash distribution key into the new table, but creating both the original upper portion and the added lower portion of the new table, each with a number of new rows as a prime number less than or equal to a product of ranges of the columns corresponding to the hash distribution key.
-
Specification