Techniques for identifying mergeable data
Abstract
A system, method and article of manufacture for identifying mergeable data in a data processing system and, more particularly, for identifying correlated columns from one or more database tables. One embodiment comprises determining correlation attributes for a first column and a second column from one or more database tables. The correlation attributes describe, for each column, at least one of the column and the content of the column. The correlation attributes from the first and second columns are compared, and similarities between the first and second columns are identified on the basis of the comparison. Then, on the basis of the identified similarities, it is determined whether the first and second columns are correlated. The first and second columns are merged only if they are determined to be correlated.
44 Claims
1. A computer-implemented method for identifying correlated columns from database tables, comprising:
determining correlation attributes for a first column and a second column from one or more database tables, the correlation attributes describing for each column at least one of the column and content of the column;
comparing the correlation attributes from the first and second column;
identifying similarities between the first and second column on the basis of the comparison;
on the basis of the identified similarities, determining whether the first and second column are correlated; and
merging the first and second columns only if the columns are determined to be correlated.
Dependent claims: 2-16.
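The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The attribute set (name, value type, distinct values), the overlap threshold, the decision rule and the merge behavior are all assumptions chosen for illustration, and every function and class name here is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorrelationAttributes:
    """Hypothetical attribute set describing a column and its content."""
    name: str                   # describes the column itself
    data_type: str              # describes the column itself
    distinct_values: frozenset  # describes the content of the column


def determine_attributes(name, values):
    # Assumption: the type of the first value stands in for the column type.
    data_type = type(values[0]).__name__ if values else "unknown"
    return CorrelationAttributes(name.lower(), data_type, frozenset(values))


def identify_similarities(a, b):
    # Compare the two attribute sets and record each similarity found.
    union = a.distinct_values | b.distinct_values
    shared = a.distinct_values & b.distinct_values
    return {
        "same_name": a.name == b.name,
        "same_type": a.data_type == b.data_type,
        "value_overlap": len(shared) / len(union) if union else 0.0,
    }


def are_correlated(similarities, min_overlap=0.5):
    # Assumed decision rule: matching types plus enough shared values.
    return similarities["same_type"] and similarities["value_overlap"] >= min_overlap


def merge_if_correlated(name_a, col_a, name_b, col_b):
    a = determine_attributes(name_a, col_a)
    b = determine_attributes(name_b, col_b)
    if not are_correlated(identify_similarities(a, b)):
        return None  # not correlated, so no merge is performed
    # Assumed merge: concatenate both columns, dropping duplicates in order.
    return list(dict.fromkeys(list(col_a) + list(col_b)))
```

Under these assumptions, a column of state codes merges with another column that shares at least half of its distinct values, while a text column and a numeric column are never merged.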
17. A computer-implemented method for identifying correlated columns from database tables, comprising:
determining metadata for at least two columns from one or more database tables, the metadata describing characteristics of each column;
analyzing content from the at least two columns from the one or more database tables; and
determining a degree of correlation between the at least two columns using the determined metadata and the analyzed content.
Dependent claims: 18-21.
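Claim 17 yields a degree of correlation rather than a yes/no decision. One way to picture this, purely as an assumption, is a weighted blend of a metadata agreement score and a content overlap score. The claim does not prescribe a formula, so the weighting, the Jaccard overlap measure and the function names below are hypothetical.

```python
def metadata_score(meta_a, meta_b):
    # Fraction of metadata keys (from either column) on which both columns agree.
    keys = set(meta_a) | set(meta_b)
    if not keys:
        return 0.0
    matches = sum(
        1 for key in keys
        if key in meta_a and key in meta_b and meta_a[key] == meta_b[key]
    )
    return matches / len(keys)


def content_score(values_a, values_b):
    # Assumed content analysis: Jaccard overlap of the distinct values.
    a, b = set(values_a), set(values_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def degree_of_correlation(meta_a, values_a, meta_b, values_b, metadata_weight=0.4):
    # Assumed combination: weighted average, giving a result between 0 and 1.
    return (
        metadata_weight * metadata_score(meta_a, meta_b)
        + (1 - metadata_weight) * content_score(values_a, values_b)
    )
```

A caller could rank candidate column pairs by this score, or compare it against a threshold to decide which pairs to present for merging.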
22. A computer readable medium containing a program which, when executed, performs a process for identifying correlated columns from database tables, the process comprising:
determining correlation attributes for a first column and a second column from one or more database tables, the correlation attributes describing for each column at least one of the column and content of the column;
comparing the correlation attributes from the first and second column;
identifying similarities between the first and second column on the basis of the comparison;
on the basis of the identified similarities, determining whether the first and second column are correlated; and
merging the first and second columns only if the columns are determined to be correlated.
Dependent claims: 23-37.
38. A computer readable medium containing a program which, when executed, performs a process for identifying correlated columns from database tables, the process comprising:
determining metadata for at least two columns from one or more database tables, the metadata describing characteristics of each column;
analyzing content from the at least two columns from the one or more database tables; and
determining a degree of correlation between the at least two columns using the determined metadata and the analyzed content.
Dependent claims: 39-42.
43. A data processing system comprising:
at least one database having one or more database tables; and
a correlation manager for identifying correlated columns from the one or more database tables, the correlation manager being configured for:
determining correlation attributes for a first column and a second column from the one or more database tables, the correlation attributes describing for each column at least one of the column and content of the column;
comparing the correlation attributes from the first and second column;
identifying similarities between the first and second column on the basis of the comparison;
on the basis of the identified similarities, determining whether the first and second column are correlated; and
merging the first and second columns only if the columns are determined to be correlated.
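To make the system of claim 43 concrete, the sketch below places a hypothetical `CorrelationManager` class in front of an in-memory SQLite database. The choice of SQLite, the use of the declared column type as the column description, the overlap threshold and the merge result (a sorted union of distinct values) are all assumptions for illustration, not details taken from the claims.

```python
import sqlite3


class CorrelationManager:
    """Hypothetical correlation manager over a SQLite database."""

    def __init__(self, connection, min_overlap=0.5):
        self.connection = connection
        self.min_overlap = min_overlap  # assumed threshold

    def _attributes(self, table, column):
        # Identifiers are interpolated directly, so use trusted names only.
        declared = None
        info = self.connection.execute(f'PRAGMA table_info("{table}")')
        for _cid, name, col_type, *_rest in info:
            if name == column:
                declared = col_type.upper()  # describes the column
        if declared is None:
            raise ValueError(f"unknown column {table}.{column}")
        rows = self.connection.execute(f'SELECT DISTINCT "{column}" FROM "{table}"')
        return declared, {row[0] for row in rows}  # describes the content

    def merge(self, col_a, col_b):
        type_a, values_a = self._attributes(*col_a)
        type_b, values_b = self._attributes(*col_b)
        union = values_a | values_b
        overlap = len(values_a & values_b) / len(union) if union else 0.0
        if type_a != type_b or overlap < self.min_overlap:
            return None  # not correlated, so nothing is merged
        return sorted(union)  # assumed merge: sorted union of distinct values


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (state TEXT, age INTEGER);
    CREATE TABLE suppliers (region TEXT);
    INSERT INTO customers VALUES ('NY', 31), ('CA', 42), ('TX', 27);
    INSERT INTO suppliers VALUES ('CA'), ('TX'), ('WA');
""")
manager = CorrelationManager(conn)
merged = manager.merge(("customers", "state"), ("suppliers", "region"))
not_merged = manager.merge(("customers", "age"), ("suppliers", "region"))
```

Here `merged` holds the combined state codes because the two text columns share half of their distinct values, while `not_merged` is `None` because the declared types differ.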
44. A data processing system comprising:
at least one database having one or more database tables; and
a correlation manager for identifying correlated columns from the one or more database tables, the correlation manager being configured for:
determining metadata for at least two columns from the one or more database tables, the metadata describing characteristics of each column;
analyzing content from the at least two columns from the one or more database tables; and
determining a degree of correlation between the at least two columns using the determined metadata and the analyzed content.
Specification