AUTOMATED DATABASE SCHEMA ANNOTATION
First Claim
1. A device comprising:
a processor; and
a computer-readable medium including modules, the modules when executed by the processor, configure the device to generate annotations, the modules comprising:
a column discovery module configured to retrieve a table; and
a column annotation module configured to annotate a target column of a target table by:
determining a similarity between the target column of the target table and a column of the table, the similarity based at least in part on similarities between one or more values in the target column of the target table and one or more column values extracted from the column of the table; and
annotating, based at least in part on the similarity, the target column of the target table using a column identity of the column of the table.
Abstract
Techniques and constructs that improve annotating target columns of a target database by performing automated annotation of the target columns using sources. The techniques include calculating a similarity score between a target column and columns extracted from a table that is included in a source. The similarity score is calculated based at least in part on a similarity between a value in the target column of the target database and a column value of the extracted column from the table and on a similarity between an identity of the target column of the target database and column identities of the extracted columns from the table. In some examples, the techniques calculate similarity scores for one or more extracted columns and annotate the target column based on the similarity scores.
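The scoring described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how either similarity is measured or how the two are combined, so the Jaccard overlap for values, the `difflib` string ratio for column identities, the weighted sum, and the names `value_weight` and `threshold` are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def value_similarity(target_values, source_values):
    """Jaccard overlap of the normalized value sets (assumed measure)."""
    a = {str(v).strip().lower() for v in target_values}
    b = {str(v).strip().lower() for v in source_values}
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def identity_similarity(target_name, source_name):
    """String similarity of the two column names (assumed measure)."""
    return SequenceMatcher(None, target_name.lower(), source_name.lower()).ratio()


def similarity_score(target_name, target_values, source_name, source_values,
                     value_weight=0.7):
    """Weighted sum of value and identity similarity (hypothetical weighting)."""
    v = value_similarity(target_values, source_values)
    i = identity_similarity(target_name, source_name)
    return value_weight * v + (1.0 - value_weight) * i


def annotate(target_name, target_values, source_columns, threshold=0.5):
    """Return the identity of the best-scoring extracted column, or None
    when no column reaches the (hypothetical) threshold."""
    best_name, best_score = None, 0.0
    for name, values in source_columns.items():
        score = similarity_score(target_name, target_values, name, values)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Columns extracted from a source table, keyed by column identity.
extracted = {
    "Capital": ["Paris", "Berlin", "Madrid", "Rome"],
    "Population": ["2100000", "3600000"],
}
print(annotate("col_3", ["Paris", "Berlin", "Rome"], extracted))  # Capital
```

In this example the target column shares three of four distinct values with the extracted `Capital` column, so the value term alone carries the score past the threshold even though the name `col_3` is uninformative.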
20 Claims
1. A device comprising:
a processor; and
a computer-readable medium including modules, the modules when executed by the processor, configure the device to generate annotations, the modules comprising:
a column discovery module configured to retrieve a table; and
a column annotation module configured to annotate a target column of a target table by:
determining a similarity between the target column of the target table and a column of the table, the similarity based at least in part on similarities between one or more values in the target column of the target table and one or more column values extracted from the column of the table; and
annotating, based at least in part on the similarity, the target column of the target table using a column identity of the column of the table.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method comprising:
retrieving a table;
determining a similarity between a target column of a target table and a column of the table, the similarity based at least in part on similarities between one or more values in the target column of the target table and one or more column values extracted from the column of the table;
annotating, based at least in part on the similarity, the target column of the target table using a column identity of the column of the table; and
storing the annotated target column.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A computer-readable medium having computer-executable instructions to program a computer to perform operations comprising:
receiving a table;
identifying a column included in the table;
identifying a target column in a target table; and
annotating the target column included in the target table using an identity of the column included in the table.
Dependent claims: 18, 19, 20
Specification