Automated latent star schema discovery tool
First Claim
1. A computer-performed method of discovering a latent organizational structure of a relational database, the method comprising:
selecting, by a computer, a plurality of key candidates from a plurality of relational database tables, wherein the key candidates are selected using a first set of heuristic criteria;
selecting, by the computer, one or more fact table candidates in the plurality of relational database tables based on said fact table candidates containing key candidates from the plurality of key candidates;
selecting, by the computer, one or more dimension table candidates based on the plurality of key candidates and a second set of heuristic criteria; and
presenting, by the computer to a user of the computer, the one or more fact table candidates, the key candidates, and the one or more dimension table candidates as a star schema.
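The four steps of the claim can be illustrated with a minimal sketch. The claim does not say what the two sets of heuristic criteria are, so the rules below are assumptions chosen only for illustration: a key candidate is an integer column whose name ends in `_id` or `_key`, a fact table candidate contains at least two key candidates, and a dimension table candidate has exactly one key candidate that the fact table shares by name. All function names and the example schema are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    sql_type: str


# Assumed "first set of heuristic criteria": integer columns named *_id or *_key.
KEY_SUFFIXES = ("_id", "_key")
KEY_TYPES = {"int", "integer", "bigint", "smallint"}


def key_candidates(tables):
    """Map each table name to the set of its columns that look like keys."""
    return {
        table: {
            col.name
            for col in cols
            if col.sql_type.lower() in KEY_TYPES
            and col.name.lower().endswith(KEY_SUFFIXES)
        }
        for table, cols in tables.items()
    }


def fact_candidates(keys, min_keys=2):
    """Tables containing at least `min_keys` key candidates (assumed threshold)."""
    return [table for table, ks in keys.items() if len(ks) >= min_keys]


def dimension_candidates(fact, keys):
    """Assumed "second set of heuristic criteria": another table with exactly
    one key candidate, and that key also appears in the fact table."""
    return sorted(
        table
        for table, ks in keys.items()
        if table != fact and len(ks) == 1 and ks <= keys[fact]
    )


def discover(tables):
    """Return {fact_table: [dimension_tables]} as a simple star schema report."""
    keys = key_candidates(tables)
    return {fact: dimension_candidates(fact, keys) for fact in fact_candidates(keys)}


# Hypothetical example schema.
EXAMPLE_TABLES = {
    "sales": [Column("product_id", "int"), Column("store_id", "int"),
              Column("amount", "decimal")],
    "product": [Column("product_id", "int"), Column("name", "varchar")],
    "store": [Column("store_id", "int"), Column("city", "varchar")],
}
```

With the example schema, `discover(EXAMPLE_TABLES)` reports `sales` as the only fact table candidate, with `product` and `store` as its dimension table candidates. Matching keys by column name keeps the sketch short; a real tool would need other criteria to handle differently named keys.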
Abstract
A method, computer program product, and data processing system for computer-aided design of multidimensional data warehouse schemas are disclosed. A preferred embodiment of the present invention provides a software tool for identifying a latent star schema structure within an existing database. This software tool performs a heuristic analysis of the existing database schema to locate potential keys and measurement fields. Database tables within the existing schema are scored heuristically as to their suitability as fact tables based on the key candidates and measurement fields. For each fact table, other tables from the existing schema are identified as possible dimension tables. Data from the database is then used to test the suitability of the fact tables and dimension tables. The identified fact tables and their associated dimension tables are then reported to the user to reveal a basic star schema structure, which can be used as a basis for further design.
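The scoring and data-testing stages mentioned in the abstract might look roughly like the following. The abstract gives neither the scoring formula nor the data test, so both are assumptions here: the score simply counts key candidates plus numeric non-key columns (treated as measurement fields), and the data test checks that a dimension key is unique and measures what fraction of fact-table key values appear in the dimension table. The function names and type lists are hypothetical.

```python
# Assumed set of column types that suggest a measurement field.
NUMERIC_TYPES = {"decimal", "numeric", "float", "double", "real"}


def fact_score(columns, key_names):
    """Score a table's suitability as a fact table.

    columns:   list of (name, sql_type) pairs for one table
    key_names: set of column names already identified as key candidates
    The score is an assumed formula: key count plus measurement-field count.
    """
    n_keys = sum(1 for name, _ in columns if name in key_names)
    n_measures = sum(
        1
        for name, sql_type in columns
        if name not in key_names and sql_type.lower() in NUMERIC_TYPES
    )
    return n_keys + n_measures


def dimension_fit(fact_rows, dim_rows, key):
    """Test a fact/dimension pairing against actual data.

    Returns (is_unique, coverage): whether `key` is unique in the dimension
    rows, and the fraction of non-null fact key values found there.
    """
    dim_values = [row[key] for row in dim_rows]
    is_unique = len(dim_values) == len(set(dim_values))
    fact_values = [row[key] for row in fact_rows if row.get(key) is not None]
    if not fact_values:
        return is_unique, 0.0
    known = set(dim_values)
    matched = sum(1 for value in fact_values if value in known)
    return is_unique, matched / len(fact_values)
```

A pairing with a unique dimension key and coverage close to 1.0 would support the schema-level guess, while low coverage would suggest that the shared column is not really a reference into that table.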
20 Claims
1. A computer-performed method of discovering a latent organizational structure of a relational database, the method comprising:
selecting, by a computer, a plurality of key candidates from a plurality of relational database tables, wherein the key candidates are selected using a first set of heuristic criteria;
selecting, by the computer, one or more fact table candidates in the plurality of relational database tables based on said fact table candidates containing key candidates from the plurality of key candidates;
selecting, by the computer, one or more dimension table candidates based on the plurality of key candidates and a second set of heuristic criteria; and
presenting, by the computer to a user of the computer, the one or more fact table candidates, the key candidates, and the one or more dimension table candidates as a star schema.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer program product in one or more tangible computer-readable storage media, comprising functional descriptive material that, when executed by a computer, causes the computer to perform actions of:
selecting a plurality of key candidates from a plurality of relational database tables, wherein the key candidates are selected using a first set of heuristic criteria;
selecting one or more fact table candidates in the plurality of relational database tables based on said fact table candidates containing key candidates from the plurality of key candidates;
selecting one or more dimension table candidates based on the plurality of key candidates and a second set of heuristic criteria; and
presenting, to a user of the computer, the one or more fact table candidates, the key candidates, and the one or more dimension table candidates as a star schema.
Dependent claims: 9, 10, 11, 12, 13, 14, 20
15. A data processing system comprising:
at least one processor;
data storage accessible to the at least one processor;
a set of instructions in the data storage, wherein the at least one processor executes the set of instructions to perform actions of
selecting a plurality of key candidates from a plurality of relational database tables, wherein the key candidates are selected using a first set of heuristic criteria;
selecting one or more fact table candidates in the plurality of relational database tables based on said fact table candidates containing key candidates from the plurality of key candidates;
selecting one or more dimension table candidates based on the plurality of key candidates and a second set of heuristic criteria; and
presenting, to a user of the data processing system, the one or more fact table candidates, the key candidates, and the one or more dimension table candidates as a star schema.
Dependent claims: 16, 17, 18, 19
Specification