DISCOVERING TOPICAL STRUCTURES OF DATABASES
First Claim
1. A method for automatically discovering topical structures of at least one database comprising tables, said method comprising:
computing various kinds of representations for said database based on schema information and data values of said database;
performing preliminary topical clustering of tables within said database to produce a plurality of clusterings, such that each of said clusterings corresponds to one representation;
aggregating results of said clusterings into a final clustering, such that said final clustering comprises a plurality of topical clusters;
identifying representative tables from said topical clusters in said final clustering, wherein at least one representative table is identified for each of said topical clusters in said final clustering;
arranging said representative tables by topic as a topical directory of said representative tables; and
outputting said topical directory.
Abstract
A system and method for automatically discovering topical structures of databases includes a model builder adapted to compute various kinds of representations for the database based on schema information and data values of the database. A plurality of base clusterers is also provided, one for each representation. Each base clusterer is adapted to perform, for the representation, preliminary topical clustering of tables within the database to produce a plurality of clusters, such that each of the clusters corresponds to a set of tables on the same topic. A meta-clusterer aggregates results of the clusterers into a final clustering, such that the final clustering comprises a plurality of the clusters. A representative finder identifies representative tables from the clusters in the final clustering. The representative finder identifies at least one representative table for each of the clusters in the final clustering. The representative finder also arranges the representative tables by topic as a topical directory and outputs the topical directory.
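The model builder and base clusterers described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The toy tables, the two representations (column-name sets and data-value sets), the Jaccard similarity, the thresholded single-link grouping, and all function names are assumptions chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy database: table name -> (column names, sample data values).
TABLES = {
    "customer": (["cust_id", "name", "city"], ["alice", "bob", "paris"]),
    "customer_address": (["cust_id", "street", "city"], ["paris", "main st"]),
    "product": (["prod_id", "title", "price"], ["widget", "gadget"]),
    "product_price": (["prod_id", "price", "currency"], ["usd", "eur"]),
}


def schema_representation(tables):
    """Schema-based representation (assumed): each table as its set of column names."""
    return {name: set(cols) for name, (cols, _vals) in tables.items()}


def value_representation(tables):
    """Data-based representation (assumed): each table as its set of data values."""
    return {name: set(vals) for name, (_cols, vals) in tables.items()}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def base_cluster(representation, threshold):
    """One base clusterer (assumed design): link tables whose similarity
    reaches the threshold, then return the connected groups."""
    parent = {t: t for t in representation}

    def find(t):
        # Union-find lookup with path halving.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in combinations(sorted(representation), 2):
        if jaccard(representation[a], representation[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for t in representation:
        groups.setdefault(find(t), []).append(t)
    return sorted(sorted(g) for g in groups.values())


# One preliminary clustering per representation.
schema_clustering = base_cluster(schema_representation(TABLES), 0.4)
value_clustering = base_cluster(value_representation(TABLES), 0.2)
```

In this toy data the schema-based clusterer groups the two customer tables and the two product tables, while the value-based clusterer only links the customer tables. The two clusterers disagree, and that disagreement is what an aggregation step is meant to resolve.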
20 Claims
1. A method for automatically discovering topical structures of at least one database comprising tables, said method comprising:
computing various kinds of representations for said database based on schema information and data values of said database;
performing preliminary topical clustering of tables within said database to produce a plurality of clusterings, such that each of said clusterings corresponds to one representation;
aggregating results of said clusterings into a final clustering, such that said final clustering comprises a plurality of topical clusters;
identifying representative tables from said topical clusters in said final clustering, wherein at least one representative table is identified for each of said topical clusters in said final clustering;
arranging said representative tables by topic as a topical directory of said representative tables; and
outputting said topical directory.
Dependent claims: 2, 3, 4, 5, 10
6. A method for automatically discovering topical structures of at least one database comprising tables, said method comprising:
performing topical clustering of tables within said database to produce a plurality of clusterings;
aggregating results of said clusterings into a final clustering, such that said final clustering comprises a plurality of said topical clusters;
arranging said topical clusters by topic as a topical directory; and
outputting said topical directory.
Dependent claims: 7, 8, 9
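The aggregating step recited in claim 6 can be sketched with a simple voting scheme. The claim does not say how results are aggregated, so the co-association approach below is a swapped-in technique: two tables end up together when enough of the input clusterings agree. The function name `aggregate` and the `min_agreement` parameter are hypothetical.

```python
from itertools import combinations


def aggregate(clusterings, min_agreement=0.5):
    """Combine several clusterings into one final clustering.

    Each clustering is a list of clusters, and each cluster is a list of
    table names. Two tables are linked when the fraction of clusterings
    that place them in the same cluster is at least min_agreement.
    """
    tables = sorted({t for clustering in clusterings
                     for cluster in clustering for t in cluster})

    # Count, for every pair of tables, how many clusterings co-cluster them.
    votes = {}
    for clustering in clusterings:
        for cluster in clustering:
            for pair in combinations(sorted(cluster), 2):
                votes[pair] = votes.get(pair, 0) + 1

    parent = {t: t for t in tables}

    def find(t):
        # Union-find lookup with path halving.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for (a, b), count in votes.items():
        if count / len(clusterings) >= min_agreement:
            parent[find(a)] = find(b)

    groups = {}
    for t in tables:
        groups.setdefault(find(t), []).append(t)
    return sorted(groups.values())


# Three hypothetical preliminary clusterings of the same four tables.
clusterings = [
    [["customer", "customer_address"], ["product", "product_price"]],
    [["customer", "customer_address"], ["product"], ["product_price"]],
    [["customer", "customer_address", "product"], ["product_price"]],
]
final_clustering = aggregate(clusterings)
```

Here only the pair `customer` and `customer_address` is co-clustered by a majority of the inputs, so the final clustering keeps `product` and `product_price` apart. Lowering `min_agreement` would merge more aggressively.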
11. A computer program storage medium tangibly embodying a program of instructions executable by a computer to perform a method for automatically discovering topical structures of at least one database comprising tables, said method comprising:
computing various kinds of representations for said database based on schema information and data values of said database;
performing preliminary topical clustering of tables within said database to produce a plurality of clusterings, such that each of said clusterings corresponds to one of said representations;
aggregating results of said clusterings into a final clustering, such that said final clustering comprises a plurality of said topical clusters;
identifying representative tables from said topical clusters in said final clustering, wherein at least one representative table is identified for each of said topical clusters in said final clustering;
arranging said representative tables by topic as a topical directory of said representative tables; and
outputting said topical directory.
Dependent claims: 12, 13, 14, 15
16. A system for automatically discovering topical structures of at least one database comprising tables, said system comprising:
a model builder adapted to compute various kinds of representations for said database based on schema information and data values of said database;
a plurality of base clusterers, wherein each base clusterer corresponds to one of said kinds of representations, and each base clusterer is adapted to perform, for a corresponding representation, preliminary topical clustering of tables within said database to produce a plurality of topical clusters, such that each of said topical clusters corresponds to a set of said tables on the same topic;
a meta-clusterer adapted to aggregate results of said topical clusters into a final clustering, such that said final clustering comprises a plurality of said topical clusters; and
a representative finder adapted to identify representative tables from said topical clusters in said final clustering, wherein at least one representative table is identified for each of said topical clusters in said final clustering, and wherein said representative finder is further adapted to arrange said representative tables by topic as a topical directory of said representative tables and output said topical directory.
Dependent claims: 17, 18, 19, 20
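The representative finder of claim 16 can be illustrated by a short sketch that picks one table per topical cluster and arranges the result as a directory. The selection rule is an assumption, since the claim does not fix one: here the representative is the table with the highest total similarity to the other tables in its cluster, with ties broken alphabetically. The column sets, the similarity function, and all names are hypothetical.

```python
# Hypothetical column sets used only to define a toy similarity measure.
COLUMNS = {
    "customer": {"cust_id", "name", "city"},
    "customer_address": {"cust_id", "street", "city"},
    "customer_phone": {"cust_id", "phone"},
    "product": {"prod_id", "title"},
}


def shared_columns(a, b):
    """Toy similarity (assumed): number of column names two tables share."""
    return len(COLUMNS[a] & COLUMNS[b])


def find_representative(cluster, similarity):
    """Return the table most similar, in total, to the rest of its cluster."""
    def total(table):
        return sum(similarity(table, other) for other in cluster if other != table)

    # Sorting first makes ties resolve to the alphabetically first table.
    return max(sorted(cluster), key=total)


def build_directory(final_clustering, similarity):
    """Map each cluster's representative table to the sorted tables of its topic."""
    return {
        find_representative(cluster, similarity): sorted(cluster)
        for cluster in final_clustering
    }


def format_directory(directory):
    """Render the directory as plain text, one topic per line."""
    return "\n".join(
        f"{rep}: {', '.join(members)}" for rep, members in sorted(directory.items())
    )


final_clustering = [
    ["customer", "customer_address", "customer_phone"],
    ["product"],
]
directory = build_directory(final_clustering, shared_columns)
print(format_directory(directory))
```

Every cluster yields exactly one representative in this sketch, which satisfies the "at least one representative table" condition. A variant could return the top few tables per cluster instead.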
Specification