Method and apparatus for learning probabilistic relational models having attribute and link uncertainty and for performing selectivity estimation using probabilistic relational models
First Claim
1. A method for estimating the selectivity of queries in a relational database, comprising the steps of:
- constructing a probabilistic relational model (PRM) from said database; and
performing online selectivity estimation for a particular query.
4 Assignments
0 Petitions
Accused Products
Abstract
The invention comprises a method and apparatus for learning probabilistic models (PRM'"'"'s) with attribute uncertainty. A PRM with attribute uncertainty defines a probability distribution over instantiations of a database. A learned PRM is useful for discovering interesting patterns and dependencies in the data. Unlike many existing techniques, the process is data-driven rather than hypothesis driven. This makes the technique particularly well-suited for exploratory data analysis. In addition, the invention comprises a method and apparatus for handling link uncertainty in PRM'"'"'s. Link uncertainty is uncertainty over which entities are related in our domain. The invention comprises of two mechanisms for modeling link uncertainty: reference uncertainty and existence uncertainty. The invention includes learning algorithms for each form of link uncertainty. The third component of the invention is a technique for performing database selectivity estimation using probabilistic relational models. The invention provides a unified framework for the estimation of query result size for a broad class of queries involving both select and join operations. A single learned model can be used to efficiently estimate query result sizes for a wide collection of potential queries across multiple tables.
264 Citations
15 Claims
-
1. A method for estimating the selectivity of queries in a relational database, comprising the steps of:
-
constructing a probabilistic relational model (PRM) from said database; and
performing online selectivity estimation for a particular query. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
-
-
9. A method for learning probabalistic relational models (PRM) having attribute uncertainty, comprising the steps of:
providing a parameter estimation task by;
inputting a relational schema that specifies a set of classes, having attributes associated with said classes and having relationships between objects in different classes;
providing a fully specified instance of said schema in the form of a training database; and
performing a structure learning task to extract an entire PRM solely from said training database. - View Dependent Claims (10, 11, 12, 14)
-
13. A method for learning probabalistic relational models having link uncertainty, comprising the steps of:
-
providing a mechanism for modeling link uncertainty; and
said mechanism computing sufficient statistics that include existence attributes without adding all nonexistent entities into a database.
-
-
15. A method for learning probabalistic relational models having link uncertainty, comprising the steps of:
-
providing a mechanism for modeling, link uncertainty; and
said mechanism computing sufficient statistics that include reference uncertainty, comprising the steps of;
fixing a set partition attributes ψ
[ρ
]; and
treating a variable Sρ
as any other attribute in a PRM;
wherein scoring success in predicting a value of said attribute given a value of its parents is performed using standard Bayesian methods.
-
Specification