Ranking database query results
Abstract
A system and methods rank results of database queries. An automated approach for ranking database query results is disclosed that leverages data and workload statistics and associations. Ranking functions are based upon the principles of probabilistic models from Information Retrieval that are adapted for structured data. The ranking functions are encoded into an intermediate knowledge representation layer. The system is generic, as the ranking functions can be further customized for different applications. Benefits of the disclosed system and methods include the use of adapted probabilistic information retrieval (PIR) techniques that leverage relational/structured data, such as columns, to provide natural groupings of data values. This permits the inference and use of pair-wise associations between data values across columns, which are usually not possible with text data.
28 Claims
1. A method comprising:
calculating a global atomic quantity for each attribute value in a database, each global atomic quantity representing an unconditional importance level of its respective attribute value;
calculating a conditional atomic quantity for each attribute value in the database, each conditional atomic quantity representing a conditional importance level of an association between a pair of attribute values; and
ranking result tuples of a database query based on global atomic quantities and conditional atomic quantities.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A processor-readable medium comprising processor-executable instructions configured for:
computing atomic probabilities of attribute values in a database, the atomic probabilities computed according to p(y|W), p(y|D), p(x|y,W) and p(x|y,D), wherein x is a specified attribute value, y is an unspecified attribute value, W is a workload of the database, and D is data in the database; and
storing the atomic probabilities as atomic probabilities tables in an intermediate knowledge representation layer of the database.
Dependent claims: 14, 15, 16, 17, 18, 19
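The four atomic probabilities named in claim 13 can be illustrated with a minimal sketch. The claim does not say how the probabilities are estimated, so this example assumes simple relative frequencies: p(y|·) is the fraction of records containing value y, and p(x|y,·) is the fraction of records containing y that also contain x. The function name `atomic_tables`, the dictionary layout, and the toy data and workload are all hypothetical.

```python
from collections import Counter
from itertools import permutations

def atomic_tables(records):
    """Estimate global and conditional tables by relative frequency (an assumption).

    records: list of dicts mapping attribute -> value. For D these are tuples
    of the database; for W these are the conditions of past queries.
    Returns (glob, cond) where glob[y] approximates p(y) and
    cond[(x, y)] approximates p(x | y).
    """
    n = len(records)
    single = Counter()  # how many records contain a given (attribute, value)
    pair = Counter()    # how many records contain an ordered pair (x, y)
    for rec in records:
        items = list(rec.items())
        single.update(items)
        pair.update(permutations(items, 2))
    glob = {y: c / n for y, c in single.items()}
    cond = {(x, y): c / single[y] for (x, y), c in pair.items()}
    return glob, cond

# Hypothetical data D and workload W, for illustration only.
data = [
    {"city": "Seattle", "view": "waterfront"},
    {"city": "Seattle", "view": "street"},
    {"city": "Redmond", "view": "street"},
    {"city": "Redmond", "view": "street"},
]
workload = [
    {"city": "Seattle", "view": "waterfront"},
    {"city": "Seattle"},
    {"view": "waterfront"},
    {"city": "Seattle", "view": "waterfront"},
]

p_y_D, p_xy_D = atomic_tables(data)      # p(y|D) and p(x|y,D)
p_y_W, p_xy_W = atomic_tables(workload)  # p(y|W) and p(x|y,W)
```

Keeping the four dictionaries separate from the query logic mirrors the claim's idea of storing the tables in an intermediate layer, although a real system would presumably persist them as database tables rather than in-memory dictionaries.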
20. A computer comprising:
a pre-processing component configured to compute a global atomic probability and a conditional atomic probability for each attribute value in a database; and
a query processing component configured to analyze a database query and to rank result tuples of the query based on tuple scores calculated from the atomic probabilities.
Dependent claims: 21, 22, 23, 24, 25, 26, 27, 28
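The query processing step of claim 20 can be sketched as follows. The claims listed here do not state how the atomic probabilities are combined into a tuple score, so this example assumes one plausible combination: for each unspecified value y of a result tuple, multiply the ratio p(y|W)/p(y|D) by the ratios p(x|y,W)/p(x|y,D) over the specified values x, working in log space. The table contents, the smoothing constant `EPS`, and the names `tuple_score` and `rank` are hypothetical.

```python
import math

EPS = 1e-6  # assumed fallback for value pairs missing from a table

# Hypothetical pre-computed tables (illustrative numbers only).
P_Y_W = {("view", "waterfront"): 0.6, ("view", "street"): 0.1}
P_Y_D = {("view", "waterfront"): 0.2, ("view", "street"): 0.8}
SEATTLE = ("city", "Seattle")
P_XY_W = {(SEATTLE, ("view", "waterfront")): 0.5, (SEATTLE, ("view", "street")): 0.5}
P_XY_D = {(SEATTLE, ("view", "waterfront")): 0.5, (SEATTLE, ("view", "street")): 0.25}

def tuple_score(row, query):
    """Log-score of one result tuple under the assumed product-of-ratios form."""
    specified = list(query.items())                               # the x values
    unspecified = [(a, v) for a, v in row.items() if a not in query]  # the y values
    score = 0.0
    for y in unspecified:
        score += math.log(P_Y_W.get(y, EPS) / P_Y_D.get(y, EPS))
        for x in specified:
            score += math.log(P_XY_W.get((x, y), EPS) / P_XY_D.get((x, y), EPS))
    return score

def rank(rows, query, k=10):
    """Select tuples that satisfy the query, then return the top k by score."""
    matches = [r for r in rows if all(r.get(a) == v for a, v in query.items())]
    return sorted(matches, key=lambda r: tuple_score(r, query), reverse=True)[:k]

rows = [
    {"city": "Seattle", "view": "street"},
    {"city": "Seattle", "view": "waterfront"},
    {"city": "Redmond", "view": "street"},
]
ranked = rank(rows, {"city": "Seattle"})
```

With these made-up numbers the waterfront tuple ranks above the street tuple, because waterfront is requested in the workload more often than it occurs in the data. Summing logarithms rather than multiplying raw ratios is a design choice to avoid underflow when many attributes are involved.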
Specification