Method and apparatus for optimizing the evaluation of semantic web queries
First Claim
1. A method for optimizing semantic web queries, the method comprising:
- receiving a semantic query over a database, the semantic query comprising a plurality of triple patterns;
creating a data flow graph that captures inter-relationships resulting from sharing of at least one of common variables and constraints among the plurality of triple patterns;
using the data flow graph and costs estimates for traversing all triple patterns in the semantic query to determine an order of optimization of the plurality of triple patterns and a query plan for executing the semantic query; and
using the query plan to evaluate the semantic query over the database;
wherein the data flow graph comprises a plurality of triple pattern and access method pair nodes connected by a plurality of edges representing dependencies among execution of the triple patterns in the semantic query; and
wherein each edge in the data flow graph comprises a cost associated with evaluating a given triple pattern by a given access method with respect to statistics of the database for a triple pattern and access pair node associated with that edge; and
the step of using the data flow graph and costs estimates further comprises determining the order of optimization of the plurality of triple patterns and the query plan for executing the semantic query such that edges in an optimal flow tree through the data flow graph comprise a minimum cumulative cost.
1 Assignment
0 Petitions
Accused Products
Abstract
A semantic query over an RDF database is received with RDF database statistics and access methods for evaluating triple patterns in the query. The semantic query is expressed as a parse tree containing triple patterns and logical relationships among the triple patterns. The parse tree and access methods create a data flow graph containing a plurality of triple pattern and access method pair nodes connected by a plurality of edges, and an optimal flow tree through the data flow graph is determined such that costs are minimized and all triple patterns in the semantic query are contained in the optimal flow tree. A structure independent execution tree defining a sequence of evaluation through the optimal flow tree is created and is transformed into a database structure dependent query plan. This is used to create an SQL query that is used to evaluate the semantic query over the RDF database.
-
Citations
16 Claims
-
1. A method for optimizing semantic web queries, the method comprising:
-
receiving a semantic query over a database, the semantic query comprising a plurality of triple patterns; creating a data flow graph that captures inter-relationships resulting from sharing of at least one of common variables and constraints among the plurality of triple patterns; using the data flow graph and costs estimates for traversing all triple patterns in the semantic query to determine an order of optimization of the plurality of triple patterns and a query plan for executing the semantic query; and using the query plan to evaluate the semantic query over the database; wherein the data flow graph comprises a plurality of triple pattern and access method pair nodes connected by a plurality of edges representing dependencies among execution of the triple patterns in the semantic query; and wherein each edge in the data flow graph comprises a cost associated with evaluating a given triple pattern by a given access method with respect to statistics of the database for a triple pattern and access pair node associated with that edge; and
the step of using the data flow graph and costs estimates further comprises determining the order of optimization of the plurality of triple patterns and the query plan for executing the semantic query such that edges in an optimal flow tree through the data flow graph comprise a minimum cumulative cost. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
-
-
9. A non-transitory computer-readable storage medium containing a computer-readable code that when read by a computer causes the computer to perform a method for optimizing semantic web queries, the method comprising:
-
receiving a semantic query over a database, the semantic query comprising a plurality of triple patterns; creating a data flow graph that captures inter-relationships resulting from sharing of at least one of common variables and constraints among the plurality of triple patterns; using the data flow graph and costs estimates for traversing all triple patterns in the semantic query to determine an order of optimization of the plurality of triple patterns and a query plan for executing the semantic query; and using the query plan to evaluate the semantic query over the database; wherein the data flow graph comprises a plurality of triple pattern and access method pair nodes connected by a plurality of edges representing dependencies among execution of the triple patterns in the semantic query, each edge in the data flow graph comprises a cost associated with evaluating a given triple pattern by a given access method with respect to statistics of the database for a triple pattern and access pair node associated with that edge and the step of using the data flow graph and costs estimates further comprises determining the order of optimization of the plurality of triple patterns and the query plan for executing the semantic query such that edges in an optimal flow tree through the data flow graph comprise a minimum cumulative cost. - View Dependent Claims (10, 11, 12, 13, 14, 15, 16)
-
Specification