Query optimization for SPARQL
First Claim
1. A computer implemented method for efficiently planning and executing a SPARQL query, in any of the variants of the SPARQL query language, against a data source compliant with Resource Description Framework (RDF) standard, the method comprising:
- receiving a SPARQL query graph pattern;
converting the SPARQL query to a semQA2 query and creating a semQA2 parse tree by replacing syntax expression in the SPARQL query with algebraic operators, wherein LeftJoin operators in the SPARQL query are not eliminated;
performing filter pushdown on the semQA2 parse tree, including filters with LeftJoin operators, so as to produce filters as close to leaf nodes of the semQA2 parse tree as possible;
executing a query planning mechanism that uses semQA2 algebraic equivalences and filter pushdown to find multiple graph patterns equivalent to the semQA2 query;
executing a cost function that provides a cost of execution of each of the multiple graph patterns equivalent to the semQA2 query;
determining an efficient query plan for execution of the semQA2 query, by selecting, from the multiple graph patterns equivalent to the semQA2 query, a particular graph pattern with lowest cost of execution, according to the cost function; and
executing the particular graph pattern against a RDF data source and retrieving results compliant with the SPARQL specification.
2 Assignments
0 Petitions
Accused Products
Abstract
The present invention relates to computer implemented methods and system for creating and executing an query plan for SPARQL Protocol And Query Language (SPARQL) queries. The methods and systems are designed to accept as input a query in SPARQL syntax, convert this query to semQA2 and generate a parse tree, perform filter pushdown, generate an efficient query plan potentially using a cost function, and execute this query plan against data sources complying to or modeled as Resource Description Framework (RDF). The result of these methods and of the systems implementing these methods is a set of triples contained in the data sources that comprise a solution of the SPARQL query provided.
-
Citations
5 Claims
-
1. A computer implemented method for efficiently planning and executing a SPARQL query, in any of the variants of the SPARQL query language, against a data source compliant with Resource Description Framework (RDF) standard, the method comprising:
-
receiving a SPARQL query graph pattern; converting the SPARQL query to a semQA2 query and creating a semQA2 parse tree by replacing syntax expression in the SPARQL query with algebraic operators, wherein LeftJoin operators in the SPARQL query are not eliminated; performing filter pushdown on the semQA2 parse tree, including filters with LeftJoin operators, so as to produce filters as close to leaf nodes of the semQA2 parse tree as possible; executing a query planning mechanism that uses semQA2 algebraic equivalences and filter pushdown to find multiple graph patterns equivalent to the semQA2 query; executing a cost function that provides a cost of execution of each of the multiple graph patterns equivalent to the semQA2 query; determining an efficient query plan for execution of the semQA2 query, by selecting, from the multiple graph patterns equivalent to the semQA2 query, a particular graph pattern with lowest cost of execution, according to the cost function; and executing the particular graph pattern against a RDF data source and retrieving results compliant with the SPARQL specification. - View Dependent Claims (2, 3, 4)
-
-
5. A data processing system for executing SPARQL queries, the system comprising:
-
a display device; and a processor configured to; receive a SPARQL query graph pattern; convert the SPARQL query to a semQA2 parse tree by replacing syntax expression in the SPARQL query with algebraic operators, wherein LeftJoin operators in the SPARQL query are not eliminated; perform filter pushdown on the semQA2 parse tree, including filters with LeftJoin operators, so as to produce filters as close to leaf nodes of the semQA2 parse tree as possible; execute a query planning mechanism that uses semQA2 algebraic equivalences and filter pushdown to find multiple graph patterns equivalent to the semQA2 query; execute a cost function that provides a cost of execution of each of the multiple graph patterns equivalent to the semQA2 query; determine an efficient query plan for execution of the semQA2 query, by selecting, from the multiple graph patterns equivalent to the semQA2 query, a particular graph pattern with lowest cost of execution, according to the cost function; and execute the query against a RDF data source and retrieving results compliant with the SPARQL specification.
-
Specification