SCALABLE DISTRIBUTED PROCESSING OF RDF DATA
First Claim
1. A method comprising:
receiving, with a database system, a query for a Resource Description Framework (RDF) database that stores a plurality of data chunks to one or more storage drives, wherein each of the plurality of data chunks includes a plurality of triples of the RDF database;
accessing an index that indexes one or more of the data chunks to identify a subset of the data chunks relevant to the query;
loading the subset of the data chunks to a main memory associated with the database system; and
executing the query only against triples included within the subset of the data chunks loaded to the main memory to obtain a query result.
Abstract
In general, techniques are described for an RDF (Resource Description Framework) database system which can scale to huge size for realistic data sets of practical interest. In some examples, a database system includes a Resource Description Framework (RDF) database that stores a plurality of data chunks to one or more storage drives, wherein each of the plurality of data chunks includes a plurality of triples of the RDF database. The database system also includes a working memory, a query interface that receives a query for the RDF database, a SPARQL engine that identifies a subset of the data chunks relevant to the query, and an index interface that includes one or more bulk loaders that load the subset of the data chunks to the working memory. The SPARQL engine executes the query only against triples included within the loaded subset of the data chunks to obtain a query result.
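The flow summarized above (identify the relevant chunks, bulk-load only those chunks into working memory, then execute the query against the loaded triples) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the in-memory `STORAGE` dictionary standing in for storage drives, the predicate-to-chunk shape of the index, the helper names `build_index`, `load_chunks`, and `execute`, and the single triple pattern standing in for a full SPARQL query.

```python
from collections import defaultdict

# Hypothetical stand-in for storage drives: chunk id -> list of
# (subject, predicate, object) triples. Not the patent's actual layout.
STORAGE = {
    "chunk0": [("alice", "knows", "bob"), ("alice", "age", "30")],
    "chunk1": [("bob", "knows", "carol")],
    "chunk2": [("carol", "livesIn", "paris")],
}


def build_index(storage):
    """Map each predicate to the chunk ids that contain it (assumed index shape)."""
    index = defaultdict(set)
    for chunk_id, triples in storage.items():
        for _, predicate, _ in triples:
            index[predicate].add(chunk_id)
    return index


def load_chunks(storage, chunk_ids):
    """Stand-in for a bulk loader: copy only the selected chunks into memory."""
    return {cid: list(storage[cid]) for cid in chunk_ids}


def execute(pattern, storage, index):
    """Match one triple pattern (None is a wildcard) against relevant chunks only."""
    s, p, o = pattern
    # Without a bound predicate the index cannot prune, so every chunk is relevant.
    relevant = index.get(p, set()) if p is not None else set(storage)
    memory = load_chunks(storage, relevant)
    results = [
        t
        for cid in sorted(memory)
        for t in memory[cid]
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]
    return results, sorted(memory)


INDEX = build_index(STORAGE)
rows, loaded = execute((None, "knows", None), STORAGE, INDEX)
print(loaded)  # chunk ids that were loaded for this pattern
print(rows)    # triples matching the pattern
```

In this sketch the pattern `(None, "knows", None)` touches only the chunks that the index lists for the `knows` predicate, so `chunk2` is never loaded. A real system would presumably index on more than the predicate and evaluate full SPARQL graph patterns.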
121 Citations
21 Claims
1. A method comprising:
receiving, with a database system, a query for a Resource Description Framework (RDF) database that stores a plurality of data chunks to one or more storage drives, wherein each of the plurality of data chunks includes a plurality of triples of the RDF database;
accessing an index that indexes one or more of the data chunks to identify a subset of the data chunks relevant to the query;
loading the subset of the data chunks to a main memory associated with the database system; and
executing the query only against triples included within the subset of the data chunks loaded to the main memory to obtain a query result.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18
13. (canceled)
19. A database system comprising:
a Resource Description Framework (RDF) database that stores a plurality of data chunks to one or more storage drives, wherein each of the plurality of data chunks includes a plurality of triples of the RDF database;
a working memory;
a query interface that receives a query for the RDF database;
a query parser/evaluator that accesses an index that indexes one or more of the data chunks to identify a subset of the data chunks relevant to the query;
an index interface that includes one or more bulk loaders that load the subset of the data chunks to the working memory; and
a SPARQL Protocol and RDF Query Language (SPARQL) engine that executes the query only against triples included within the subset of the data chunks loaded to the main memory to obtain a query result.
Dependent claims: 20
21. A computer-readable storage device comprising instructions for causing one or more programmable processors to:
receive, with a database system, a query for a Resource Description Framework (RDF) database that stores a plurality of data chunks to one or more storage drives, wherein each of the plurality of data chunks includes a plurality of triples of the RDF database;
access an index that indexes one or more of the data chunks to identify a subset of the data chunks relevant to the query;
load the subset of the data chunks to a main memory associated with the database system; and
execute the query only against triples included within the subset of the data chunks loaded to the main memory to obtain a query result.
Specification