PLATFORM MANAGEMENT OF INTEGRATED ACCESS OF PUBLIC AND PRIVATELY-ACCESSIBLE DATASETS UTILIZING FEDERATED QUERY GENERATION AND QUERY SCHEMA REWRITING OPTIMIZATION
Abstract
Various techniques are described for platform management of integrated access of public and privately-accessible datasets utilizing federated query generation and query schema rewriting optimization, including receiving at a dataset access platform a query formatted according to a first data schema, generating a copy of the query, saving the query and the copy to a datastore, parsing the copy of the query in the first schema using an inference engine, determining whether the query comprises data associated with an access control condition associated with accessing the dataset, the access control condition being configured to indicate whether the query is permitted to access the dataset, and rewriting, using a proxy server, the copy of the query in a second schema, and optimizing the rewriting by identifying a database engine to execute the query and including other data converted into another triple associated with an attribute of the query.
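The access control determination mentioned in the abstract can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the patented implementation: the `Query` class, the `ACCESS_RULES` registry, and the `is_permitted` function are hypothetical names, and the rule that a public dataset accepts any query is assumed for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Query:
    """Hypothetical query record; `credential` models access-control data."""
    text: str
    dataset: str
    credential: Optional[str] = None


# Hypothetical registry: dataset name -> accepted credentials,
# or None when the dataset is publicly accessible.
ACCESS_RULES = {
    "weather": None,
    "payroll": {"token-123"},
}


def is_permitted(query: Query, rules=ACCESS_RULES) -> bool:
    """Return True if the query may access its target dataset."""
    if query.dataset not in rules:
        return False  # unknown dataset: deny by default (assumption)
    accepted = rules[query.dataset]
    if accepted is None:
        return True  # public dataset: no credential required
    return query.credential in accepted
```

In this sketch a query against a private dataset is permitted only when it carries a credential listed for that dataset; queries against unregistered datasets are denied.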
20 Claims
1. A method, comprising:
receiving a query at a dataset access platform, the query being formatted according to a first data schema, the query comprising data associated with a request to access a dataset;
generating a copy of the query, the copy being identified as a master and configured to be stored in a datastore;
parsing the copy of the query in the first schema, the parsing being performed by an inference engine configured to identify the dataset, to infer an attribute associated with the query, and to generate one or more data links between the dataset and another dataset accessible by the dataset access platform;
rewriting, using a proxy server, the copy of the query in a second schema and, if the attribute indicates the query is configured to provide authentication data to access the dataset, the rewriting comprising converting the copy of the query into a triple and converting the attribute into another triple; and
optimizing the rewriting, the optimizing comprising identifying a database engine to execute the query and converting other data to a further triple, the other data and the further triple being associated with a path configured to route the query or the copy of the query from the dataset access platform to the dataset.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19
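The steps recited in claim 1 can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the claimed implementation: every function name, the SQL-like first schema, the dictionary used as the second schema, the `auth_token` marker, and the predicate names (`linkedTo`, `hasText`, `hasAttribute`, `routedVia`) are hypothetical, and a regular expression stands in for the inference engine.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

Triple = tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class ParsedQuery:
    dataset: str
    attribute: Optional[str]
    links: list[Triple] = field(default_factory=list)


def copy_as_master(query: str, datastore: dict) -> str:
    """Store an unmodified copy of the query, identified as the master."""
    datastore["master"] = query
    return datastore["master"]


def parse(query_copy: str, known_datasets: set[str]) -> ParsedQuery:
    """Toy stand-in for the inference engine: identify the dataset,
    infer an attribute, and link the dataset to other known datasets."""
    match = re.search(r"\bFROM\s+(\w+)", query_copy, re.IGNORECASE)
    if match is None:
        raise ValueError("no dataset found in query")
    dataset = match.group(1)
    # Assumption: the presence of 'auth_token' means the query supplies
    # authentication data.
    attribute = "provides_authentication" if "auth_token" in query_copy else None
    links = [(dataset, "linkedTo", other)
             for other in sorted(known_datasets - {dataset})]
    return ParsedQuery(dataset, attribute, links)


def rewrite(query_copy: str, parsed: ParsedQuery) -> tuple[dict, list[Triple]]:
    """Rewrite the copy in a (hypothetical) second schema; emit triples
    only when the inferred attribute indicates authentication data."""
    second_schema = {"target": parsed.dataset, "statement": query_copy}
    triples: list[Triple] = []
    if parsed.attribute == "provides_authentication":
        triples.append(("query:1", "hasText", query_copy))
        triples.append(("query:1", "hasAttribute", parsed.attribute))
    return second_schema, triples


def optimize(triples: list[Triple], parsed: ParsedQuery,
             engines: dict[str, str], path: str) -> tuple[str, list[Triple]]:
    """Pick a database engine and add a further triple describing the route."""
    engine = engines.get(parsed.dataset, "default-engine")
    return engine, triples + [("query:1", "routedVia", path)]
```

A caller would run the functions in the claimed order: `copy_as_master`, then `parse`, then `rewrite`, then `optimize`. The master copy in the datastore is never modified; only the working copy is rewritten.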
13. A system, comprising:
a datastore configured to store data associated with a query and to store other data associated with a path configured to route the query or a copy of the query from a dataset access platform to a dataset; and
a logic module configured
to receive the query at the dataset access platform, the query being formatted according to a first data schema and comprising a request to access the dataset,
to generate the copy of the query, the copy being identified as a master and configured to be stored in the datastore,
to parse the copy of the query in the first schema using an inference engine, the inference engine being configured to identify the dataset, to infer an attribute associated with the query, and to generate one or more data links between the dataset and another dataset accessible by the dataset access platform,
to rewrite, using a proxy server, the copy of the query in a second schema and, if the attribute indicates the query is configured to provide authentication data to access the dataset, to convert the copy of the query into a triple and to convert the attribute into another triple, and
to optimize the rewrite to include identifying a database engine to execute the query and to convert the other data to a further triple.
20. A non-transitory computer readable medium having one or more computer program instructions configured to perform a method, the method comprising:
receiving a query at a dataset access platform, the query being formatted according to a first data schema, the query comprising data associated with a request to access a dataset;
generating a copy of the query, the copy being identified as a master and configured to be stored in a datastore;
parsing the copy of the query in the first schema, the parsing being performed by an inference engine configured to identify the dataset, to infer an attribute associated with the query, and to generate one or more data links between the dataset and another dataset accessible by the dataset access platform;
rewriting, using a proxy server, the copy of the query in a second schema and, if the attribute indicates the query is configured to provide authentication data to access the dataset, the rewriting comprising converting the copy of the query into a triple and converting the attribute into another triple; and
optimizing the rewriting, the optimizing comprising identifying a database engine to execute the query and converting other data to a further triple, the other data and the further triple being associated with a path configured to route the query or the copy of the query from the dataset access platform to the dataset.
Specification