Query generation for collaborative datasets
First Claim
1. A method comprising:
receiving, into a collaborative dataset consolidation system, data representing a query of a consolidated dataset comprising a plurality of datasets formatted as plurality of atomized datasets;
analyzing the query to classify portions of the query to form classified query portions;
partitioning the query into a plurality of sub-queries as a function of a classification type associated with each of the classified query portions, at least one classified query portion being classified by a type of dataset;
determining a type of triple store loaded based on the type of dataset;
executing the query and the sub-queries as a federated query to one or more of the plurality of datasets using one or more triple stores stored in one or more distributed data repositories configured to be stored and accessed by the collaborative data consolidation system, each of the one or more triple stores being further formatted to query using a format associated with at least one of the one or more of the plurality of data sets, at least one of the one or more triple stores being the type of triple store; and
retrieving responsive data representing a query result returned in response to the query or sub-queries from at least one of the one or more distributed data repositories.
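The steps recited in the claim (classify, partition, select a triple store type, execute as a federated query, retrieve) can be illustrated with a minimal sketch. This is not the patented implementation. It is a toy model under stated assumptions: the catalog `DATASET_TYPE`, the rule table `STORE_TYPE_FOR`, the in-memory `REPOSITORIES`, and all function names are hypothetical, and a query is simplified to a list of (dataset name, triple pattern) pairs rather than a real query language.

```python
from collections import defaultdict

# Hypothetical catalog: dataset name -> dataset type (the classification used here).
DATASET_TYPE = {"sales": "tabular", "people": "graph"}

# Hypothetical rule: dataset type -> type of triple store to use.
STORE_TYPE_FOR = {"tabular": "row_store", "graph": "graph_store"}

# Toy stand-in for distributed repositories: store type -> set of (s, p, o) triples.
REPOSITORIES = {
    "row_store": {("order1", "amount", "40"), ("order1", "buyer", "alice")},
    "graph_store": {("alice", "knows", "bob")},
}


def classify(query):
    """Tag each (dataset, pattern) portion of the query with its dataset type."""
    return [(DATASET_TYPE[ds], ds, pattern) for ds, pattern in query]


def partition(classified):
    """Group classified portions into sub-queries, one per classification type."""
    sub_queries = defaultdict(list)
    for dataset_type, _ds, pattern in classified:
        sub_queries[dataset_type].append(pattern)
    return dict(sub_queries)


def matches(pattern, triple):
    """A pattern term of None acts as a wildcard; other terms must match exactly."""
    return all(p is None or p == t for p, t in zip(pattern, triple))


def run_federated(query):
    """Run each sub-query against the store type chosen for its dataset type."""
    results = []
    for dataset_type, patterns in partition(classify(query)).items():
        store = REPOSITORIES[STORE_TYPE_FOR[dataset_type]]
        for pattern in patterns:
            results.extend(sorted(t for t in store if matches(pattern, t)))
    return results


query = [
    ("sales", ("order1", "buyer", None)),
    ("people", ("alice", "knows", None)),
]
print(run_federated(query))
```

In this sketch the classification type is simply the dataset type, which corresponds to the claim's requirement that at least one query portion be classified by a type of dataset. A real system would presumably dispatch sub-queries to remote triple stores rather than to in-process sets.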
Abstract
Various embodiments relate generally to data science and data analysis, computer software and systems, and wired and wireless network communications to provide an interface between repositories of disparate datasets and computing machine-based entities that seek access to the datasets, and, more specifically, to a computing and data storage platform that facilitates consolidation of one or more datasets, whereby a collaborative data layer and associated logic facilitate, for example, efficient access to, and implementation of, collaborative datasets. In some examples, a method may include receiving data representing a query of a consolidated dataset that may include datasets formatted as atomized datasets, analyzing the query to classify portions of the query to form classified query portions, partitioning the query into sub-queries as a function of a classification type for each of the classified query portions, and retrieving data representing a query result from distributed data repositories.
210 Citations
19 Claims
1. A method comprising:
receiving, into a collaborative dataset consolidation system, data representing a query of a consolidated dataset comprising a plurality of datasets formatted as plurality of atomized datasets;
analyzing the query to classify portions of the query to form classified query portions;
partitioning the query into a plurality of sub-queries as a function of a classification type associated with each of the classified query portions, at least one classified query portion being classified by a type of dataset;
determining a type of triple store loaded based on the type of dataset;
executing the query and the sub-queries as a federated query to one or more of the plurality of datasets using one or more triple stores stored in one or more distributed data repositories configured to be stored and accessed by the collaborative data consolidation system, each of the one or more triple stores being further formatted to query using a format associated with at least one of the one or more of the plurality of data sets, at least one of the one or more triple stores being the type of triple store; and
retrieving responsive data representing a query result returned in response to the query or sub-queries from at least one of the one or more distributed data repositories.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
19. A collaborative dataset consolidation system, comprising:
a data store configured to receive, into the collaborative dataset consolidation system, data representing a query of a consolidated dataset comprising a plurality of datasets formatted as plurality of atomized datasets; and
a dataset query engine configured to analyze the query to classify portions of the query to form classified query portions, to partition the query into a plurality of sub-queries as a function of a classification type associated with each of the classified query portions, at least one classified query portion being classified by a type of dataset, to determine a type of triple store loaded based on the type of dataset, to execute the query and the sub-queries as a federated query to one or more of the plurality of datasets using one or more triple stores stored in one or more distributed data repositories configured to be stored and accessed by the collaborative data consolidation system, each of the one or more triple stores being further formatted to query using a format associated with at least one of the one or more of the plurality of data sets, at least one of the one or more triple stores being the type of triple store, and to retrieve responsive data representing a query result returned in response to the query or sub-queries from at least one of the one or more distributed data repositories.
Specification