Consolidator platform to implement collaborative datasets via distributed computer networks
First Claim
1. A system comprising:
a data ingestion controller configured to receive multiple data files as differently-formatted datasets, wherein at least a subset of disparate data repositories include differently-formatted datasets from which one or more atomized datasets are generated and stored in one or more repositories, at least one of which includes a triplestore; and
a dataset query engine configured to receive data representing a query associated with a user account identifier to access a dataset, the dataset being associated with another user account identifier and stored in the triplestore, and to identify datasets relevant to the query, the datasets being disposed in disparate data repositories, the dataset query engine further configured to identify a level of authorization associated with the user account identifier to facilitate access by the query of a secured set of protected data in the dataset associated with the another user account identifier, to generate one or more sub-queries based on the query to transmit a sub-query via a network to access the disparate data repositories, the sub-query being configured to access the secured set of protected data in the dataset stored in the triplestore based on the level of authorization, to retrieve data representing query results from the accessed disparate data repositories, at least two of the disparate data repositories being associated with different entities, and to generate a notification of execution of the query associated with the user account identifier to transmit the notification via the network for presentation as activity data in a user interface associated with the another user identifier to notify a user associated with the another user account identifier that another user associated with the user account identifier accessed the dataset to facilitate collaborative data-related activity among different entities, wherein the level of authorization associated with the user account identifier is configured to facilitate per-dataset authorization to provide access to the secured set of protected data, which is less than a total number of datasets associated with the another user account identifier, wherein the datasets comprise atomized datasets that include one or more subsets of linked data points.
Abstract
Various embodiments relate generally to data science and data analysis, computer software and systems, and wired and wireless network communications to provide an interface between repositories of disparate datasets and computing machine-based entities that seek access to the datasets, and, more specifically, to a computing and data storage platform that facilitates consolidation of one or more datasets, whereby a collaborative data layer and associated logic facilitate, for example, efficient access to, and implementation of, collaborative datasets. In some examples, a system may include data ingestion controller configured to format datasets to form a first and a second atomized dataset, the second atomized dataset including the first atomized dataset and one or more other atomized datasets. The system may include a dataset query engine configured to identify a portion of a dataset relevant to a query, and to retrieve query results from at least one of different data repositories.
17 Claims
1. A system comprising:
a data ingestion controller configured to receive multiple data files as differently-formatted datasets, wherein at least a subset of disparate data repositories include differently-formatted datasets from which one or more atomized datasets are generated and stored in one or more repositories, at least one of which includes a triplestore; and
a dataset query engine configured to receive data representing a query associated with a user account identifier to access a dataset, the dataset being associated with another user account identifier and stored in the triplestore, and to identify datasets relevant to the query, the datasets being disposed in disparate data repositories, the dataset query engine further configured to identify a level of authorization associated with the user account identifier to facilitate access by the query of a secured set of protected data in the dataset associated with the another user account identifier, to generate one or more sub-queries based on the query to transmit a sub-query via a network to access the disparate data repositories, the sub-query being configured to access the secured set of protected data in the dataset stored in the triplestore based on the level of authorization, to retrieve data representing query results from the accessed disparate data repositories, at least two of the disparate data repositories being associated with different entities, and to generate a notification of execution of the query associated with the user account identifier to transmit the notification via the network for presentation as activity data in a user interface associated with the another user identifier to notify a user associated with the another user account identifier that another user associated with the user account identifier accessed the dataset to facilitate collaborative data-related activity among different entities, wherein the level of authorization associated with the user account identifier is configured to facilitate per-dataset authorization to provide access to the secured set of protected data, which is less than a total number of datasets associated with the another user account identifier, wherein the datasets comprise atomized datasets that include one or more subsets of linked data points.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
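The query-engine element of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the class names `Repository` and `DatasetQueryEngine`, the in-memory dictionaries standing in for networked repositories and a triplestore, and the string-based activity feed are all assumptions made for illustration. The sketch shows a per-dataset authorization check, one sub-query per repository holding an authorized dataset, and a notification recorded for the dataset owner.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical stand-in for one repository operated by one entity."""
    entity: str
    datasets: dict  # dataset id -> list of (subject, predicate, object) triples

    def run_sub_query(self, dataset_id, predicate):
        # A sub-query here simply filters one dataset's triples by predicate.
        return [t for t in self.datasets.get(dataset_id, []) if t[1] == predicate]


@dataclass
class DatasetQueryEngine:
    repositories: list
    owners: dict   # dataset id -> owning account id
    grants: set    # {(account id, dataset id)}: per-dataset authorization
    activity: dict = field(default_factory=dict)  # owner -> notifications

    def authorized(self, account, dataset_id):
        # Access is decided per dataset, not per owner.
        return (self.owners.get(dataset_id) == account
                or (account, dataset_id) in self.grants)

    def query(self, account, dataset_ids, predicate):
        results = []
        for dataset_id in dataset_ids:
            if not self.authorized(account, dataset_id):
                continue  # skip datasets the account may not read
            for repo in self.repositories:
                if dataset_id in repo.datasets:
                    results.extend(repo.run_sub_query(dataset_id, predicate))
            owner = self.owners.get(dataset_id)
            if owner is not None and owner != account:
                # Record activity data for the owner of the accessed dataset.
                self.activity.setdefault(owner, []).append(
                    f"{account} queried {dataset_id}")
        return results


# Two repositories belonging to different entities; alice owns both datasets
# but has granted bob access to only one of them.
repo_a = Repository("entity-a", {"sales": [("sales/row/0", "region", "EMEA")]})
repo_b = Repository("entity-b", {"payroll": [("payroll/row/0", "region", "APAC")]})
engine = DatasetQueryEngine(
    repositories=[repo_a, repo_b],
    owners={"sales": "alice", "payroll": "alice"},
    grants={("bob", "sales")},
)
results = engine.query("bob", ["sales", "payroll"], "region")
```

In this example the grant covers fewer datasets than the owner holds, so the query returns data only from `sales`, and the owner's activity list gains a single entry.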
12. A system comprising:
a data ingestion controller configured to; receive a data file including a dataset, and to format the dataset to form an atomized dataset including atomized data points each including data representing at least two objects and an association between the two objects, the data ingestion controller is further configured to form another atomized dataset including the atomized dataset and other atomized datasets, wherein the data ingestion controller is configured to receive multiple data files, including the data file, as differently-formatted datasets, and format the differently-formatted datasets to form atomized datasets and stored in one or more repositories, at least one of which includes a triplestore, the another atomized dataset including data originating from the differently-formatted datasets, at least two of the differently-formatted datasets being associated with different entities; and
a dataset query engine configured to receive data representing a query being associated with a user account identifier to access a dataset, the dataset being associated with another user account identifier and stored in the triplestore, the dataset query engine further configured to identify a subset of the another atomized dataset relevant to the query, wherein portions of the another atomized dataset are disposed in different data repositories storing data as the differently-formatted datasets, the dataset query engine also configured to identify a level of authorization associated with the user account identifier to facilitate access by the query of a secured set of protected data in the dataset associated with the another user account identifier, generate a plurality of sub-queries each of which is configured to access to transmit a sub-query via a network at least one of the different data repositories, the sub-query being configured to access the secured set of protected data in the dataset stored in the triplestore based on the level of authorization, to retrieve data representing query results via at least a portion of the atomized datasets from a subset of the different data repositories that store the data as the differently-formatted dataset, and to generate a notification of execution of the query associated with the user account identifier to transmit the notification via the network for presentation in a user interface associated with the another user identifier to notify a user associated with the another user account identifier that another user associated with the user account identifier accessed the dataset to facilitate collaborative data-related activity among different entities, wherein the level of authorization associated with the user account identifier is configured to facilitate per-dataset authorization to provide access to the secured set of protected data, which is less than a total number of datasets associated with the another user account identifier, wherein the datasets comprise atomized datasets that include one or more subsets of linked data points.
Dependent claims: 13, 14, 15, 16, 17
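The ingestion element of claim 12 describes atomized data points that each represent two objects and an association between them, formed from differently-formatted files. The following is a minimal sketch of one way to read that, assuming each data point is a (subject, predicate, object) tuple and that the input formats are CSV and JSON. The function names, the `dataset_id/row/i` subject scheme, and the sample data are hypothetical and are not taken from the patent.

```python
import csv
import io
import json


def atomize_rows(dataset_id, rows):
    """Turn an iterable of dict-like rows into (subject, predicate, object) triples."""
    triples = []
    for i, row in enumerate(rows):
        subject = f"{dataset_id}/row/{i}"  # assumed identifier scheme
        for column, value in row.items():
            # Each triple links two objects (row and value) by an association (column).
            triples.append((subject, column, value))
    return triples


def atomize_csv(dataset_id, text):
    # csv.DictReader yields one dict per data row, keyed by the header line.
    return atomize_rows(dataset_id, csv.DictReader(io.StringIO(text)))


def atomize_json(dataset_id, text):
    # Assumes the JSON document is a list of flat objects.
    return atomize_rows(dataset_id, json.loads(text))


# Two differently-formatted inputs atomized into one combined dataset.
csv_triples = atomize_csv("ds-csv", "city,population\nAustin,950000\n")
json_triples = atomize_json("ds-json", '[{"city": "Austin", "state": "TX"}]')
combined = csv_triples + json_triples
```

Here `combined` plays the role of the "another atomized dataset" that includes data originating from both source formats; a real system would store these triples in a triplestore rather than a Python list.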
Specification