AGGREGATION OF ANCILLARY DATA ASSOCIATED WITH SOURCE DATA IN A SYSTEM OF NETWORKED COLLABORATIVE DATASETS
First Claim
1. A method comprising:
receiving data representing a dataset having a data format into a dataset ingestion controller configured to form a collaborative dataset;
analyzing a subset of the data to determine dataset attributes;
generating descriptor data based on the dataset attributes associated with the subset of the data;
converting the dataset from the data format at a format converter to form an atomized dataset in a graph data arrangement, the atomized dataset being the collaborative dataset including atomized descriptor data and atomized source data;
associating a unit of the descriptor data to a corresponding unit of supra-descriptor data to form associations;
forming another graph data arrangement including the supra-descriptor data and the associations to the descriptor data, wherein the another graph data arrangement includes pointers to a plurality of atomized collaborative datasets.
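The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: it assumes CSV input, represents each graph as a plain list of (subject, predicate, object) triples, and uses hypothetical names throughout (`infer_type`, `ingest`, `build_supra_graph`, the `hasDatatype`, `classifiedAs`, and `pointsTo` predicates, and the `catalog` node).

```python
import csv
import io
from collections import Counter


def infer_type(values):
    """Guess a column datatype from sampled strings (hypothetical heuristic)."""
    def kind(v):
        try:
            int(v)
            return "integer"
        except ValueError:
            pass
        try:
            float(v)
            return "decimal"
        except ValueError:
            return "string"

    counts = Counter(kind(v) for v in values if v not in ("", None))
    return counts.most_common(1)[0][0] if counts else "string"


def ingest(dataset_id, text, sample_rows=2):
    """Receive CSV text, derive descriptor data, and atomize it into triples."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    columns = reader.fieldnames or []
    sample = rows[:sample_rows]  # analyze only a subset of the data
    # Descriptor data: one inferred datatype per column.
    descriptors = {c: infer_type([r[c] for r in sample]) for c in columns}
    triples = []
    for col, dtype in descriptors.items():  # atomized descriptor data
        triples.append((f"{dataset_id}/col/{col}", "hasDatatype", dtype))
    for i, row in enumerate(rows):  # atomized source data
        for col in columns:
            triples.append((f"{dataset_id}/row/{i}", f"{dataset_id}/col/{col}", row[col]))
    return descriptors, triples


def build_supra_graph(descriptors_by_dataset, supra_map):
    """Form a second graph: pointers to each dataset plus descriptor associations."""
    graph = []
    for dataset_id, descriptors in descriptors_by_dataset.items():
        graph.append(("catalog", "pointsTo", dataset_id))
        for col in descriptors:
            supra = supra_map.get(col.lower())  # assumed lookup by column name
            if supra is not None:
                graph.append((f"{dataset_id}/col/{col}", "classifiedAs", supra))
    return graph
```

In this sketch the second graph holds only the pointers and the associations, so several atomized datasets can share one unit of supra-descriptor data without their triples being copied.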
Abstract
Various embodiments relate generally to data science and data analysis, computer software and systems, and, more specifically, to a computing and data storage platform that facilitates consolidation of one or more datasets, whereby logic is configured to remediate anomalies in a dataset originating in a first format prior to enrichment and conversion into a second format that facilitates forming a collaborative dataset and, for example, interrelations among a system of networked collaborative datasets. In at least some implementations, data interrelations between different formats may be disposed in one or more data layers (e.g., layered data files and/or data arrangements). In some examples, a method may include converting a dataset from a data format at a format converter to form an atomized dataset in a graph data arrangement, the atomized dataset being a collaborative dataset including atomized descriptor data and atomized source data.
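The ordering described in the abstract, remediation first and conversion second, can be sketched as follows. The abstract does not say which anomalies are remediated, so the rules here (trimming whitespace and treating empty or placeholder cells as missing) are assumptions, as are the names `remediate`, `atomize`, and `MISSING_MARKERS`.

```python
# Placeholder strings treated as missing values (an assumption for this sketch).
MISSING_MARKERS = {"", "N/A", "null"}


def remediate(rows):
    """Clean rows in the first format; return (clean_rows, issues)."""
    clean, issues = [], []
    for i, row in enumerate(rows):
        fixed = {}
        for col, value in row.items():
            name = col.strip()
            v = value.strip() if isinstance(value, str) else value
            if v is None or v in MISSING_MARKERS:
                issues.append((i, name, "missing"))  # record the anomaly
                v = None
            fixed[name] = v
        clean.append(fixed)
    return clean, issues


def atomize(dataset_id, rows):
    """Convert remediated rows into (subject, predicate, object) triples."""
    triples = []
    for i, row in enumerate(rows):
        for col, v in row.items():
            if v is not None:  # missing cells produce no triple
                triples.append((f"{dataset_id}/row/{i}", col, v))
    return triples
```

Because remediation runs before atomization, the recorded issues still refer to row and column positions in the original format.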
16 Claims
1. A method comprising:
receiving data representing a dataset having a data format into a dataset ingestion controller configured to form a collaborative dataset;
analyzing a subset of the data to determine dataset attributes;
generating descriptor data based on the dataset attributes associated with the subset of the data;
converting the dataset from the data format at a format converter to form an atomized dataset in a graph data arrangement, the atomized dataset being the collaborative dataset including atomized descriptor data and atomized source data;
associating a unit of the descriptor data to a corresponding unit of supra-descriptor data to form associations;
forming another graph data arrangement including the supra-descriptor data and the associations to the descriptor data, wherein the another graph data arrangement includes pointers to a plurality of atomized collaborative datasets.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 14, 15, 16
12. An apparatus comprising:
a memory including executable instructions; and
a processor, responsive to executing the instructions, is configured to:
receive data representing a dataset having a data format into a dataset ingestion controller configured to form a collaborative dataset;
analyze a subset of the data to determine dataset attributes;
generate descriptor data based on the dataset attributes associated with the subset of the data;
convert the dataset from the data format at a format converter to form an atomized dataset in a graph data arrangement, the atomized dataset being the collaborative dataset;
associate a unit of the descriptor data to a unit of supra descriptor data to form associations; and
form another graph data arrangement including the supra descriptor data and the associations to the descriptor data, wherein the another graph data arrangement includes pointers to a plurality of atomized collaborative datasets.
Dependent claims: 13
Specification