Organizing datasets for adaptive responses to queries
First Claim
1. A method comprising:
- receiving new data values for storage with a dataset;
- for each new data value, extracting attributes of the new data value, at least one attribute describing a pattern in the new data value;
- identifying a schema for the dataset based on the determined attributes of the new data values;
- adding the identified schema to a dataset lineage map describing a set of schemas for the dataset;
- receiving a query from a client device for data values of the dataset, the query specifying a data value type; and
- validating the query against the dataset lineage map, the validation comprising:
  - determining whether the dataset includes the data value type specified by the query by accessing the identified schema for the dataset.
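The steps of claim 1 can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the regular expressions used as "patterns", the representation of a schema as a mapping from field name to data value type, the choice of the most common pattern per field, and the names `extract_attributes`, `identify_schema`, `DatasetLineageMap`, and `validate_query`.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical pattern detectors; the claim does not say which patterns are used.
PATTERNS = {
    "date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "integer": re.compile(r"^-?\d+$"),
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}


def extract_attributes(value: str) -> dict:
    """Extract attributes of one data value, including a pattern attribute."""
    pattern = next((name for name, rx in PATTERNS.items() if rx.match(value)), "text")
    return {"length": len(value), "pattern": pattern}


def identify_schema(rows: list[dict[str, str]]) -> dict[str, str]:
    """Assign each field the most common pattern seen in its new values."""
    counts: dict[str, Counter] = {}
    for row in rows:
        for name, value in row.items():
            counts.setdefault(name, Counter())[extract_attributes(value)["pattern"]] += 1
    return {name: c.most_common(1)[0][0] for name, c in counts.items()}


@dataclass
class DatasetLineageMap:
    """Set of schemas identified for one dataset (assumed here: an ordered list)."""
    schemas: list[dict[str, str]] = field(default_factory=list)

    def add_schema(self, schema: dict[str, str]) -> None:
        self.schemas.append(schema)


def validate_query(lineage: DatasetLineageMap, data_value_type: str) -> bool:
    """Check whether the most recently identified schema includes the type."""
    return bool(lineage.schemas) and data_value_type in lineage.schemas[-1].values()


# Hypothetical new data values for a dataset with two fields.
new_values = [
    {"signup": "2021-03-04", "visits": "17"},
    {"signup": "2021-03-05", "visits": "3"},
]
lineage = DatasetLineageMap()
lineage.add_schema(identify_schema(new_values))
print(validate_query(lineage, "date"))
print(validate_query(lineage, "email"))
```

In this sketch the first query validates because the identified schema contains a date-typed field, and the second does not because no field holds email-patterned values. A real system would likely validate against more than the latest schema; that choice is a simplification here.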
Abstract
A dataset management system organizes datasets and tracks the changes to the dataset to adaptively respond to user queries. For a dataset, the dataset management system tracks the evolving schema of the dataset over time as new data values and/or updates to existing data values are incorporated into the dataset. When a query is received, the dataset management system accesses the schema of a dataset to understand how the dataset has changed over time. Given the changing schema of the dataset, the dataset management system can respond by providing recommendations as to suggested queries that can return improved results. As another option, the dataset management system can execute a query and return results that satisfy the query to the client device that provided the query.
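The two response options described in the abstract, recommending a suggested query or executing the query, might look roughly like the following sketch. It assumes a schema history stored as a list of field-to-type mappings, oldest first, and a hypothetical `respond` function; the abstract does not specify how recommendations are chosen, so the rule used here (suggest fields that currently hold the requested type) is only one possibility.

```python
def respond(schema_history, rows, field_name, requested_type):
    """Execute the query if the current schema supports it, else recommend."""
    current = schema_history[-1]
    if current.get(field_name) == requested_type:
        # The field still holds the requested type: run the query as given.
        results = [row[field_name] for row in rows if field_name in row]
        return {"action": "execute", "results": results}

    # Otherwise use the schema history to explain the mismatch and
    # suggest fields that hold the requested type in the current schema.
    suggested = sorted(f for f, t in current.items() if t == requested_type)
    type_changed = any(
        schema.get(field_name) == requested_type for schema in schema_history[:-1]
    )
    return {
        "action": "recommend",
        "suggested_fields": suggested,
        "type_changed": type_changed,
    }


# Hypothetical history: "price" changed from integer to text over time,
# and an integer-typed "price_cents" field was introduced.
history = [
    {"price": "integer"},
    {"price": "text", "price_cents": "integer"},
]
rows = [{"price": "12 USD", "price_cents": "1200"}]

recommendation = respond(history, rows, "price", "integer")
execution = respond(history, rows, "price_cents", "integer")
```

Here `recommendation` suggests querying `price_cents` and flags that the type of `price` changed, while `execution` returns the stored values directly.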
20 Claims
1. A method comprising:
- receiving new data values for storage with a dataset;
- for each new data value, extracting attributes of the new data value, at least one attribute describing a pattern in the new data value;
- identifying a schema for the dataset based on the determined attributes of the new data values;
- adding the identified schema to a dataset lineage map describing a set of schemas for the dataset;
- receiving a query from a client device for data values of the dataset, the query specifying a data value type; and
- validating the query against the dataset lineage map, the validation comprising:
  - determining whether the dataset includes the data value type specified by the query by accessing the identified schema for the dataset.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A non-transitory computer-readable storage medium comprising computer code that, when executed by a processor, causes the processor to:
- receive new data values for storage with a dataset;
- for each new data value, extract attributes of the new data value, at least one attribute describing a pattern in the new data value;
- identify a schema for the dataset based on the determined attributes of the new data values;
- add the identified schema to a dataset lineage map describing a set of schemas for the dataset;
- receive a query from a client device for data values of the dataset, the query specifying a data value type; and
- validate the query against the dataset lineage map, wherein the computer code that causes the processor to validate the query further comprises computer code that, when executed by the processor, causes the processor to:
  - determine whether the dataset includes the data value type specified by the query by accessing the identified schema for the dataset.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20
Specification