Performing a computation using provenance data
First Claim
1. A method comprising:
storing first lineage data of a first dataset and provenance data of an application at an application layer operating on the first dataset in a storage system;
determining, by a computing resource collocated within the storage system, that second lineage data of a second dataset meets a similarity criterion with the first lineage data of the first dataset;
in response to the determination, performing a proactive computation on the second dataset using the provenance data of the application, wherein the proactive computation is at least a partial processing of the second dataset for generating an insight performed without being prompted to do so by a request from the application layer; and
generating the insight of the second dataset from the performed computation.
Abstract
Example implementations relate to performing computations using provenance data. An example implementation includes storing first lineage data of a first dataset and provenance data of an application operating on the first dataset in a storage system. A computing resource may determine whether second lineage data of a second dataset meets a similarity criterion with the first lineage data of the first dataset. A computation on the second dataset may be performed using the provenance data of the application, and an insight of the second dataset may be generated from the performed computation.
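The flow described in the abstract can be sketched in a few lines. This is a minimal illustration only, not the patented implementation: the class `ToyStorageSystem`, the representation of lineage data as sets of string tags, the Jaccard-overlap similarity criterion, and the 0.5 threshold are all assumptions made for the example, and the "provenance data" is reduced to the operation the application previously ran.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set

Operation = Callable[[List[float]], float]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two lineage tag sets, in [0, 1] (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


@dataclass
class ToyStorageSystem:
    """Hypothetical in-memory stand-in for the storage system."""
    threshold: float = 0.5  # assumed similarity criterion
    datasets: Dict[str, List[float]] = field(default_factory=dict)
    lineage: Dict[str, Set[str]] = field(default_factory=dict)
    # Provenance data, reduced here to the operation an application ran on a dataset.
    provenance: Dict[str, Operation] = field(default_factory=dict)
    insights: Dict[str, float] = field(default_factory=dict)

    def store(self, name: str, data: List[float], lineage: Set[str],
              operation: Optional[Operation] = None) -> None:
        self.datasets[name] = data
        self.lineage[name] = lineage
        if operation is not None:
            self.provenance[name] = operation

    def proactive_compute(self, name: str) -> Optional[float]:
        """Run a previously recorded operation on `name` if its lineage is
        similar enough to a dataset that operation already ran on."""
        for known, operation in self.provenance.items():
            if known == name:
                continue
            if jaccard(self.lineage[known], self.lineage[name]) >= self.threshold:
                self.insights[name] = operation(self.datasets[name])
                return self.insights[name]
        return None


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


system = ToyStorageSystem()
system.store("first", [1.0, 3.0], {"sensor:A", "format:csv", "stage:clean"}, operation=mean)
system.store("second", [2.0, 4.0, 6.0], {"sensor:A", "format:csv", "stage:raw"})
system.store("third", [9.0], {"source:web"})

# No application request is involved: the storage side triggers the computation.
second_insight = system.proactive_compute("second")
third_insight = system.proactive_compute("third")
```

In this sketch the second dataset shares two of four lineage tags with the first, so it meets the assumed threshold and an insight is stored for it, while the third dataset shares none and is left alone.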
20 Claims
1. A method comprising:
storing first lineage data of a first dataset and provenance data of an application at an application layer operating on the first dataset in a storage system;
determining, by a computing resource collocated within the storage system, that second lineage data of a second dataset meets a similarity criterion with the first lineage data of the first dataset;
in response to the determination, performing a proactive computation on the second dataset using the provenance data of the application, wherein the proactive computation is at least a partial processing of the second dataset for generating an insight performed without being prompted to do so by a request from the application layer; and
generating the insight of the second dataset from the performed computation.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A non-transitory computer readable medium having instructions executable by a processor to:
within a storage system, store first lineage data of a first dataset and provenance data of an application operating on the first dataset;
predict that the application operating on the first dataset will attempt to operate on a second dataset by determining that second lineage data of the second dataset meets a similarity criterion with the first lineage data of the first dataset;
in response to the prediction, perform a proactive computation on the second dataset using the provenance data of the application, wherein the computation comprises operating on multiple data sample portions of the second dataset, and wherein the proactive computation is at least a partial processing of the second dataset for generating an insight performed without being prompted to do so by a request from the application;
identify a data sample portion of the multiple data sample portions that satisfies a second criterion; and
store the insight of the second dataset generated from the performed computation, wherein the insight is the output of the performed computation on the identified data sample portion.
Dependent claims: 14, 15, 16, 17, 18
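The sampling step in claim 13 (operate on multiple data sample portions, then keep the output from the portion that satisfies a second criterion) might look like the following. This is a hedged sketch: the claim does not say how portions are drawn or what the second criterion is, so the fixed-size slicing in `split_into_portions`, the caller-supplied `criterion` predicate, and the example threshold are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Portion = List[float]


def split_into_portions(data: List[float], portion_size: int) -> List[Portion]:
    """Cut the dataset into consecutive fixed-size sample portions (assumed scheme)."""
    return [data[i:i + portion_size] for i in range(0, len(data), portion_size)]


def insight_from_samples(
    data: List[float],
    operation: Callable[[Portion], float],
    criterion: Callable[[Portion, float], bool],
    portion_size: int,
) -> Optional[Tuple[Portion, float]]:
    """Run `operation` on every portion, then return the first portion (and its
    output) that satisfies `criterion`, or None if no portion does."""
    results = [(portion, operation(portion))
               for portion in split_into_portions(data, portion_size)]
    for portion, output in results:
        if criterion(portion, output):
            return portion, output
    return None


def mean(values: Portion) -> float:
    return sum(values) / len(values)


second_dataset = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 4.0, 5.0, 6.0]

# Hypothetical second criterion: the portion's mean exceeds 5.
picked = insight_from_samples(
    second_dataset,
    operation=mean,
    criterion=lambda portion, output: output > 5.0,
    portion_size=3,
)
```

Here the stored insight would be the mean of the middle portion, the first one whose output passes the example criterion. Passing the criterion as a predicate keeps the sketch neutral about what the second criterion actually is.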
19. A storage system comprising:
a storage resource for storing first lineage data of a first dataset and provenance data of an application at an application layer operating on the first dataset;
a processor; and
a non-transitory machine-readable storage medium comprising instructions executable by the processor to:
predict that the application operating on the first dataset will attempt to operate on a second dataset by determining that second lineage data of a second dataset meets a similarity criterion with the first lineage data of the first dataset;
in response to the prediction, perform a proactive computation on the second dataset using the provenance data of the application, wherein the proactive computation is at least a partial processing of the second dataset for generating an insight performed without being prompted to do so by a request from the application layer; and
store the insight of the second dataset generated from the performed computation.
Dependent claims: 20
Specification