PERFORMING A COMPUTATION USING PROVENANCE DATA
Abstract
Example implementations relate to performing computations using provenance data. An example implementation includes storing first lineage data of a first dataset and provenance data of an application operating on the first dataset in a storage system. A computing resource may determine whether second lineage data of a second dataset meets a similarity criterion with the first lineage data of the first dataset. A computation on the second dataset may be performed using the provenance data of the application, and an insight of the second dataset may be generated from the performed computation.
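The flow described in the abstract can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the claimed implementation: the source does not say how lineage or provenance data are encoded or what the similarity criterion is. Here lineage is assumed to be a set of processing-step labels, the similarity criterion is assumed to be a Jaccard score at or above a threshold, and provenance is assumed to be a record holding the operation the application ran. The names `Provenance`, `Store`, `jaccard`, and `maybe_compute` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Provenance:
    """Assumed record of how an application operated on a dataset."""
    app_name: str
    operation: Callable[[List[float]], float]


@dataclass
class Store:
    """Toy stand-in for the storage system (hypothetical structure)."""
    lineage: Dict[str, FrozenSet[str]] = field(default_factory=dict)
    provenance: Dict[str, Provenance] = field(default_factory=dict)
    insights: Dict[str, float] = field(default_factory=dict)


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity of two label sets (an assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def maybe_compute(
    store: Store,
    first_id: str,
    second_id: str,
    second_lineage: FrozenSet[str],
    second_data: List[float],
    threshold: float = 0.5,  # assumed similarity criterion
) -> Optional[float]:
    """Run the first dataset's recorded operation on the second dataset,
    but only if the two lineages are similar enough."""
    if jaccard(store.lineage[first_id], second_lineage) < threshold:
        return None
    insight = store.provenance[first_id].operation(second_data)
    store.insights[second_id] = insight  # keep the generated insight
    return insight


# Usage with made-up dataset names and lineage labels.
store = Store()
store.lineage["sales_2023"] = frozenset({"crm_export", "dedupe", "currency_norm"})
store.provenance["sales_2023"] = Provenance(
    "revenue_report", lambda rows: sum(rows) / len(rows)
)
# The lineages share 3 of 4 labels, so the Jaccard score is 0.75.
result = maybe_compute(
    store,
    "sales_2023",
    "sales_2024",
    frozenset({"crm_export", "dedupe", "currency_norm", "tax_adjust"}),
    [10.0, 20.0, 30.0],
)
```

In this sketch the stored operation is reused unchanged on the second dataset; a real system would presumably also need to check that the second dataset's schema matches what the application expects.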
20 Claims
1. A method comprising:
storing first lineage data of a first dataset and provenance data of an application operating on the first dataset in a storage system;
determining, by a computing resource collocated within the storage system, that second lineage data of a second dataset meets a similarity criterion with the first lineage data of the first dataset;
in response to the determination, performing a computation on the second dataset using the provenance data of the application; and
generating an insight of the second dataset from the performed computation.
Dependent claims: 2-12
13. A non-transitory computer readable medium having instructions executable by a processor to:
within a storage system, store first lineage data of a first dataset and provenance data of an application operating on the first dataset;
predict that the application operating on the first dataset will attempt to operate on a second dataset by determining that second lineage data of the second dataset meets a similarity criterion with the first lineage data of the first dataset;
in response to the prediction, perform a computation on the second dataset using the provenance data of the application, wherein the computation comprises operating on multiple data sample portions of the second dataset;
identify a data sample portion of the multiple data sample portions that satisfies a second criterion; and
store an insight of the second dataset generated from the performed computation, wherein the insight is the output of the performed computation on the identified data sample portion.
Dependent claims: 14-18
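The sample-portion steps of claim 13 can be illustrated with a short sketch. It rests on assumptions the claim does not state: portions are taken as fixed-size consecutive slices, and the second criterion is modeled as a caller-supplied predicate over a portion and its computed output. The function names `split_into_portions` and `insight_from_portions` are hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def split_into_portions(data: List[float], size: int) -> List[List[float]]:
    """Cut the dataset into consecutive fixed-size portions (assumed scheme)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def insight_from_portions(
    data: List[float],
    operation: Callable[[List[float]], float],
    second_criterion: Callable[[List[float], float], bool],
    portion_size: int = 3,
) -> Optional[Tuple[List[float], float]]:
    """Run the operation on every portion, then return the first portion
    that satisfies the criterion together with its output (the insight)."""
    outputs = [(p, operation(p)) for p in split_into_portions(data, portion_size)]
    for portion, output in outputs:
        if second_criterion(portion, output):
            return portion, output
    return None  # no portion satisfied the criterion


# Usage: sum each portion and pick the first whose sum exceeds 50.
found = insight_from_portions(
    [1.0, 2.0, 3.0, 10.0, 20.0, 30.0, 4.0, 5.0, 6.0],
    operation=sum,
    second_criterion=lambda portion, output: output > 50,
)
```

Returning the first matching portion is a design choice made for brevity; the claim only requires that some portion satisfying the criterion be identified.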
19. A storage system comprising:
a storage resource for storing first lineage data of a first dataset and provenance data of an application operating on the first dataset;
a processor; and
a non-transitory machine-readable storage medium comprising instructions executable by the processor to:
predict that the application operating on the first dataset will attempt to operate on a second dataset by determining that second lineage data of a second dataset meets a similarity criterion with the first lineage data of the first dataset;
in response to the prediction, perform a proactive computation on the second dataset using the provenance data of the application; and
store an insight of the second dataset generated from the performed computation.
Dependent claims: 20
Specification