SYSTEMS AND METHODS FOR USING PROVENANCE INFORMATION FOR DATA RETENTION IN STREAM-PROCESSING
Abstract
A system and method for determining data usage based on provenance information in a stream-processing system include progressively setting usage information for output stream data objects (SDOs); determining input SDOs that an output SDO depends on, based on a provenance dependency function; recursively feeding back the usage information for a subset of SDOs that can be discarded; and discarding the subset of SDOs. A system and method for data retention based on usage information in a stream-processing system include managing retention of SDOs by deleting SDOs that are determined to be of null usage and enhancing retention characteristics of SDOs that are deemed to have usage.
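The feedback step described in the abstract can be illustrated with a small sketch. This is not the patented implementation; it assumes that usage information is a simple integer count per SDO, that "null usage" means a count of zero, and that the provenance dependency function is a plain callable returning the ids of the input SDOs. The names `propagate_null_usage`, `usage`, and `depends_on` are hypothetical.

```python
from typing import Callable, Dict, List, Set


def propagate_null_usage(
    usage: Dict[str, int],
    depends_on: Callable[[str], List[str]],
    sdo_id: str,
) -> Set[str]:
    """Return the ids of all SDOs that can be discarded once `sdo_id` has null usage.

    `usage` maps an SDO id to the number of consumers still relying on it
    (assumed representation). `depends_on` plays the role of the provenance
    dependency function: given an output SDO id, it returns its input SDO ids.
    """
    discardable: Set[str] = set()

    def feed_back(current: str) -> None:
        if current in discardable:
            return
        discardable.add(current)
        # Each input loses one dependent; recurse when its usage becomes null.
        for parent in depends_on(current):
            usage[parent] -= 1
            if usage[parent] <= 0:
                feed_back(parent)

    if usage.get(sdo_id, 0) <= 0:
        feed_back(sdo_id)
    return discardable


def discard(store: Dict[str, bytes], usage: Dict[str, int], ids: Set[str]) -> None:
    """Remove the discardable SDOs from the store and from the usage table."""
    for sdo_id in ids:
        store.pop(sdo_id, None)
        usage.pop(sdo_id, None)


# Hypothetical pipeline: c is derived from a and b, d is derived from b.
provenance = {"a": [], "b": [], "c": ["a", "b"], "d": ["b"]}
usage = {"a": 1, "b": 2, "c": 0, "d": 1}   # c already has null usage
store = {k: b"" for k in provenance}

to_drop = propagate_null_usage(usage, lambda s: provenance[s], "c")
discard(store, usage, to_drop)
```

In this example, `c` has null usage, so its inputs are each decremented: `a` drops to zero and is discarded along with `c`, while `b` survives because `d` still depends on it.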
21 Claims
1. A method for determining data usage based on provenance information, in a stream-processing system, the method comprising:
progressively setting usage information for output stream data objects (SDOs);
determining input SDOs that an output SDO depends on, based on a provenance dependency function;
recursively feeding back the usage information for a subset of SDOs that can be discarded; and
discarding the subset of SDOs.
(Dependent claims: 2, 3, 4, 5, 6, 7)
8. A computer readable medium comprising a computer readable program for determining data usage based on provenance information, in a stream-processing system, wherein the computer readable program when executed on a computer causes the computer to perform the steps of:
progressively setting usage information for output stream data objects (SDOs);
determining input SDOs that an output SDO depends on, based on a provenance dependency function;
recursively feeding back the usage information for a subset of SDOs that can be discarded; and
discarding the subset of SDOs.
9. A method for data retention based on usage information, in a stream-processing system, comprising:
managing retention of stream data objects (SDOs) by:
deleting SDOs that are determined to be of null usage; and
enhancing retention characteristics of SDOs that are deemed to have usage.
(Dependent claims: 10, 11, 12, 13, 14)
15. A computer readable medium comprising a computer readable program for data retention based on usage information, in a stream-processing system, wherein the computer readable program when executed on a computer causes the computer to perform the steps of:
managing retention of stream data objects (SDOs) by:
deleting SDOs that are determined to be of null usage; and
enhancing retention characteristics of SDOs that are deemed to have usage.
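The retention steps recited in claims 9 and 15 can be sketched as a single pass over a store. The claims do not say what "enhancing retention characteristics" means in practice; the sketch below assumes, purely for illustration, that it means extending a per-SDO retention period up to a cap. `StoredSDO`, `apply_retention`, `boost_factor`, and `max_retention` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StoredSDO:
    payload: bytes
    usage: int                # assumed: count of consumers still relying on the SDO
    retention_seconds: float  # assumed retention characteristic


def apply_retention(
    store: Dict[str, StoredSDO],
    boost_factor: float = 2.0,
    max_retention: float = 86_400.0,
) -> List[str]:
    """Delete SDOs with null usage and lengthen retention for SDOs still in use.

    Returns the ids of the deleted SDOs.
    """
    deleted: List[str] = []
    for sdo_id in list(store):          # copy the keys so we can delete while iterating
        sdo = store[sdo_id]
        if sdo.usage <= 0:
            del store[sdo_id]
            deleted.append(sdo_id)
        else:
            # Illustrative "enhancement": extend the retention period, with a cap.
            sdo.retention_seconds = min(sdo.retention_seconds * boost_factor, max_retention)
    return deleted


# Hypothetical usage: "x" has null usage, "y" is still in use.
store = {
    "x": StoredSDO(b"old", usage=0, retention_seconds=60.0),
    "y": StoredSDO(b"hot", usage=3, retention_seconds=60.0),
}
deleted = apply_retention(store)
```

Other interpretations of enhanced retention, such as moving an SDO to more durable storage or raising its priority, would fit the same loop by replacing the `else` branch.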
16. A data management system for determining data to be retained in a data streaming environment, comprising:
a data usage manager configured to manage interactions between one or more processing elements, the data usage manager configured to compute an output count for downstream recipients of a data object from the one or more processing elements and to determine upstream processing elements that produced dependent input data objects, the data usage manager including:
a provenance table configured to associate output ports of processing elements with provenance dependency functions for computing the dependent input data objects for the data object if the usage count is null; and
an upstream notifier configured to notify upstream processing elements of a decrement to the usage count if the data object usage count is null,
wherein the data manager discards the data object after the notification of upstream processing elements.
(Dependent claims: 17, 18, 19, 20, 21)
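One way to picture the components named in claim 16 is the following sketch. It assumes that a data object is identified by an (output port, sequence number) pair, that the usage count starts at the number of downstream recipients, and that upstream notification is an in-process call rather than a message between processing elements. The class and method names are hypothetical and are not taken from the specification.

```python
from typing import Callable, Dict, List, Tuple

SdoRef = Tuple[str, int]  # (output port name, sequence number), an assumed identifier


class DataUsageManager:
    def __init__(self) -> None:
        # Provenance table: output port -> provenance dependency function.
        self.provenance_table: Dict[str, Callable[[int], List[SdoRef]]] = {}
        self.usage_count: Dict[SdoRef, int] = {}
        self.discarded: List[SdoRef] = []

    def register_port(self, port: str, dependency_fn: Callable[[int], List[SdoRef]]) -> None:
        self.provenance_table[port] = dependency_fn

    def emit(self, port: str, seq: int, downstream_recipients: int) -> None:
        """Record a new data object with one usage per downstream recipient."""
        self.usage_count[(port, seq)] = downstream_recipients

    def consumed(self, ref: SdoRef) -> None:
        """Called when one downstream recipient no longer needs `ref`."""
        self._decrement(ref)

    def _decrement(self, ref: SdoRef) -> None:
        if ref not in self.usage_count:
            return
        self.usage_count[ref] -= 1
        if self.usage_count[ref] <= 0:
            port, seq = ref
            # Ports without a table entry (e.g. sources) have no dependencies.
            dependency_fn = self.provenance_table.get(port, lambda _seq: [])
            # Upstream notifier: tell producers of the inputs about the decrement.
            for upstream in dependency_fn(seq):
                self._decrement(upstream)
            # Discard only after the upstream elements have been notified.
            del self.usage_count[ref]
            self.discarded.append(ref)


# Hypothetical pipeline: "avg.out" object n is computed from "src.out" objects 2n and 2n+1.
manager = DataUsageManager()
manager.register_port("avg.out", lambda n: [("src.out", 2 * n), ("src.out", 2 * n + 1)])
manager.emit("src.out", 0, downstream_recipients=1)
manager.emit("src.out", 1, downstream_recipients=1)
manager.emit("avg.out", 0, downstream_recipients=1)
manager.consumed(("avg.out", 0))
```

After the single recipient of `("avg.out", 0)` reports that it is done, the usage count of that object reaches zero, both source objects are notified and reach zero in turn, and all three objects are discarded.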
Specification