Method and apparatus for data stream sampling
Abstract
In one embodiment, the present invention is a method and apparatus for data stream sampling. In one embodiment, a tuple of a data stream is received from a sampling window of the data stream. The tuple is associated with a group, selected from a set of one or more groups, which reflects a subset of information relating to a sample of the data stream. In addition, the tuple is associated with a supergroup, selected from a set of one or more supergroups, which reflects global information relating to the sample. It is then determined whether receipt of the tuple triggers a cleaning phase in which one or more tuples are shed from the sample. The operator can be implemented to execute a variety of different sampling algorithms, including well-known and experimental algorithms.
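The flow described above (receive a tuple, associate it with a group and a supergroup, then clean if a trigger fires) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name, the key functions, the group-count trigger, the threshold-based cleaning criterion, and the reset of state at each new sampling window are all hypothetical choices made for the example.

```python
from collections import defaultdict


class StreamSampler:
    """Illustrative sampling operator; every name here is hypothetical."""

    def __init__(self, group_key, supergroup_key, window_of, max_groups):
        self.group_key = group_key            # tuple -> group identifier
        self.supergroup_key = supergroup_key  # tuple -> supergroup identifier
        self.window_of = window_of            # tuple -> sampling-window identifier
        self.max_groups = max_groups          # assumed trigger for a cleaning phase
        self.window = None
        self.groups = {}                      # (supergroup id, group id) -> tuple count
        self.supergroups = defaultdict(lambda: {"seen": 0, "threshold": 1})

    def receive(self, tup):
        window = self.window_of(tup)
        if window != self.window:
            # Assumption: the sample is rebuilt from scratch for each window.
            self.window = window
            self.groups.clear()
            self.supergroups.clear()
        sg_id = self.supergroup_key(tup)
        g_id = (sg_id, self.group_key(tup))
        self.supergroups[sg_id]["seen"] += 1  # global information for the sample
        self.groups[g_id] = self.groups.get(g_id, 0) + 1  # per-group information
        if len(self.groups) > self.max_groups:
            self.clean()                      # this tuple triggered a cleaning phase

    def clean(self):
        # Assumed cleaning criterion: raise every supergroup's threshold and
        # shed groups whose count falls below the threshold of their supergroup.
        while len(self.groups) > self.max_groups:
            for sg in self.supergroups.values():
                sg["threshold"] += 1
            self.groups = {
                key: count
                for key, count in self.groups.items()
                if count >= self.supergroups[key[0]]["threshold"]
            }


sampler = StreamSampler(
    group_key=lambda t: t["src"],
    supergroup_key=lambda t: t["proto"],
    window_of=lambda t: t["ts"] // 60,
    max_groups=2,
)
for src in ["a", "a", "a", "b", "b", "c"]:
    sampler.receive({"ts": 0, "src": src, "proto": "tcp"})
```

In this sketch the supergroup holds state shared by all of its groups (here a tuple count and a threshold), while each group holds only its own count, which mirrors the split between global and per-subset information in the abstract. Other sampling algorithms would substitute their own trigger and cleaning criteria.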
20 Claims
1. A method for sampling a data stream comprising a plurality of tuples, the operator comprising:
receiving one of said plurality of tuples, said one of said plurality of tuples belonging to a first sampling window;
associating said one of said plurality of tuples with a group, selected from a set of one or more groups, that reflects a subset of information relating to a sample of said data stream;
associating said one of said plurality of tuples with a supergroup, selected from a set of one or more supergroups, that reflects global information relating to said sample; and
applying one or more cleaning criteria to each of said one or more groups, if reception of said one of said plurality of tuples triggers a cleaning phase.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A computer readable medium containing an executable program for sampling a data stream comprising a plurality of tuples, where the program performs the steps of:
receiving one of said plurality of tuples, said one of said plurality of tuples belonging to a first sampling window;
associating said one of said plurality of tuples with a group, selected from a set of one or more groups, that reflects a subset of information relating to a sample of said data stream;
associating said one of said plurality of tuples with a supergroup, selected from a set of one or more supergroups, that reflects global information relating to said sample; and
applying one or more cleaning criteria to each of said one or more groups, if reception of said one of said plurality of tuples triggers a cleaning phase.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. An apparatus for sampling a data stream comprising a plurality of tuples, the apparatus comprising:
means for receiving one of said plurality of tuples, said one of said plurality of tuples belonging to a first sampling window;
means for associating said one of said plurality of tuples with a group, selected from a set of one or more groups, that reflects a subset of information relating to a sample of said data stream;
means for associating said one of said plurality of tuples with a supergroup, selected from a set of one or more supergroups, that reflects global information relating to said sample; and
means for applying one or more cleaning criteria to each of said one or more groups, if reception of said one of said plurality of tuples triggers a cleaning phase.
Dependent claims: 20
Specification