Support for incrementally processing user defined aggregations in a data stream management system
First Claim
1. A method implemented in a computer of processing a plurality of streams of data, the method comprising:
- processing the plurality of streams, to execute thereon a plurality of continuous queries based on a global plan;
- during the processing, receiving a command to create an aggregation and identification of a set of instructions comprising a function to be executed to perform the aggregation;
- during the processing, creating in a memory of the computer, a first structure comprising the identification;
- during the processing, receiving a new continuous query to be executed using the aggregation;
- during the processing, based on the first structure, creating in the memory an operator comprising at least one count and at least one second structure, the count being initialized to an initial value, the second structure comprising a first field to hold a reference to the instance, and at least one additional field corresponding to at least one argument of the aggregation;
- during the processing, modifying the global plan by adding thereto the operator, thereby to obtain a modified plan; and
- altering the processing, to cause execution of the new continuous query in addition to the plurality of continuous queries, based on the modified plan;
- during execution of the new continuous query:
  - if the count is at the initial value on receipt of a message, creating an instance of the set of instructions;
  - changing the count to a new value;
  - invoking the function in the instance to process a tuple of the data in the message, wherein the function is identified based at least on a first type of the message;
  - repeatedly performing the changing and the invoking, for each receipt of an additional message;
  - wherein if the additional message is of the first type, the invoking is performed only once; and
  - if the count is at the new value on receipt of yet another message of a second type, releasing memory occupied by the instance and changing the count to the initial value;
  - wherein the first type indicates addition of the message into a window and the second type indicates removal of the message from the window; and
- outputting from said computer, a stream generated based at least partially on processing of said data by executing the new continuous query.
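The count-and-instance lifecycle recited in the claim can be illustrated with a minimal Python sketch. It is one possible reading, not the patented implementation: the names `AggregationOperator`, `SumAggregation`, `PLUS`, and `MINUS` are hypothetical, the count is assumed to track the number of tuples currently in the window, and the global plan and query registration steps are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical labels for the two message types in the claim:
# PLUS = tuple added to the window, MINUS = tuple removed from it.
PLUS, MINUS = "plus", "minus"


class SumAggregation:
    """Hypothetical user-written aggregation that keeps its own running state."""

    def __init__(self):
        self.total = 0

    def plus(self, value):
        self.total += value
        return self.total

    def minus(self, value):
        self.total -= value
        return self.total


@dataclass
class AggregationOperator:
    # Stands in for the "first structure" that identifies the user's code.
    factory: Callable[[], object]
    # Initial value 0 is assumed to mean "no tuples in the window".
    count: int = 0
    # Reference to the live instance (the first field of the "second structure").
    instance: Optional[object] = None

    def on_message(self, kind, value):
        if kind == PLUS:
            if self.count == 0:
                # Count at its initial value: create the instance lazily.
                self.instance = self.factory()
            self.count += 1
            return self.instance.plus(value)  # invoked once per added tuple
        if kind == MINUS:
            if self.count == 0:
                raise ValueError("minus message received for an empty window")
            if self.count == 1:
                # Last tuple leaves the window: drop the instance, reset the count.
                self.instance = None
                self.count = 0
                return None
            self.count -= 1
            return self.instance.minus(value)
        raise ValueError(f"unknown message type: {kind!r}")
```

For example, feeding `PLUS 5`, `PLUS 3`, `MINUS 5`, `MINUS 3` through `AggregationOperator(SumAggregation)` creates one instance on the first message and releases it on the last, so a later `PLUS` starts from a fresh instance.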
Abstract
A computer is programmed to accept a command for creation of a new aggregation, defined by a user, that processes data incrementally, one tuple at a time. One or more incremental functions in a set of instructions written by the user to implement the new aggregation maintain locally any information that is to be passed between successive invocations, to support computing the aggregation for a given set of tuples as a whole. The set of instructions includes a plus function, which is invoked once for each message added to a window, and a minus function, which is invoked with a message to return the value of the incremental aggregation over the window after that message is removed. In such embodiments, the computer does not maintain copies of messages in the window for use by the aggregation function(s).
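As an illustration of the plus/minus pattern described in the abstract, here is a minimal Python sketch of a user-defined incremental average. The class name `IncrementalAverage` and its method names are hypothetical, chosen only for this example. The object keeps a running sum and a tuple count as its local state, so no copy of the window's messages is needed.

```python
class IncrementalAverage:
    """Hypothetical user-defined aggregation: average over a window,
    maintained incrementally from a running sum and a tuple count."""

    def __init__(self):
        self._sum = 0.0  # state carried between invocations
        self._n = 0      # number of tuples currently in the window

    def plus(self, value):
        """Called once when a tuple enters the window; returns the new average."""
        self._sum += value
        self._n += 1
        return self._sum / self._n

    def minus(self, value):
        """Called when a tuple leaves the window; returns the average over
        the remaining tuples, or None if the window is now empty."""
        self._sum -= value
        self._n -= 1
        return self._sum / self._n if self._n else None


avg = IncrementalAverage()
avg.plus(10)    # window: {10}
avg.plus(20)    # window: {10, 20}
avg.minus(10)   # window: {20}
```

Each call costs constant time regardless of window size, which is the practical benefit of the incremental form over recomputing the aggregate from stored tuples.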
149 Citations
5 Claims
1. Independent claim, recited in full under First Claim above. Dependent claims: 2, 3, 4, 5.
Specification