Support for incrementally processing user defined aggregations in a data stream management system
First Claim
1. A method implemented in a computer for processing a plurality of streams of data, the method comprising:
- processing the plurality of streams, to execute thereon a plurality of continuous queries based on a global plan;
receiving a command to register a user-defined aggregation and an identification of a set of instructions comprising a function to be executed to perform the user-defined aggregation over multiple tuples of the data in messages received from the plurality of streams;
creating in a memory of the computer, a first structure comprising said identification of the set of instructions;
receiving a new continuous query comprising the user-defined aggregation;
during execution of the new continuous query;
creating in the memory, an instance of said function to be executed to perform the user-defined aggregation, by using the identification in the first structure in response to receipt of a message from the plurality of streams;
invoking within said instance created in response to receipt of the message, the function to perform the user-defined aggregation to process a tuple of the data in the message;
continuing to execute the new continuous query, by performing the invoking and skipping the creating, in response to receipt of an additional message from the plurality of streams; and
outputting from said computer, a stream based on results including a result generated based at least partially on processing of said tuple of the data by executing the new continuous query.
1 Assignment
0 Petitions
Accused Products
Abstract
A computer is programmed to accept a command for creation of a new aggregation defined by a user to process data incrementally, one tuple at a time. One or more incremental function(s) in a set of instructions written by the user to implement the new aggregation maintain(s) locally any information that is to be passed between successive invocations, to support computing the aggregation for a given set of tuples as a whole. The user writes a set of instructions to perform the aggregation incrementally, including a plus function which is repeatedly invoked, only once, for each addition to a window of a message. The user also writes a minus function to be invoked with the message, to return the value of incremental aggregation over the window after removal of the message. In such embodiments, the computer does not maintain copies of messages in the window for use by aggregation function(s).
-
Citations
25 Claims
-
1. A method implemented in a computer for processing a plurality of streams of data, the method comprising:
-
processing the plurality of streams, to execute thereon a plurality of continuous queries based on a global plan; receiving a command to register a user-defined aggregation and an identification of a set of instructions comprising a function to be executed to perform the user-defined aggregation over multiple tuples of the data in messages received from the plurality of streams; creating in a memory of the computer, a first structure comprising said identification of the set of instructions; receiving a new continuous query comprising the user-defined aggregation; during execution of the new continuous query; creating in the memory, an instance of said function to be executed to perform the user-defined aggregation, by using the identification in the first structure in response to receipt of a message from the plurality of streams; invoking within said instance created in response to receipt of the message, the function to perform the user-defined aggregation to process a tuple of the data in the message; continuing to execute the new continuous query, by performing the invoking and skipping the creating, in response to receipt of an additional message from the plurality of streams; and outputting from said computer, a stream based on results including a result generated based at least partially on processing of said tuple of the data by executing the new continuous query. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
-
-
9. One or more non-transitory computer-readable storage media comprising a plurality of instructions to be executed by a computer processing a plurality of streams of data, the instructions comprising:
-
instructions to process the plurality of streams, to execute thereon a plurality of continuous queries based on a global plan; instructions to receive; (A) a command to register a user-defined aggregation; and (B) an identification of a set of instructions comprising a function, to be executed to perform the user-defined aggregation over multiple tuples of the data in messages received from the plurality of streams; instructions to create in a computer memory, a first structure comprising said identification; instructions to receive a new continuous query comprising the user-defined aggregation; instructions to create in the memory, an instance of said function to be executed to perform the user-defined aggregation, by using the identification in the first structure in response to receipt of a message from the plurality of streams; instructions to invoke within the instance created in response to receipt of the message, the function to perform the user-defined aggregation to process a tuple of data in the message; instructions to continue to execute the new continuous query, by executing the instructions to invoke and skipping the instructions to create, in response to receipt of an additional message from the plurality of streams; and instructions to output from said computer, a stream generated based at least partially on processing of said tuple of the data by executing the new continuous query. - View Dependent Claims (10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
-
-
21. An apparatus for processing a plurality of streams of data, the apparatus comprising:
-
means for processing the plurality of streams, to execute thereon a plurality of continuous queries based on a global plan; means, coupled to the means for processing, for receiving a command to register a user-defined aggregation and an identification of a set of instructions comprising a function to be executed to perform the user-defined aggregation over multiple tuples of the data in messages received from the plurality of streams; means, coupled to the means for processing, for creating in a memory a first structure comprising said identification of the set of instructions; means, coupled to the means for processing, for receiving a new continuous query comprising the user-defined aggregation; means for creating an instance of the function to be executed to perform the user-defined aggregation, by using the identification in the first structure in response to receipt of a message from the plurality of streams; means for invoking within said instance created in response to receipt of the message, the function to perform the user-defined aggregation to process a tuple of the data in the message; means for continuing to execute the new continuous query, by operating the means for the invoking and skipping operation of the means for creating, in response to receipt of an additional message from the plurality of streams; and means for outputting from said computer, a stream generated based at least partially on processing of said tuple of the data by executing the new continuous query. - View Dependent Claims (22, 23, 24, 25)
-
Specification