Method to create a partition-by time/tuple-based window in an event processing service
First Claim
1. A method comprising the steps of:
- receiving one or more continuous data streams that comprise a plurality of tuples of data, wherein each tuple, within the plurality of tuples of data, is associated with a particular time, wherein the particular time is no earlier than the time of any previously generated tuple in the plurality of tuples received;
retaining in partitions of a data structure the plurality of tuples that fall within a sliding time-based window, wherein the sliding time-based window is bounded by (a) a specified relative range of time and (b) a maximum number of tuples to retain;
wherein retaining in partitions of a data structure the plurality of tuples that fall within a sliding time-based window comprises;
in response to receiving a tuple from the one or more continuous data streams, performing;
(a) removing, from each particular partition of said partitions of said data structure, tuples that exceed a specified amount of tuples to be stored in each particular partition, wherein the partitions in the data structure are based upon one or more partition keys;
(b) removing, from each particular partition of said partitions of said data structure, tuples that are associated with a particular time that is not within the specified relative range of time; and
(c) storing, in a particular partition among the partitions of the data structure, the tuple; and
wherein the method is performed by one or more computing devices.
1 Assignment
0 Petitions
Accused Products
Abstract
A method to create a partition by time/tuple based window in an event processing service is provided. When continuous data streams are received, tuples are stored in a data structure with partitions based upon partition keys. Only a specified amount of tuples may be stored in each partition. When a partition exceeds the specified number of tuples, the oldest tuples are removed from the data structure. Tuples stored beyond a specified time period are also removed from the data structure. Two data structures may also be used to implement a time/tuple based window. Tuples are stored in both a data structure with a partition by window and a data structure with a range window. Tuples are removed in the partition by window when tuples exceed the amount in the partition. Tuples are removed in the range window when tuples exceed a specified time period. The two data structures are synchronized.
76 Citations
22 Claims
-
1. A method comprising the steps of:
-
receiving one or more continuous data streams that comprise a plurality of tuples of data, wherein each tuple, within the plurality of tuples of data, is associated with a particular time, wherein the particular time is no earlier than the time of any previously generated tuple in the plurality of tuples received; retaining in partitions of a data structure the plurality of tuples that fall within a sliding time-based window, wherein the sliding time-based window is bounded by (a) a specified relative range of time and (b) a maximum number of tuples to retain; wherein retaining in partitions of a data structure the plurality of tuples that fall within a sliding time-based window comprises; in response to receiving a tuple from the one or more continuous data streams, performing; (a) removing, from each particular partition of said partitions of said data structure, tuples that exceed a specified amount of tuples to be stored in each particular partition, wherein the partitions in the data structure are based upon one or more partition keys; (b) removing, from each particular partition of said partitions of said data structure, tuples that are associated with a particular time that is not within the specified relative range of time; and (c) storing, in a particular partition among the partitions of the data structure, the tuple; and wherein the method is performed by one or more computing devices. - View Dependent Claims (2, 3, 4, 5)
-
-
6. A method comprising the steps of:
-
receiving one or more continuous data streams that comprise a plurality of tuples of data, wherein each tuple, within the plurality of tuples of data, is associated with a particular time, wherein the particular time is no less than the time of any previously generated tuple in the plurality of tuples received; retaining in a data structure the plurality of tuples that fall within a sliding time-based window, wherein the sliding time-based window is bounded by (a) a specified relative range of time and (b) a maximum number of tuples to retain, wherein the data structure comprises a first data structure and a second data structure; wherein retaining in a data structure the plurality of tuples that fall within a sliding time-based window comprises; in response to receiving a tuple from the one or more continuous data streams, performing the steps of; (a) removing, from each particular partition of a first data structure, tuples that exceed a specified amount of tuples to be stored in each particular partition; (b) removing, from a second data structure, tuples that are associated with a particular time that is not within a range of time; (c) storing, the tuple, in the first data structure and the second data structure, wherein tuples stored in the first data structure are stored in a particular partition among partitions based upon one or more partition keys, and wherein tuples stored in the second data structure are sorted by the particular time of the tuples; and (d) synchronizing tuples stored in the first data structure with tuples stored in the second data structure; and wherein the method is performed by one or more computing devices. - View Dependent Claims (7, 8, 9, 10, 11)
-
-
12. A computer-readable storage medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform:
-
receiving one or more continuous data streams that comprise a plurality of tuples of data, wherein each tuple, within the plurality of tuples of data, is associated with a particular time, wherein the particular time is no earlier than the time of any previously generated tuple in the plurality of tuples received; retaining in partitions of a data structure the plurality of tuples that fall within a sliding time-based window, wherein the sliding time-based window is bounded by (a) a specified relative range of time and (b) a maximum number of tuples to retain; and wherein retaining in partitions of a data structure the plurality of tuples that fall within a sliding time-based window comprises; in response to receiving a tuple from the one or more continuous data streams, performing; (a) removing, from each particular partition of said partitions of said data structure, tuples that exceed a specified amount of tuples to be stored in each particular partition, wherein the partitions in the data structure are based upon one or more partition keys; (b) removing, from each particular partition of said partitions of said data structure, tuples that are associated with a particular time that is not within the specified relative range of time; and (c) storing, in a particular partition among the partitions of the data structure, the tuple. - View Dependent Claims (13, 14, 15, 16)
-
-
17. A computer-readable storage medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to:
-
receiving one or more continuous data streams that comprise a plurality of tuples of data, wherein each tuple, within the plurality of tuples of data, is associated with a particular time, wherein the particular time is no less than the time of any previously generated tuple in the plurality of tuples received; retaining in a data structure the plurality of tuples that fall within a sliding time-based window, wherein the sliding time-based window is bounded by (a) a specified relative range of time and (b) a maximum number of tuples to retain, wherein the data structure comprises a first data structure and a second data structure; and wherein retaining in a data structure the plurality of tuples that fall within a sliding time-based window comprises; in response to receiving a tuple from the one or more continuous data streams, performing the steps of; (a) removing, from each particular partition of a first data structure, tuples that exceed a specified amount of tuples to be stored in each particular partition; (b) removing, from a second data structure, tuples that are associated with a particular time that is not within a range of time; (c) storing, the tuple, in the first data structure and the second data structure, wherein tuples stored in the first data structure are stored in a particular partition among partitions based upon one or more partition keys, and wherein tuples stored in the second data structure are sorted by the particular time of the tuples; and (d) synchronizing tuples stored in the first data structure with tuples stored in the second data structure. - View Dependent Claims (18, 19, 20, 21, 22)
-
Specification