Data flow windowing and triggering
First Claim
1. A method comprising:
- receiving, at data processing hardware, data corresponding to one of streaming data or batch data;
- determining, using the data processing hardware, a content of the received data for computation;
- determining, using the data processing hardware, an event time of the data for slicing the data;
- determining a processing time to output results of the received data using the data processing hardware;
- grouping, using the data processing hardware, a first subset of the received data into a first window, the first window defining a first sub-event time of the first data subset;
- aggregating, using the data processing hardware, a first result of the first data subset for the first window; and
- determining, by the data processing hardware, a first trigger time to:
  - emit the first aggregated result of the first data subset;
  - store a copy of the first aggregated result in a persistent state within memory hardware; and
  - refine a next aggregate result of a later subset with the first aggregated result,
- wherein the first trigger time comprises at least one of:
  - when a watermark reaches an end of the first window;
  - every threshold number of seconds of a walltime;
  - after receiving a punctuation record that terminates the first window;
  - every threshold number of records;
  - after arbitrary user logic decides to trigger; or
  - after an arbitrary combination of concrete triggers.
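The grouping, aggregating, and triggering steps of the claim can be illustrated with a small sketch. It is not the patented implementation; it assumes fixed 60-second event-time windows, a sum as the aggregate, and only two of the listed trigger conditions (a record-count threshold and a watermark reaching the end of a window). The names `WINDOW_SIZE`, `RECORDS_PER_FIRING`, `WindowState`, and `process` are hypothetical, and an in-memory dictionary stands in for the persistent state.

```python
from dataclasses import dataclass

WINDOW_SIZE = 60        # assumed fixed-window width, in event-time seconds
RECORDS_PER_FIRING = 2  # assumed "every threshold number of records" trigger


@dataclass
class WindowState:
    total: int = 0        # stored aggregate, refined by later records
    pending: int = 0      # records received since the last firing
    closed: bool = False  # True once the watermark has passed the window end


def process(stream):
    """Consume ("record", event_time, value) and ("watermark", time) items.

    Returns a list of (window_start, aggregate, reason) tuples, one per firing.
    """
    state = {}    # window start -> WindowState (stand-in for persistent state)
    emitted = []
    for item in stream:
        if item[0] == "record":
            _, event_time, value = item
            # Group the record into its window by event time, not arrival order.
            start = event_time - event_time % WINDOW_SIZE
            ws = state.setdefault(start, WindowState())
            # Refine the stored aggregate with the new record.
            ws.total += value
            ws.pending += 1
            # Count trigger: fire after every RECORDS_PER_FIRING records.
            if ws.pending >= RECORDS_PER_FIRING:
                emitted.append((start, ws.total, "count"))
                ws.pending = 0
        else:
            _, watermark = item
            # Watermark trigger: fire each open window whose end has been reached.
            for start in sorted(state):
                ws = state[start]
                if not ws.closed and start + WINDOW_SIZE <= watermark:
                    emitted.append((start, ws.total, "watermark"))
                    ws.closed = True
                    ws.pending = 0
    return emitted
```

In this sketch each firing emits the running total, so a later firing for the same window refines the earlier one rather than replacing it with a delta. Other choices, such as emitting only the change since the last firing, would be equally consistent with the claim language.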
2 Assignments
0 Petitions
Abstract
A method includes receiving data corresponding to one of streaming data or batch data and a content of the received data for computation. The method also includes determining an event time of the data for slicing the data, determining a processing time to output results of the received data, and emitting at least a portion of the results of the received data based on the processing time and the event time.
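The distinction the abstract draws between event time (used to slice the data) and processing time (used to decide when to output results) can be made concrete with a short sketch. It is an illustration under stated assumptions, not the method of the patent: windows are fixed-width, the aggregate is a sum, output happens at fixed processing-time intervals, and a final flush occurs at the next scheduled interval after the input ends. The function name `emit_on_processing_time` and its parameters are hypothetical.

```python
def emit_on_processing_time(arrivals, window_size=60, emit_every=10):
    """Slice by event time, emit by processing time.

    arrivals: iterable of (processing_time, event_time, value) tuples,
              sorted by processing_time.
    Returns a list of (emit_time, {window_start: sum}) snapshots.
    """
    sums = {}         # event-time window start -> running sum
    outputs = []
    next_emit = None  # next processing time at which to output results
    for processing_time, event_time, value in arrivals:
        if next_emit is None:
            next_emit = processing_time + emit_every
        # Emit a snapshot for every processing-time interval that has elapsed.
        while processing_time >= next_emit:
            outputs.append((next_emit, dict(sums)))
            next_emit += emit_every
        # The window is chosen by event time, regardless of arrival time.
        start = event_time - event_time % window_size
        sums[start] = sums.get(start, 0) + value
    if next_emit is not None:
        # Assumed final flush at the next scheduled interval.
        outputs.append((next_emit, dict(sums)))
    return outputs
```

Note that a record arriving late in processing time (for example, one with event time 30 arriving after a record with event time 61) still lands in the earlier event-time window, and the next snapshot reflects the updated sum for that window.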
29 Citations
22 Claims
1. A method comprising:
- receiving, at data processing hardware, data corresponding to one of streaming data or batch data;
- determining, using the data processing hardware, a content of the received data for computation;
- determining, using the data processing hardware, an event time of the data for slicing the data;
- determining a processing time to output results of the received data using the data processing hardware;
- grouping, using the data processing hardware, a first subset of the received data into a first window, the first window defining a first sub-event time of the first data subset;
- aggregating, using the data processing hardware, a first result of the first data subset for the first window; and
- determining, by the data processing hardware, a first trigger time to:
  - emit the first aggregated result of the first data subset;
  - store a copy of the first aggregated result in a persistent state within memory hardware; and
  - refine a next aggregate result of a later subset with the first aggregated result,
- wherein the first trigger time comprises at least one of:
  - when a watermark reaches an end of the first window;
  - every threshold number of seconds of a walltime;
  - after receiving a punctuation record that terminates the first window;
  - every threshold number of records;
  - after arbitrary user logic decides to trigger; or
  - after an arbitrary combination of concrete triggers.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A system comprising:
- data processing hardware; and
- memory hardware in communication with the data processing hardware, the memory hardware storing instructions that when executed on the data processing hardware cause the data processing hardware to perform operations comprising:
  - receiving data corresponding to one of streaming data or batch data;
  - determining a content of the received data for computation;
  - determining an event time of the data for slicing the data;
  - determining a processing time to output results of the received data;
  - grouping a first subset of the received data into a first window, the first window defining a first sub-event time of the first data subset;
  - aggregating a first result of the first data subset for the first window;
  - determining a first trigger time to:
    - emit the first aggregated result of the first data subset;
    - store a copy of the first aggregated result in a persistent state within the memory hardware; and
    - refine a next aggregate result of a later subset with the first aggregated result,
  - wherein the first trigger time comprises at least one of:
    - when a watermark reaches an end of the first window;
    - every threshold number of seconds of a walltime;
    - after receiving a punctuation record that terminates the first window;
    - every threshold number of records;
    - after arbitrary user logic decides to trigger; or
    - after an arbitrary combination of concrete triggers.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
Specification