Deterministic data processing
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing event data. In one aspect, a method includes assigning events to event bundles based on timestamps associated with the events. Each event bundle contains events having timestamps that are within a pre-specified period of time. Event batches are created, where each event batch includes a pre-specified number of event bundles. A first event batch is provided to a first computing group and a second computing group. The first computing group is configured to perform a first processing stage, and the second computing group is configured to perform a second processing stage. A determination is made that a threshold number of the event bundles in the first event batch have been processed by the first computing group. In response to the determination, a second event batch is provided to each of the computing groups.
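The bundling and batching steps described in the abstract can be illustrated with a minimal sketch. The names `Event`, `assign_to_bundles`, and `create_batches`, the use of integer seconds for timestamps, and the fixed-width window scheme are all assumptions made for illustration; the source does not specify how the pre-specified period is aligned or how a short final batch is handled.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str       # hypothetical label, e.g. "impression" or "click"
    timestamp: int  # assumed unit: whole seconds


def assign_to_bundles(events, bundle_seconds):
    """Group events whose timestamps fall in the same fixed-width window."""
    windows = defaultdict(list)
    for event in events:
        windows[event.timestamp // bundle_seconds].append(event)
    # Order bundles by window and events within a bundle by (timestamp, kind),
    # so the result does not depend on the order in which events arrived.
    # Assumption: windows that contain no events produce no bundle.
    return [
        sorted(windows[index], key=lambda e: (e.timestamp, e.kind))
        for index in sorted(windows)
    ]


def create_batches(bundles, bundles_per_batch):
    """Split the ordered bundles into batches of a pre-specified size."""
    # Assumption: the final batch may hold fewer bundles than the others.
    return [
        bundles[start:start + bundles_per_batch]
        for start in range(0, len(bundles), bundles_per_batch)
    ]
```

Under these assumptions, five events with timestamps 5, 12, 3, 25, and 31 and a 10-second window yield four bundles, which `create_batches(bundles, 2)` groups into two batches of two bundles each.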
19 Claims
1. A method performed by data processing apparatus, the method comprising:
receiving event data specifying a set of events that have occurred, the set of events including advertising impressions and user interactions with advertisements, the event data for each event including a timestamp indicative of a time at which the event occurred;
assigning the events to event bundles based on the timestamps, each event bundle containing events having timestamps that are within a pre-specified period of time;
creating event batches, each event batch including a pre-specified number of event bundles;
providing, during a first processing cycle in which one or more processing stages are performed, a first event batch to each of a first computing group and a second computing group, each computing group including one or more data processing apparatus, the first computing group being a computing group that is configured to perform operations of a first processing stage, the second computing group being a computing group that is configured to perform operations of a second processing stage;
determining that a threshold number of the event bundles in the first event batch have been processed by the first computing group;
in response to the determination, providing, during a second processing cycle in which one or more processing stages are performed, a second event batch to each of the first computing group and the second computing group;
determining that the threshold number of the event bundles in the first event batch and results from the first processing stage have been processed by the second computing group during the second processing cycle; and
providing, during a third processing cycle in which one or more processing stages are performed, a third event batch to each of the first computing group and the second computing group based on the determination that the threshold number of the event bundles in the first event batch and results from the first processing stage have been processed by the second computing group during the second processing cycle.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A computer storage medium encoded with a computer program, the program comprising instructions that when executed by data processing apparatus cause the data processing apparatus to perform operations comprising:
receiving event data specifying a set of events that have occurred, the set of events including advertising impressions and user interactions with advertisements, the event data for each event including a timestamp indicative of a time at which the event occurred;
assigning the events to event bundles based on the timestamps, each event bundle containing events having timestamps that are within a pre-specified period of time;
creating event batches, each event batch including a pre-specified number of event bundles;
providing, during a first processing cycle, a first event batch to each of a first computing group and a second computing group, each computing group including one or more data processing apparatus, the first computing group being a computing group that is configured to perform operations of the first processing stage, the second computing group being a computing group that is configured to perform operations of a second processing stage;
determining that a threshold number of the event bundles in the first event batch have been processed by the first computing group;
in response to the determination, providing, during a second processing cycle, a second event batch to each of the first computing group and the second computing group;
determining that the threshold number of the event bundles in the first event batch and results from the first processing stage have been processed by the second computing group during the second processing cycle; and
providing, during a third processing cycle, a third event batch to each of the first computing group and the second computing group based on the determination that the threshold number of the event bundles in the first event batch and results from the first processing stage have been processed by the second computing group during the second processing cycle.
11. A system comprising:
a data store storing event data for a plurality of events specifying a set of events that have occurred, the set of events including advertising impressions and user interactions with advertisements, the event data for each event including a timestamp indicative of a time at which the event occurred; and
an event processing apparatus configured to interact with the data store and to perform operations comprising:
receiving event data;
assigning the events to event bundles based on the timestamps, each event bundle containing events having timestamps that are within a pre-specified period of time;
creating event batches, each event batch including a pre-specified number of event bundles;
providing, during a first processing cycle in which one or more processing stages are performed, a first event batch to each of a first computing group and a second computing group, each computing group including one or more data processing apparatus, the first computing group being a computing group that is configured to perform operations of a first processing stage, the second computing group being a computing group that is configured to perform operations of a second processing stage;
determining that a threshold number of the event bundles in the first event batch have been processed by the first computing group;
in response to the determination, providing, during a second processing cycle in which one or more processing stages are performed, a second event batch to each of the first computing group and the second computing group;
determining that the threshold number of the event bundles in the first event batch and results from the first processing stage have been processed by the second computing group during the second processing cycle; and
providing, during a third processing cycle in which one or more processing stages are performed, a third event batch to each of the first computing group and the second computing group based on the determination that the threshold number of the event bundles in the first event batch and results from the first processing stage have been processed by the second computing group during the second processing cycle.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
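The threshold-gated cycle progression recited in the independent claims can be sketched as a small coordinator that decides when the next batch may be provided. This is an illustrative model only: the class `CycleCoordinator`, its method names, the zero-based stage and batch indices, and the rule that every stage active in a cycle must reach the threshold before the next cycle starts are assumptions, not details taken from the claims.

```python
class CycleCoordinator:
    """Tracks processed-bundle counts and gates the start of each cycle."""

    def __init__(self, num_batches, threshold, num_stages=2):
        self.num_batches = num_batches
        self.threshold = threshold    # bundles that must be processed
        self.num_stages = num_stages  # stage 0 is the first processing stage
        self.cycle = 1                # cycle 1 provides batch 0 to every group
        self.processed = {}           # (stage, batch_index) -> bundles done

    def active_work(self):
        """Return the (stage, batch_index) pairs worked on in this cycle."""
        # Assumed pipeline layout: in cycle c, stage s handles batch c - 1 - s,
        # so stage 1 handles a batch one cycle after stage 0 did.
        work = []
        for stage in range(self.num_stages):
            batch_index = self.cycle - 1 - stage
            if 0 <= batch_index < self.num_batches:
                work.append((stage, batch_index))
        return work

    def report(self, stage, batch_index, bundles_done=1):
        """Record that a computing group finished some bundles of a batch."""
        key = (stage, batch_index)
        self.processed[key] = self.processed.get(key, 0) + bundles_done

    def try_advance(self):
        """Start the next cycle if every active stage met the threshold."""
        for key in self.active_work():
            if self.processed.get(key, 0) < self.threshold:
                return False
        self.cycle += 1
        return True
```

In this model the second cycle starts only after the first stage reports the threshold number of bundles for the first batch, and the third cycle starts only after the second stage reports the same for the first batch during the second cycle. Requiring the first stage to also reach the threshold for the second batch before the third cycle is an added assumption of the sketch.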
Specification