Processing data feeds
First Claim
Patent Images
1. A computer-implemented method for processing of real-time data streams, the method comprising:
- receiving in a real-time manner, at a first worker process, a first input stream event in a first real-time input data stream comprising a plurality of stream eventsprocessing in a real-time manner, at the first worker process, the first input stream event in a first map operation to generate first intermediate output data;
transforming the first intermediate output data at the first worker process, to generate a first intermediate stream event associated with at least one slate that records a set of related stream events;
transmitting in a real-time manner, using the first worker process, the first intermediate stream event in a first real-time intermediate data stream;
receiving in a real-time manner, at a second worker process, the first intermediate stream event in the first real-time intermediate data stream;
processing in a real-time manner, at the second worker process, the first intermediate output data in the first intermediate stream event in a first update operation to generate first final output data; and
storing the first final output data in a first data structure associated with the first intermediate output data on a storage device.
2 Assignments
0 Petitions
Accused Products
Abstract
Exemplary embodiments allow performance of stream computations on real-time data streams using one or more map operations and/or one or more update operations. A map operation is a stream computation in which stream events in one or more real-time data streams are processed in a real-time manner to generate zero, one or more new stream events. An update operation is a stream computation in which stream events in one or more real-time data streams are processed in a real-time manner to create or update one or more static “slate” data structures that are stored in a durable manner.
187 Citations
26 Claims
-
1. A computer-implemented method for processing of real-time data streams, the method comprising:
-
receiving in a real-time manner, at a first worker process, a first input stream event in a first real-time input data stream comprising a plurality of stream events processing in a real-time manner, at the first worker process, the first input stream event in a first map operation to generate first intermediate output data; transforming the first intermediate output data at the first worker process, to generate a first intermediate stream event associated with at least one slate that records a set of related stream events; transmitting in a real-time manner, using the first worker process, the first intermediate stream event in a first real-time intermediate data stream; receiving in a real-time manner, at a second worker process, the first intermediate stream event in the first real-time intermediate data stream; processing in a real-time manner, at the second worker process, the first intermediate output data in the first intermediate stream event in a first update operation to generate first final output data; and storing the first final output data in a first data structure associated with the first intermediate output data on a storage device. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
-
-
12. A distributed computational system, comprising:
-
a computer-readable storage device for storing computer-executable code associated with a first map operation and a first update operation, and for storing static data output by the first update operation; a scheduling module for scheduling the first map operation to run on a first worker node and the first update operation to run on a second worker node; the first worker node programmed to; receive in a real-time manner a first input stream event in a first real-time input data stream comprising a plurality of stream events, run the first map operation to process in a real-time manner the first input stream event to generate first intermediate output data, generate a first intermediate stream event associated with at least one slate that records a set of related stream events, and transmit the first intermediate stream event in a first real-time intermediate data stream in a real-time manner; and the second worker node programmed to; receive in a real-time manner the first intermediate stream event in the first real-time intermediate data stream, run the first update operation to process in a real-time manner the first intermediate output data in the first intermediate stream event to generate first final output data, and store the first final output data in a first data structure associated with the first intermediate output data on the storage device. - View Dependent Claims (13, 14, 15)
-
-
16. One or more non-transitory computer-readable storage media having encoded thereon computer-executable instructions for performing a method for processing real-time data streams, the method comprising:
-
receiving in a real-time manner, at a first worker process, a first input stream event in a first real-time input data stream comprising a plurality of stream events; processing in a real-time manner, at the first worker process, the first input stream event in a first map operation to generate first intermediate output data; transforming the first intermediate output data, at the first worker process, to generate a first intermediate stream event associated with at least one slate that records a set of related stream events; transmitting in a real-time manner, using the first worker process, the first intermediate stream event in a first real-time intermediate data stream; receiving in a real-time manner, at a second worker process, the first intermediate stream event in the first real-time intermediate data stream; processing in a real-time manner, at the second worker process, the first intermediate output data in the first intermediate stream event in a first update operation to generate first final output data; and storing the first final output data in a first data structure associated with the first intermediate output data on a storage device. - View Dependent Claims (17, 18, 19, 20, 21, 22, 23, 24)
-
-
25. A computer-implemented method for processing of real-time data streams, the method comprising:
-
receiving in a real-time manner, at a first worker process running on a first computational device, a first stream event in a first real-time data stream; processing in a real-time manner, at the first worker process at the first computational device, the first stream event in a first map operation to generate first output data; transforming the first output data, at the first worker process at the first computational device, to generate a second stream event associated with at least one slate that records a set of related stream events; and transmitting the second stream event in a real-time manner using the first worker process at the first computational device.
-
-
26. A computer-implemented method for processing of real-time data streams, the method comprising:
-
receiving in a real-time manner, at a first worker process running on a first computational device, a first stream event in a first real-time data stream, the first stream event comprising first input data; processing in a real-time manner, at the first worker process at the first computational device, the first input data contained in the first stream event in a first map operation to generate first output data; transforming the first output data, at the first worker process at the first computational device, to generate or update a first data structure associated with at least one slate that records a set of related stream events; and storing the first data structure on a durable storage device.
-
Specification