Constructing a data pipeline having scalability and low latency
First Claim
1. A method comprising:
receiving event data of an event stream at an event data distributor of an event data distributor cluster of a plurality of event data distributor clusters, the plurality of event data distributor clusters comprising a primary event data distributor cluster and at least one non-primary event data distributor cluster, the plurality of event data distributor clusters being arranged in series, in a cascade configuration, and forming an event stream data pipeline for processing the event stream, and each event data distributor cluster distributing, to a plurality of event consumers associated with the event data distributor cluster, data resulting from the event stream data pipeline processing, each primary event data distributor cluster and each non-primary event data distributor cluster comprising a number of event data distributors, each event data distributor of the number comprising a computing device, the event stream comprising a plurality of events collected from online user behavior comprising online search, click and browse behavior received from a plurality of end user computing devices;
providing, by the event data distributor, a number of plug-in component interfaces to a plurality of computing devices of the plurality of event consumers, each interface corresponding to one of a number of plug-in components;
receiving, by the event data distributor from the number of plug-in component interfaces provided to the plurality of computing devices of the plurality of event consumers, a number of event specifications from the plurality of event consumers, each received event specification of the received event specifications corresponding to an event consumer of the plurality and identifying which collected events of the plurality are of interest to the event consumer, the one or more plug-in component interfaces comprising an interface of a partitioning plug-in component from which partitioning information to be processed by the partitioning plug-in component is received, the partitioning information specifying a manner in which the event stream is to be partitioned in accordance with interests of the plurality of event consumers, a number of partitions of the event stream comprising a partitioning of online user browser behavior for at least one geographic area that is of interest to at least one of the plurality of event consumers;
processing, by the event data distributor using the one or more plug-in components, the received event data of the event stream to identify, for each event consumer of the plurality of event consumers, data about each event of the plurality that is of interest to each event consumer of the plurality, the event data distributor processing the one or more plug-in components in an order determined by the event data distributor; and
sending, by the event data distributor over an electronic communications network to a computing device of each event consumer of the plurality of event consumers, the data about the one or more events of the plurality from the event stream in accordance with each event consumer's interest, the data sent to an event consumer's computing device resulting in the event consumer's computing device processing the data sent to the event consumer's computing device to identify at least one advertisement to present at one or more end user computing devices.
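The steps recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation: the `Consumer`, `Distributor`, `partition_by_geo` and `filter_by_spec` names, the dictionary event shape, and the use of an integer to fix plug-in order are all assumptions made for illustration, and the network send is replaced by appending to an in-memory inbox.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical event shape, e.g. {"type": "click", "geo": "US"}.
Event = Dict[str, str]


@dataclass
class Consumer:
    name: str
    event_types: Set[str]  # assumed form of the consumer's event specification
    geo: str               # assumed form of the geographic partition of interest
    inbox: List[Event] = field(default_factory=list)


# A plug-in narrows the list of consumers interested in one event.
Plugin = Callable[[Event, List[Consumer]], List[Consumer]]


def partition_by_geo(event: Event, consumers: List[Consumer]) -> List[Consumer]:
    """Partitioning plug-in: keep consumers whose geographic area matches."""
    return [c for c in consumers if c.geo == event["geo"]]


def filter_by_spec(event: Event, consumers: List[Consumer]) -> List[Consumer]:
    """Filtering plug-in: keep consumers whose specification names this event type."""
    return [c for c in consumers if event["type"] in c.event_types]


class Distributor:
    def __init__(self) -> None:
        self.consumers: List[Consumer] = []
        self._plugins: List[Tuple[int, Plugin]] = []

    def add_plugin(self, order: int, plugin: Plugin) -> None:
        self._plugins.append((order, plugin))

    def process(self, events: List[Event]) -> None:
        for event in events:
            interested = list(self.consumers)
            # The distributor, not the consumer, decides the plug-in order.
            for _, plugin in sorted(self._plugins, key=lambda p: p[0]):
                interested = plugin(event, interested)
            for consumer in interested:
                consumer.inbox.append(event)  # stand-in for a network send


us_ads = Consumer("us_ads", {"click", "search"}, "US")
de_ads = Consumer("de_ads", {"browse"}, "DE")

distributor = Distributor()
distributor.consumers += [us_ads, de_ads]
distributor.add_plugin(2, filter_by_spec)
distributor.add_plugin(1, partition_by_geo)  # runs first despite later registration
distributor.process([
    {"type": "click", "geo": "US"},
    {"type": "browse", "geo": "DE"},
    {"type": "browse", "geo": "US"},
    {"type": "search", "geo": "DE"},
])
```

In this sketch each consumer ends up with only the events that match both its geographic partition and its event specification; a consumer-side step (not shown) would then use those events to select an advertisement.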
9 Assignments
0 Petitions
Abstract
A method and a system are provided for constructing a data pipeline having scalability and low latency. In one example, the system provides a primary data distributor cluster. The system provides one or more non-primary data distributor clusters. The system arranges a cascade configuration that includes the primary data distributor cluster and the one or more non-primary data distributor clusters.
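As a rough illustration of the cascade configuration described in the abstract, the following sketch chains clusters in series so that each cluster serves its own consumers and then passes the event downstream. The `Cluster` class, the `build_cascade` helper, and the cluster names are hypothetical, and forwarding every event unchanged is an assumption; the source does not say what each stage forwards.

```python
from typing import List, Optional


class Cluster:
    """One data distributor cluster in the cascade (hypothetical model)."""

    def __init__(self, name: str, downstream: Optional["Cluster"] = None) -> None:
        self.name = name
        self.downstream = downstream
        self.delivered: List[str] = []

    def receive(self, event: str) -> None:
        # Stand-in for distributing the event to this cluster's consumers.
        self.delivered.append(event)
        # Forward to the next cluster in the series, if there is one.
        if self.downstream is not None:
            self.downstream.receive(event)


def build_cascade(names: List[str]) -> Cluster:
    """Link clusters in series; the first name becomes the primary cluster."""
    if not names:
        raise ValueError("a cascade needs at least a primary cluster")
    downstream: Optional[Cluster] = None
    for name in reversed(names):
        downstream = Cluster(name, downstream)
    return downstream


primary = build_cascade(["primary", "non-primary-1", "non-primary-2"])
primary.receive("event-1")

# Walk the chain to collect the order in which clusters are arranged.
chain: List[Cluster] = []
node: Optional[Cluster] = primary
while node is not None:
    chain.append(node)
    node = node.downstream
```

Adding capacity in this model means appending another non-primary cluster to the chain rather than attaching more consumers to the primary cluster.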
11 Citations
16 Claims
1. A method comprising:
receiving event data of an event stream at an event data distributor of an event data distributor cluster of a plurality of event data distributor clusters, the plurality of event data distributor clusters comprising a primary event data distributor cluster and at least one non-primary event data distributor cluster, the plurality of event data distributor clusters being arranged in series, in a cascade configuration, and forming an event stream data pipeline for processing the event stream, and each event data distributor cluster distributing, to a plurality of event consumers associated with the event data distributor cluster, data resulting from the event stream data pipeline processing, each primary event data distributor cluster and each non-primary event data distributor cluster comprising a number of event data distributors, each event data distributor of the number comprising a computing device, the event stream comprising a plurality of events collected from online user behavior comprising online search, click and browse behavior received from a plurality of end user computing devices;
providing, by the event data distributor, a number of plug-in component interfaces to a plurality of computing devices of the plurality of event consumers, each interface corresponding to one of a number of plug-in components;
receiving, by the event data distributor from the number of plug-in component interfaces provided to the plurality of computing devices of the plurality of event consumers, a number of event specifications from the plurality of event consumers, each received event specification of the received event specifications corresponding to an event consumer of the plurality and identifying which collected events of the plurality are of interest to the event consumer, the one or more plug-in component interfaces comprising an interface of a partitioning plug-in component from which partitioning information to be processed by the partitioning plug-in component is received, the partitioning information specifying a manner in which the event stream is to be partitioned in accordance with interests of the plurality of event consumers, a number of partitions of the event stream comprising a partitioning of online user browser behavior for at least one geographic area that is of interest to at least one of the plurality of event consumers;
processing, by the event data distributor using the one or more plug-in components, the received event data of the event stream to identify, for each event consumer of the plurality of event consumers, data about each event of the plurality that is of interest to each event consumer of the plurality, the event data distributor processing the one or more plug-in components in an order determined by the event data distributor; and
sending, by the event data distributor over an electronic communications network to a computing device of each event consumer of the plurality of event consumers, the data about the one or more events of the plurality from the event stream in accordance with each event consumer's interest, the data sent to an event consumer's computing device resulting in the event consumer's computing device processing the data sent to the event consumer's computing device to identify at least one advertisement to present at one or more end user computing devices.
View Dependent Claims (2, 3, 4, 5, 6, 7)
8. A system comprising:
an event data distributor of an event data distributor cluster of a plurality of event data distributor clusters, the plurality of event data distributor clusters comprising a primary event data distributor cluster and at least one non-primary event data distributor cluster, the plurality of event data distributor clusters being arranged in series, in a cascade configuration, and forming an event stream data pipeline for processing the event stream, and each event data distributor cluster distributing, to a plurality of event consumers associated with the event data distributor cluster, data resulting from the event stream data pipeline processing, each primary event data distributor cluster and each non-primary event data distributor cluster comprising a number of event data distributors, each event data distributor of the number comprising a computing device, the computing device comprising one or more processors and a storage medium for tangibly storing thereon program logic for execution by the one or more processors, the stored program logic comprising;
receiving logic executed by the one or more processors for receiving event data of an event stream, the event stream comprising a plurality of events collected from online user behavior comprising online search, click and browse behavior received from a plurality of end user computing devices;
providing logic executed by the one or more processors for providing a number of plug-in component interfaces to a plurality of computing devices of the plurality of event consumers, each interface corresponding to one of a number of plug-in components;
receiving logic executed by the one or more processors for receiving, from the number of plug-in component interfaces of a number of plug-in components provided to the plurality of computing devices of the plurality of event consumers, a number of event specifications from the plurality of event consumers, each received event specification of the received event specifications corresponding to an event consumer of the plurality and identifying which collected events of the plurality are of interest to the event consumer, the one or more plug-in component interfaces comprising an interface of a partitioning plug-in component from which partitioning information to be processed by the partitioning plug-in component is received, the partitioning information specifying a manner in which the event stream is to be partitioned in accordance with interests of the plurality of event consumers, a number of partitions of the event stream comprising a partitioning of online user browse behavior for at least one geographic area that is of interest to at least one of the number of event consumers;
processing logic executed by the one or more processors for processing, using the one or more plug-in components, the event data stream to identify, for each event consumer of the plurality of event consumers, data about each event of the plurality that is of interest to each event consumer of the plurality, the processing logic processing the one or more plug-in components in an order determined by the event data distributor; and
sending logic executed by the one or more processors for sending, over an electronic communications network to a computing device of each event consumer of the plurality of event consumers, the data about the one or more events to the plurality from the event stream in accordance with each event consumer's interest, the data sent to an event consumer's computing device resulting in the event consumer's computing device processing the data sent to the event consumer's computing device to identify at least one advertisement to present at one or more end user computing devices.
View Dependent Claims (9, 10, 11, 12, 13, 14, 15)
16. A non-transitory computer readable medium carrying one or more processor-executable instructions:
the instructions for an event data distributor of an event data distributor cluster of a plurality of event data distributor clusters, the plurality of event data distributor clusters comprising a primary event data distributor cluster and at least one non-primary event data distributor cluster, the plurality of event data distributor clusters being arranged in series, in a cascade configuration, and forming an event stream data pipeline for processing the event stream, and each event data distributor cluster distributing, to a plurality of event consumers associated with the event data distributor cluster, data resulting from the event stream data pipeline processing, each primary event data distributor cluster and each non-primary event data distributor cluster comprising a number of event data distributors, each event data distributor of the number comprising one or more processors for processing event streams, the instructions, when executed, cause an event data distributor's one or more processors to;
receive event data of an event stream, the event stream comprising a plurality of events collected from online user behavior comprising online search, click and browse behavior received from a plurality of end user computing devices;
provide a number of plug-in component interfaces to a plurality of computing devices of a plurality of event consumers, each interface corresponding to one of a number of plug-in components;
receive, from the number of plug-in component interfaces provided to the plurality of computing devices of the plurality of event consumers, a number of event specifications from the plurality of event consumers, each received event specification of the received event specifications corresponding to an event consumer of the plurality and identifying which collected events of the plurality are of interest to the event consumer, the one or more plug-in component interfaces comprising an interface of a partitioning plug-in component from which partitioning information to be processed by the partitioning plug-in component is received, the partitioning information specifying a manner in which the event stream is to be partitioned in accordance with interests of the plurality of event consumers, a number of partitions of the event stream comprising a partitioning of online user browse behavior for at least one geographic area that is of interest to at least one of the plurality of event consumers;
process, using the one or more plug-in components, the event stream to identify, for each event consumer of the plurality of event consumers, data about each event of the plurality that is of interest to each event consumer of the plurality, the event data distributor processing the one or more plug-in components in an order determined by the event data distributor; and
send, over an electronic communications network to a computing device of each event consumer of the plurality of event consumers, the data about the one or more events of the plurality from the event stream in accordance with each event consumer's interest, the data sent to an event consumer's computing device resulting in the event consumer's computing device processing the data sent to the event consumer's computing device to identify at least one advertisement to present to one or more end user computing devices.
Specification