Dynamic streaming data dispatcher
First Claim
Patent Images
1. A method, comprising:
- receiving, by a computing device, a data stream from a source to be stored on a distributed data store;
distributing the data stream to a plurality of sinks on multiple hosts by sending a plurality of data tuples to the plurality of sinks, wherein each of the sinks on the multiple hosts are configured to buffer the data stream and wherein the distributed data store is configured to receive data from each of the multiple hosts, and wherein distributing the data stream to the plurality of sinks on multiple hosts comprises sending a data tuple from the plurality of data tuples to two or more of the plurality of sinks;
receiving load information indicating a load on at least one of the plurality of sinks and responsively adjusting the distribution of the data stream;
sending, by the computing device, to a first sink from the plurality of sinks, a control tuple, the control tuple comprising one of a write request or a discard request, the write request instructing the first sink to write data stored on the first sink to the distributed data store based on a buffer size and a load on the first sink, and the discard request instructing the first sink to discard the data stored on the first sink;
receiving a result tuple from the first sink indicating a success or failure of writing the data stream to the distributed data store; and
sending the control tuple to a second sink from the plurality of sinks in response to the result tuple from the first sink indicating a failure of writing the data stream to the distributed data store by the first sink.
1 Assignment
0 Petitions
Accused Products
Abstract
A method includes receiving, by a computing device, a plurality of data streams from plurality of sources, distributing the data streams to a plurality of sinks on multiple hosts, receiving load information indicating a load on at least one of the plurality of sinks and adjusting the distribution of the data stream accordingly and instructing the plurality of sinks to write the data streams to a distributed data store.
-
Citations
14 Claims
-
1. A method, comprising:
-
receiving, by a computing device, a data stream from a source to be stored on a distributed data store; distributing the data stream to a plurality of sinks on multiple hosts by sending a plurality of data tuples to the plurality of sinks, wherein each of the sinks on the multiple hosts are configured to buffer the data stream and wherein the distributed data store is configured to receive data from each of the multiple hosts, and wherein distributing the data stream to the plurality of sinks on multiple hosts comprises sending a data tuple from the plurality of data tuples to two or more of the plurality of sinks; receiving load information indicating a load on at least one of the plurality of sinks and responsively adjusting the distribution of the data stream; sending, by the computing device, to a first sink from the plurality of sinks, a control tuple, the control tuple comprising one of a write request or a discard request, the write request instructing the first sink to write data stored on the first sink to the distributed data store based on a buffer size and a load on the first sink, and the discard request instructing the first sink to discard the data stored on the first sink; receiving a result tuple from the first sink indicating a success or failure of writing the data stream to the distributed data store; and sending the control tuple to a second sink from the plurality of sinks in response to the result tuple from the first sink indicating a failure of writing the data stream to the distributed data store by the first sink. - View Dependent Claims (2, 3, 4, 5)
-
-
6. A method, comprising:
-
receiving, by a computing device, a plurality of data streams from a plurality of data sources to be stored on a distributed data store; distributing the plurality data stream to a plurality of sinks on multiple hosts by sending a plurality of data tuples to the plurality of sinks, wherein each of the sinks on the multiple hosts are configured to buffer the data stream and wherein the distributed data store is configured to receive data from each of the multiple hosts, wherein distributing the data stream to the plurality of sinks on multiple hosts comprises sending a data tuple from the plurality of data tuples to two or more of the plurality of sinks; receiving load information indicating a load on at least one of the plurality of sinks and responsively adjusting the distribution of the data stream; sending, by the computing device, to a first sink from the plurality of sinks, a control tuple, the control tuple comprising one of a write request or a discard request, the write request instructing the first sink to write data stored on the first sink to the distributed data store based on a buffer size and a load on the first sink, and the discard request instructing the first sink to discard the data stored on the first sink; receiving a result tuple from the first sink indicating a success or failure of writing the data stream to the distributed data store in response to the control tuple; and sending the control tuple to a second sink from the plurality of sinks in response to the result tuple from the first sink indicating a failure of writing the data stream to the distributed data store by the first sink. - View Dependent Claims (7, 8)
-
-
9. A computer program product for distributing incoming data streams, the computer program product comprising:
-
a non-transitory computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising; computer readable program code configured to; receive, by a computing device a plurality of data streams from a plurality of data sources to be stored on distributed data store; distribute the plurality data stream to a plurality of sinks on multiple hosts by sending a plurality of data tuples to the plurality of sinks, wherein each of the sinks on the multiple hosts are configured to buffer the data stream and wherein the distributed data store is configured to receive data from each of the multiple hosts, wherein distributing the data stream to the plurality of sinks on multiple hosts comprises sending a data tuple from the plurality of data tuples to two or more of the plurality of sinks; receive load information indicating a load on at least one of the plurality of sinks and responsively adjust the distribution of the data stream; send, by the computing device, to a first sink from the plurality of sinks, a control tuple, the control tuple comprising one of a write request or a discard request, the write request indicative for the first sink to write data stored on the first sink to the distributed data store based on a buffer size and a load on the first sink, and the discard request indicative for the first sink to discard the data stored on the first sink; receive a result tuple from the first sink indicating a success or failure of writing the data stream to the distributed data store in response to the control tuple; and send the control tuple to a second sink from the plurality of sinks in response to the result tuple from the first sink indicating a failure of writing the data stream to the distributed data store by the first sink. - View Dependent Claims (10)
-
-
11. A system, comprising:
-
a splitter for receiving a data stream from a data source; a plurality of sinks in operable communication with the splitter; a distributed data storage device in operable communication with the plurality of sinks; and a load manager in operable communication with the plurality of sinks and the splitter, wherein the load manager instructs the splitter on distribution of the data stream to the plurality of sinks and instructs the plurality of sinks to write data stored on the plurality of sinks to the distributed data store based on a buffer size and a load on the plurality of the sinks; and
wherein the splitter is configured to;split the data stream into a plurality portions; send a first data tuple from comprising a portion of the data stream to a first sink and a second sink, the first sink and the second sink being from the plurality of sinks; send a control tuple to the first sink, the control tuple comprising a write request for the first sink to write the portion of the data stream in the data tuple to the distributed data store; receive a result tuple from the first sink indicating a success or failure of writing the portion of the data stream to the distributed data store; and send the control tuple to a second sink from the plurality of sinks in response to the result tuple from the first sink indicating a failure of writing the data stream to the distributed data store by the first sink. - View Dependent Claims (12, 13, 14)
-
Specification