Distributing and processing streams over one or more networks
Abstract
In an embodiment, a method for distributing and processing streams over wide area networks comprises receiving, at a unified data processing node, a continuous query; determining a parallel portion of the continuous query; sending the parallel portion to a plurality of distributed data processing nodes located in a plurality of data centers; at each distributed node in the plurality of distributed nodes, locally executing the parallel portion against independent data partitions, producing a partial summary data, sending the partial summary data to the unified node; continuously receiving, at the unified node, in real-time, the partial summary data.
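The split between a parallel portion and a combining step can be illustrated with a global average, which is a hypothetical query chosen for this sketch and not one named in the source. Each distributed node reduces its own independent stream to a small partial summary (here a sum and a count), and the unified node merges only those summaries. The names `PartialSummary`, `parallel_portion`, and `combine` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PartialSummary:
    """Partial summary produced by one distributed node (hypothetical shape)."""
    total: float
    count: int


def parallel_portion(events: Iterable[float]) -> PartialSummary:
    """Runs locally at a distributed node, over that node's own stream only."""
    total, count = 0.0, 0
    for value in events:
        total += value
        count += 1
    return PartialSummary(total, count)


def combine(partials: List[PartialSummary]) -> float:
    """Runs at the unified node; only the summaries cross the network."""
    total = sum(p.total for p in partials)
    count = sum(p.count for p in partials)
    if count == 0:
        raise ValueError("no events in any partial summary")
    return total / count


# Two independent streams, one per data center; neither is copied elsewhere.
dc1_events = [10.0, 20.0, 30.0]
dc2_events = [40.0]

partials = [parallel_portion(dc1_events), parallel_portion(dc2_events)]
global_average = combine(partials)  # (60 + 40) / (3 + 1) = 25.0
```

Shipping a sum and a count instead of a local average is the design point: averaging the two local averages (20.0 and 40.0) would give 30.0, which is wrong whenever the streams differ in size.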
25 Claims
1. A method comprising:
receiving, at a unified data processing node, a continuous query;
determining a parallel portion of the continuous query;
sending the parallel portion to a plurality of distributed data processing nodes comprising at least a first distributed data processing node in a first data center and a second distributed data processing node in a second data center, wherein the first data center and the second data center are separate;
the first distributed data processing node locally processing the parallel portion against a first independent data stream to produce a first partial summary data, and sending the first partial summary data to the unified data processing node, wherein the first independent data stream is not duplicated or processed by the unified data processing node or the second distributed data processing node;
the second distributed data processing node locally processing the parallel portion against a second independent data stream to produce a second partial summary data, and sending the second partial summary data to the unified data processing node, wherein the second independent data stream is not duplicated or processed by the unified data processing node or the first distributed data processing node;
continuously receiving, at the unified data processing node, in real-time, streaming data results comprising the first partial summary data and the second partial summary data and combining the first partial summary data and the second partial summary data in preparation for returning combined streaming query results to an application; and
wherein the method is performed by one or more computing devices.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A system comprising:
a unified node comprising a processor;
a plurality of distributed data processing nodes, each comprising a processor, the plurality of distributed data processing nodes comprising at least a first distributed data processing node in a first data center and a second distributed data processing node in a second data center, wherein the first data center and the second data center are separate;
wherein the unified node is configured to:
receive a continuous query;
determine a parallel portion of the continuous query;
send the parallel portion to at least the first and second distributed data processing nodes of the plurality of distributed data processing nodes; and
receive streaming data results comprising a first partial summary data and a second partial summary data in real-time and combine the first partial summary data and the second partial summary data in preparation for returning combined streaming query results to an application; and
wherein the first distributed data processing node in the plurality of distributed data processing nodes is configured to:
process the parallel portion against a first independent data stream to produce the first partial summary data, wherein the first independent data stream is not duplicated or processed by the unified data processing node or the second distributed data processing node; and
send the first partial summary data to the unified node;
wherein the second distributed data processing node in the plurality of distributed data processing nodes is configured to:
process the parallel portion against a second independent data stream to produce the second partial summary data, wherein the second independent data stream is not duplicated or processed by the unified data processing node or the first distributed data processing node; and
send the second partial summary data to the unified node.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
25. One or more non-transitory computer-readable media storing one or more sequences of instructions which, when executed by one or more computing devices, cause:
receiving, at a unified data processing node, a continuous query;
determining a parallel portion of the continuous query;
sending the parallel portion to a plurality of distributed data processing nodes comprising at least a first distributed data processing node in a first data center and a second distributed data processing node in a second data center, wherein the first data center and the second data center are separate;
the first distributed data processing node locally processing the parallel portion against a first independent data stream to produce a first partial summary data, and sending the first partial summary data to the unified data processing node, wherein the first independent data stream is not duplicated or processed by the unified data processing node or the second distributed data processing node;
the second distributed data processing node locally processing the parallel portion against a second independent data stream to produce a second partial summary data, and sending the second partial summary data to the unified data processing node, wherein the second independent data stream is not duplicated or processed by the unified data processing node or the first distributed data processing node;
continuously receiving, at the unified data processing node, in real-time, streaming data results comprising the first partial summary data and the second partial summary data and combining the first partial summary data and the second partial summary data in preparation for returning combined streaming query results to an application.
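The final element of the independent claims, continuously receiving partial summaries and combining them, might look like the following minimal sketch. It assumes that each partial summary is a count tagged with a window identifier and a node identifier, and that a window is emitted once every expected node has reported. The windowing scheme, the tuple layout, and the name `combine_stream` are all assumptions of this sketch and are not taken from the claims.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Set, Tuple

# Hypothetical wire format: (window_id, node_id, partial_count).
Partial = Tuple[int, str, int]


def combine_stream(
    partials: Iterable[Partial], expected_nodes: Set[str]
) -> Iterator[Tuple[int, int]]:
    """Yield (window_id, combined_count) once all expected nodes have reported."""
    pending: Dict[int, Dict[str, int]] = defaultdict(dict)
    for window_id, node_id, count in partials:
        pending[window_id][node_id] = count
        # Emit as soon as the window holds a summary from every expected node.
        if set(pending[window_id]) >= expected_nodes:
            yield window_id, sum(pending.pop(window_id).values())


# Summaries arrive interleaved; dc1 runs one window ahead of dc2.
arrivals = [
    (0, "dc1", 5),
    (1, "dc1", 7),
    (0, "dc2", 3),  # window 0 is now complete
    (1, "dc2", 2),  # window 1 is now complete
]
results = list(combine_stream(arrivals, {"dc1", "dc2"}))
# results == [(0, 8), (1, 9)]
```

Because `combine_stream` is a generator, the unified node can hand each combined result to the application as soon as it is ready, without waiting for the input to end. A real deployment would also need a policy for late or missing nodes, which this sketch leaves out.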
Specification