ACCELERATING STREAM PROCESSING BY DYNAMIC NETWORK AWARE TOPOLOGY RE-OPTIMIZATION
Abstract
Aspects of the present disclosure are directed to techniques that improve the performance of streaming systems. Accordingly, we disclose efficient techniques for dynamic topology re-optimization through the use of a feedback-driven control loop; these techniques substantially solve a number of performance-impacting problems affecting such streaming systems. More particularly, we disclose a novel technique for network-aware tuple routing using consistent hashing that improves stream flow throughput in the presence of large run-time overhead. We also disclose methods for dynamic optimization of overlay topologies for group communication operations. To enable fast topology re-optimization with minimal system disruption, we present a lightweight, fault-tolerant protocol. All of the disclosed techniques were implemented in a real system and comprehensively validated on three real applications. We have demonstrated significant improvement in performance (20% to 200%) while overcoming various compute and network bottlenecks. We have shown that our performance improvements are robust to dynamic changes as well as complex congestion patterns. Given the importance of stream processing systems and the ubiquity of dynamic network state in cloud environments, our results represent a significant and practical solution to these problems and deficiencies.
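The abstract names consistent hashing as the basis for network-aware tuple routing but does not spell out a construction here. The following is a minimal sketch of one common approach, a weighted hash ring, offered as an illustration only. The class name `WeightedHashRing`, the `vnodes_per_unit` parameter, and the idea of expressing congestion as a per-destination weight are assumptions for this example, not details taken from the disclosure.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring (MD5 used only for spreading)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class WeightedHashRing:
    """Consistent-hash ring in which each destination owns a number of
    virtual nodes proportional to its weight (hypothetical design)."""

    def __init__(self, weights, vnodes_per_unit=100):
        points = []
        for dest, weight in weights.items():
            # A lower weight (e.g. a congested destination) yields fewer ring points.
            for i in range(max(1, round(weight * vnodes_per_unit))):
                points.append((_hash(f"{dest}#{i}"), dest))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._dests = [d for _, d in points]

    def route(self, tuple_key: str) -> str:
        """Return the destination owning the first ring point after the key's hash."""
        idx = bisect.bisect_right(self._hashes, _hash(tuple_key)) % len(self._hashes)
        return self._dests[idx]


# Usage: lower the weight of destination "a" as if it sat behind a congested link.
before = WeightedHashRing({"a": 1.0, "b": 1.0, "c": 1.0})
after = WeightedHashRing({"a": 0.2, "b": 1.0, "c": 1.0})
keys = [f"user-{i}" for i in range(5000)]
moved_between_others = [
    k for k in keys if before.route(k) != "a" and after.route(k) != before.route(k)
]
```

Because lowering a weight only removes ring points belonging to that destination, keys previously routed to the other destinations keep their assignment; only part of the traffic that used to reach "a" is redistributed. This is the property that makes consistent hashing attractive when routes change at run time.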
8 Claims
1. A stream processing acceleration method employing network-aware routing, said method comprising the computer implemented steps of:
- representing topology link structures using route-maps wherein said route-maps include tuple-routing information and topology structure encoded therein;
- applying, on-the-fly, the route maps into multiple operators using a topology route-map update method.
2. A method for accelerating stream processing in a network system through the effect of dynamic network-aware topology re-optimization, the method comprising the computer implemented steps of:
- choosing, for every operator, a destination operator for outgoing tuples based on route maps wherein said route maps include information on the type and proportion of traffic for each destination operator;
- collecting, by a per-topology controller, a number of metrics pertaining to the network system;
- determining any bottlenecks in the network system;
- based on the determined bottlenecks, generating, by the controller, new route maps that minimize the maximum network and CPU utilization;
- installing the new route maps in a consistent manner on a running cluster in the network system using a light-weight atomic route-update protocol.
Dependent claims: 3, 4, 5, 6, 7.
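Claim 2 states that the controller generates route maps that minimize the maximum network and CPU utilization, without fixing a particular algorithm in the text shown here. The sketch below illustrates one simple way such proportions could be computed under strong assumptions: a single upstream operator, destinations whose CPU and network capacities are both expressed in tuples per second, and no background load. The function names and the capacity figures are hypothetical.

```python
def build_route_map(capacities):
    """Return traffic proportions per destination.

    capacities maps destination -> (cpu_tuples_per_sec, net_tuples_per_sec).
    A destination is limited by its scarcer resource, so shares are made
    proportional to min(cpu, net); this equalizes the bottleneck utilization
    of all destinations, which minimizes its maximum.
    """
    effective = {dest: min(cpu, net) for dest, (cpu, net) in capacities.items()}
    if not effective or any(cap <= 0 for cap in effective.values()):
        raise ValueError("every destination needs positive CPU and network capacity")
    total = sum(effective.values())
    return {dest: cap / total for dest, cap in effective.items()}


def max_utilization(route_map, capacities, offered_rate):
    """Highest CPU-or-network utilization across destinations for a given input rate."""
    return max(
        offered_rate * share / min(capacities[dest])
        for dest, share in route_map.items()
    )


# Hypothetical metrics: op1 is network-bound, op2 is CPU-bound.
capacities = {"op1": (1000.0, 400.0), "op2": (800.0, 1200.0)}
balanced = build_route_map(capacities)
uniform = {"op1": 0.5, "op2": 0.5}
balanced_peak = max_utilization(balanced, capacities, offered_rate=600.0)
uniform_peak = max_utilization(uniform, capacities, offered_rate=600.0)
```

With these illustrative numbers the capacity-proportional map sends one third of the traffic to op1 and two thirds to op2, so both run at half of their bottleneck capacity, whereas an even split pushes op1 to three quarters. A real controller would also have to account for multi-hop topologies and shared links, which this sketch ignores.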
8. A method for accelerating stream processing in a network system through the effect of an atomic method for modifying topology route maps wherein each topology includes a controller and a spout, each having corresponding kafka queues which form their message sources, the method comprising the computer implemented steps of:
- storing, by the controller, new route-maps in its local state and then sending a new route map message, tagged by a version-id, to the spout by placing it in the spout's kafka queue;
- reading by the spout, the new route-maps message from its kafka queue, and then appending an install-routes command to the message and sending that appended message to all involved operators by piggybacking on a next start-batch tuple;
- receiving, by the spout, acknowledgements from all involved operators and in response sending a routes-installed confirmation message to the controller by placing it in the controller's kafka queue;
- storing, by the controller upon receipt of the routes-installed message, the new topology route maps in its local state and then sending an activate new routes message to the spout by placing it in the spout's kafka queue;
- upon receiving the activate new routes message, appending, by the spout, the received message onto a next start batch tuple;
- sending, by the controller, the piggybacked start-batch tuple after waiting for a successful commit of all currently executing batches;
- upon receiving all acknowledgements for the start batch tuple containing activate new routes message, sending by the spout an activated new routes message to the controller by placing it in the controller's kafka queue; and
- marking, by the controller, a successful completion.
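The message flow recited in claim 8 resembles a two-phase (install, then activate) update. The following in-memory sketch walks through that flow as an illustration only. Plain `deque` objects stand in for the Kafka queues, the class and message names are hypothetical, and the sketch omits the wait for currently executing batches to commit as well as any failure handling.

```python
from collections import deque


class Operator:
    def __init__(self):
        self.active = {}     # route map currently used for routing
        self.staged = None   # (version, routes) installed but not yet active

    def on_start_batch(self, command):
        """Handle a command piggybacked on a start-batch tuple and acknowledge it."""
        kind, version, routes = command
        if kind == "install-routes":
            self.staged = (version, routes)
        elif kind == "activate-new-routes" and self.staged and self.staged[0] == version:
            self.active, self.staged = self.staged[1], None
        return ("ack", version)


class Spout:
    def __init__(self, operators, controller_queue):
        self.queue = deque()  # stand-in for the spout's Kafka queue
        self.operators = operators
        self.controller_queue = controller_queue

    def step(self):
        kind, version, routes = self.queue.popleft()
        if kind == "new-route-map":
            command, reply = ("install-routes", version, routes), "routes-installed"
        else:
            command, reply = ("activate-new-routes", version, None), "activated-new-routes"
        acks = [op.on_start_batch(command) for op in self.operators]
        if all(ack == ("ack", version) for ack in acks):
            self.controller_queue.append((reply, version))


class Controller:
    def __init__(self):
        self.queue = deque()  # stand-in for the controller's Kafka queue
        self.pending = {}     # version -> routes kept in local state
        self.completed = set()

    def propose(self, spout, version, routes):
        self.pending[version] = routes
        spout.queue.append(("new-route-map", version, routes))

    def step(self, spout):
        kind, version = self.queue.popleft()
        if kind == "routes-installed":
            spout.queue.append(("activate-new-routes", version, None))
        elif kind == "activated-new-routes":
            self.pending.pop(version, None)
            self.completed.add(version)


# Usage: drive one complete update by hand.
operators = [Operator(), Operator()]
controller = Controller()
spout = Spout(operators, controller.queue)
controller.propose(spout, version=1, routes={"op-b": 1.0})
spout.step()                      # phase 1: operators stage the new routes
staged_only = all(op.active == {} and op.staged is not None for op in operators)
controller.step(spout)            # controller sees routes-installed
spout.step()                      # phase 2: operators switch to the new routes
controller.step(spout)            # controller marks completion
```

Separating installation from activation means no operator routes with the new map until every involved operator has acknowledged holding it, which is what lets the switch appear atomic at a batch boundary.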
Specification