Ordered iteration for data update management

US 8,539,094 B1
Filed: 03/31/2011
Issued: 09/17/2013
Est. Priority Date: 03/31/2011
Status: Active Grant

First Claim

Patent Images

1. A computer-implemented method of managing data in a networked environment, comprising:

under control of one or more computer systems configured with executable instructions,receiving workload data for a designated host server of a group of host servers selected to process the workload data for a customer, the group of host servers being connected by a network aggregation fabric including layers of network switches, a path across the aggregation fabric to each host server involving a number of connections across the network switches, the group of host servers being selected to process the workload data for the customer and being dispersed across a number of network switches for at least a lowest layer of the aggregation fabric;

routing the workload data to the designated host server and processing the workload data using the designated host server;

measuring one or more transmission patterns of the workload;

determining an ordering of other host servers in the group to which to send updates to the workload data based upon the measured one or more transmission patterns, wherein the determined ordering is selected in order to statistically minimize a likelihood of network congestion based on known transmission patterns of the workload, and each host server in the group capable of having a different ordering; and

in response to determining one or more updates to the workload data to be sent to the other host servers in the group, sending the updates to the other host servers according to the determined ordering,wherein updates to be periodically shared across all the host servers in the group are sent with determined orderings in order to reduce a statistical likelihood of network congestion due to flow convergence.

View all claims

1 Assignment

Timeline View

Assignment View

0 Petitions

Accused Products

Abstract

Host machines and other devices performing synchronized operations can be dispersed across multiple racks in a data center to provide additional buffer capacity and to reduce the likelihood of congestion. The level of dispersion can depend on factors such as the level of oversubscription, as it can be undesirable in a highly connected network to push excessive host traffic into the aggregation fabric. As oversubscription levels increase, the amount of dispersion can be reduced and two or more host machines can be clustered on a given rack, or otherwise connected through the same edge switch. By clustering a portion of the machines, some of the host traffic can be redirected by the respective edge switch without entering the aggregation fabric. When provisioning hosts for a customer, application, or synchronized operation, for example, the levels of clustering and dispersion can be balanced to minimize the likelihood for congestion throughout the network.

Citations

24 Claims

1. A computer-implemented method of managing data in a networked environment, comprising:
- under control of one or more computer systems configured with executable instructions,receiving workload data for a designated host server of a group of host servers selected to process the workload data for a customer, the group of host servers being connected by a network aggregation fabric including layers of network switches, a path across the aggregation fabric to each host server involving a number of connections across the network switches, the group of host servers being selected to process the workload data for the customer and being dispersed across a number of network switches for at least a lowest layer of the aggregation fabric;
  
  routing the workload data to the designated host server and processing the workload data using the designated host server;
  
  measuring one or more transmission patterns of the workload;
  
  determining an ordering of other host servers in the group to which to send updates to the workload data based upon the measured one or more transmission patterns, wherein the determined ordering is selected in order to statistically minimize a likelihood of network congestion based on known transmission patterns of the workload, and each host server in the group capable of having a different ordering; and
  
  in response to determining one or more updates to the workload data to be sent to the other host servers in the group, sending the updates to the other host servers according to the determined ordering,wherein updates to be periodically shared across all the host servers in the group are sent with determined orderings in order to reduce a statistical likelihood of network congestion due to flow convergence.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23)
- - 2. The computer-implemented method of claim 1, wherein the determined ordering is selected based at least in part upon an absolute ordering of Internet protocol (IP) addresses.
  - 3. The computer-implemented method of claim 1, wherein the determined ordering is selected based at least in part upon an absolute ordering of port numbers in a Transmission Control Protocol (TCP) or User Datagram Protocol (UDP) addressing strategy.
  - 4. The computer-implemented method of claim 1, wherein adjacent host servers in the ordering are connected to different network switches in the lowest layer of the aggregation fabric.
  - 5. The computer-implemented method of claim 1, wherein adjacent host servers in the ordering are connected to a same host server-connected switch.
  - 6. The computer-implemented method of claim 1, wherein the aggregation fabric is in one of an oversubscribed or non-oversubscribed state.
  - 7. The computer-implemented method of claim 1, wherein the group of host servers are able to send updates in parallel.
  - 8. The computer-implemented method of claim 7, wherein the group of host servers are segmented into a first half and a second half, and wherein the host servers in one of the first and second halves are each able to send the workload data to the host servers in the other half at full rate and in parallel.
  - 9. The computer-implemented method of claim 1, wherein multicast or broadcast network traffic is converted to iterative unicast on the sending host server before transmitting to the aggregation fabric.
  - 10. The computer-implemented method of claim 1, wherein the aggregation fabric is in one of an oversubscribed or non-oversubscribed state.
  - 11. The computer-implemented method of claim 1, wherein the other host servers are not synchronized with the sending host server.
  - 12. The computer-implemented method of claim 1, wherein an absolute ordering is implemented in an Application Programmer Interface (API) library which is configured to manage network traffic.
  - 13. The computer-implemented method of claim 12, wherein the network traffic includes at least one of broadcast traffic and multicast traffic.
  - 14. The computer-implemented method of claim 12, wherein the API is implemented in a Message Passing Interface (MPI) or open message passing (OpenMP) library.
  - 15. The computer-implemented method of claim 12, wherein the API is implemented in an Internet Protocol (IP) library.
  - 16. The computer-implemented method of claim 1, wherein multiple flows of network traffic from the customer are able to be received concurrently to the aggregation fabric and directed to a common host server.
  - 17. The computer-implemented method of claim 16, wherein at least one of the multiple flows is capable of being throttled using a back-off control mechanism.
  - 18. The computer-implemented method of claim 17, wherein oversubscription is implemented at a network switch based at least in part on a statistical probability of an amount of network traffic in a route comprising the network switch.
  - 19. The computer-implemented method of claim 1, wherein host servers selected to process network data for the customer are distributed across as many network switches at the lowest layer of the aggregation fabric as possible.
  - 20. The computer-implemented method of claim 1, wherein host servers selected to process similar network traffic for the customer are distributed across a plurality of sub-networks.
  - 21. The computer-implemented method of claim 1, wherein at least one cluster of network switches for performing an operation is a network with re-arrangeably non-blocking bandwidth.
  - 22. The computer-implemented method of claim 21, wherein at least one cluster of network switches for performing an operation is a Clos topology network.
  - 23. The computer-implemented method of claim 1, wherein at least one cluster of network switches utilizes a network topology selected from a hierarchical router pair or an irregular ad hoc network topology.

24. A computer-implemented method of managing data in a networked environment, comprising:
- under control of one or more computer systems configured with executable instructions,receiving workload data for one of a group of host servers selected to process the workload data for a customer, the group of host servers being connected by a network aggregation fabric including layers of network switches, a path across the aggregation fabric to each host server involving a number of connections across the network switches, the group of host servers being selected to process the workload data for the customer and being dispersed across a number of network switches for at least a lowest layer of the aggregation fabric;
  
  determining an absolute ordering of the group of host servers selected to process workload data for the customer;
  
  routing the workload data to a currently selected host server in the absolute ordering and processing the workload data using the processing host server;
  
  measuring one or more transmission patterns of the workload;
  
  determining an ordering of other host servers in the group to which to send updates to the workload data based upon the measured one or more transmission patterns, wherein the determined ordering is selected in order to statistically minimize a likelihood of network congestion based on known transmission patterns of the workload, and each host server in the group capable of having a different ordering; and
  
  in response to determining one or more updates to the workload data to be sent to the other host servers in the group, sending the updates to the other host servers according to the determined ordering,wherein updates to be periodically shared across all the host servers in the group are sent with determined orderings in order to reduce a statistical likelihood of network congestion due to flow convergence.

Specification

Resources

Litigation Campaign Assessment

Current Assignee
Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee
Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors
Marr, Michael David
Primary Examiner(s)
WON, MICHAEL YOUNG

Application Number

US13/076,961
Time in Patent Office

901 Days
Field of Search

709/203, 709/231, 709/235, 709/201, 709/226, 709/241, 370/235, 370/412
US Class Current

709/235
CPC Class Codes

H04L 47/125 by balancing the load, e.g....

H04L 47/26 using explicit feedback to ...

Ordered iteration for data update management

First Claim

1 Assignment

0 Petitions

Accused Products

Abstract

Citations

24 Claims

Specification

Solutions

Use Cases

Quick Links

Ordered iteration for data update management

First Claim

1 Assignment

Subscription Required

Subscription Required

0 Petitions

Subscription Required

Accused Products

Subscription Required

Abstract

Citations

24 Claims

Specification

Subscription Required

Solutions

Use Cases

Quick Links