Distributed computing system node management
Abstract
Techniques for distributed computing system node management are described herein. In some cases, internal compute nodes (i.e., compute nodes that are allocated to the distributed system) may be mutually trusted such that they may freely establish communications with one another. By contrast, external compute nodes (i.e., compute nodes that aren't allocated to the distributed computing system) may be untrusted such that their access to the distributed system may be regulated. In some cases, one or more of the compute nodes within the distributed computing system may maintain respective collections of system view information. Each respective collection of system view information may include, for example, information associated with the corresponding compute node's view of the distributed computing system based on information that is available to the corresponding compute node.
26 Citations
12 Claims
1. A computer-implemented method for exchanging information associated with a distributed computing system comprising:
- maintaining, by each of a plurality of compute nodes in the distributed computing system, a respective collection of latency information for communicating with one or more services executing on each other computing node in the distributed computing system, the plurality of compute nodes comprising a first compute node executing a first service and two or more other compute nodes that execute a plurality of at least partially redundant services, wherein the latency information is exchanged between the plurality of compute nodes using an epidemic protocol; and
- using, by the first service on the first compute node, a first collection of the latency information maintained by the first compute node to select, from the plurality of at least partially redundant services, a second service executing on a second compute node with which to interact.
Dependent claims: 2, 3, 4
5. A distributed computing system comprising:
- one or more processors;
- one or more memories to store a set of instructions, which if executed by the one or more processors, causes the one or more processors to perform operations comprising:
  - maintaining, by each of a plurality of compute nodes in the distributed computing system, a respective collection of latency information for communicating with one or more services executing on each other computing node in the distributed computing system, the plurality of compute nodes comprising a first compute node executing a first service and two or more other compute nodes that execute a plurality of at least partially redundant services, wherein the latency information is exchanged between the plurality of compute nodes using an epidemic protocol; and
  - using, by the first service on the first compute node, a first collection of the latency information maintained by the first compute node to select, from the plurality of at least partially redundant services, a second service executing on a second compute node with which to interact.
Dependent claims: 6, 7, 8
9. One or more non-transitory computer-readable storage media having stored thereon instructions that, upon execution by one or more computing devices, cause the one or more computing devices to perform operations comprising:
- maintaining, by each of a plurality of compute nodes within a distributed computing system, a respective collection of latency information for communicating with one or more services executing on each other computing node in the distributed computing system, the plurality of compute nodes comprising a first compute node executing a first service and two or more other compute nodes that execute a plurality of at least partially redundant services, wherein the latency information is exchanged between the plurality of compute nodes using an epidemic protocol; and
- using, by the first service on the first compute node, a first collection of the latency information maintained by the first compute node to select, from the plurality of at least partially redundant services, a second service executing on a second compute node with which to interact.
Dependent claims: 10, 11, 12
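For illustration only, the maintain, exchange, and select operations recited in the independent claims above might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation. The names `Node`, `Entry`, `record`, `gossip_with`, `select`, and `gossip_round` are hypothetical. The push-pull exchange with a randomly chosen peer is one common form of epidemic protocol. The per-entry version counter and the rule of falling back to the lowest peer-reported latency when a node has no measurement of its own are assumptions that the claims do not specify.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Key is (observing node, target service). All names here are hypothetical.
Key = Tuple[str, str]


@dataclass(frozen=True)
class Entry:
    latency_ms: float
    version: int  # assumption: a higher version means a fresher observation


@dataclass
class Node:
    name: str
    view: Dict[Key, Entry] = field(default_factory=dict)

    def record(self, service: str, latency_ms: float) -> None:
        """Store this node's own latency measurement for a service."""
        key = (self.name, service)
        prev = self.view.get(key)
        self.view[key] = Entry(latency_ms, prev.version + 1 if prev else 1)

    def merge(self, other_view: Dict[Key, Entry]) -> None:
        """Keep the freshest entry for every (observer, service) pair."""
        for key, entry in other_view.items():
            mine = self.view.get(key)
            if mine is None or entry.version > mine.version:
                self.view[key] = entry

    def gossip_with(self, peer: "Node") -> None:
        """Push-pull exchange: both sides end up with the merged view."""
        snapshot = dict(self.view)
        self.merge(peer.view)
        peer.merge(snapshot)

    def estimate(self, service: str) -> Optional[float]:
        """Own measurement if present, else the lowest peer-reported value."""
        own = self.view.get((self.name, service))
        if own is not None:
            return own.latency_ms
        reported = [e.latency_ms for (_, svc), e in self.view.items()
                    if svc == service]
        return min(reported) if reported else None

    def select(self, candidates: List[str]) -> Optional[str]:
        """Pick the redundant service with the lowest estimated latency."""
        best: Optional[str] = None
        best_ms: Optional[float] = None
        for service in candidates:
            ms = self.estimate(service)
            if ms is not None and (best_ms is None or ms < best_ms):
                best, best_ms = service, ms
        return best


def gossip_round(nodes: List[Node], rng: random.Random) -> None:
    """Each node exchanges its view with one randomly chosen other node."""
    for node in nodes:
        peer = rng.choice([n for n in nodes if n is not node])
        node.gossip_with(peer)


if __name__ == "__main__":
    a, b, c = Node("a"), Node("b"), Node("c")
    b.record("store@c", 8.0)
    b.record("store@d", 30.0)
    for _ in range(3):
        gossip_round([a, b, c], random.Random(0))
    print(a.select(["store@c", "store@d"]))
```

In this sketch the selecting node consults only its locally maintained collection, so the choice among redundant services requires no additional network round trip at selection time.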
Specification