Coordinated Garbage Collection in Distributed Systems
First Claim
1. A system, comprising:
a plurality of computing nodes, each of which comprises at least one processor and a memory, and each of which hosts one or more virtual machine instances, wherein each virtual machine instance has its own heap memory that is not shared with other virtual machine instances; and
a garbage collection coordinator;
wherein each of the virtual machine instances executes a respective process of a distributed application that communicates with one or more other ones of the processes of the distributed application executing on respective other virtual machine instances;
wherein the garbage collection coordinator is configured to:
determine that a collection should be performed on one of the virtual machine instances dependent, at least in part, on whether a collection is being performed, or is about to be performed, on one or more other ones of the virtual machine instances; and
initiate collection on the one of the virtual machine instances.
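The two coordinator steps recited above (determine, then initiate) can be illustrated with a minimal sketch. It assumes each virtual machine instance exposes one of three states; the names `GcState`, `VmInstance`, `should_collect`, and `initiate_collection` are hypothetical and do not come from the claim.

```python
from dataclasses import dataclass
from enum import Enum


class GcState(Enum):
    RUNNING = "running"                # no collection pending (assumed state)
    ABOUT_TO_COLLECT = "about"         # collection is imminent (assumed state)
    COLLECTING = "collecting"          # collection pause in progress (assumed state)


@dataclass
class VmInstance:
    name: str
    state: GcState = GcState.RUNNING


def should_collect(target, others):
    """Decide whether `target` should collect, based on what the other instances are doing."""
    if target.state is GcState.COLLECTING:
        return False  # already collecting; nothing to initiate
    busy = (GcState.ABOUT_TO_COLLECT, GcState.COLLECTING)
    return any(other.state in busy for other in others)


def initiate_collection(target):
    # Stand-in for the real trigger, which the claim does not specify.
    target.state = GcState.COLLECTING


vms = [VmInstance("vm-a"), VmInstance("vm-b", GcState.ABOUT_TO_COLLECT), VmInstance("vm-c")]
for vm in vms:
    others = [other for other in vms if other is not vm]
    if should_collect(vm, others):
        initiate_collection(vm)
```

In this sketch a single instance reporting an imminent collection causes every instance to collect, so the pauses overlap instead of occurring one after another.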
Abstract
Fast modern interconnects may be exploited to control when garbage collection is performed on the nodes (e.g., virtual machines, such as JVMs) of a distributed system in which the individual processes communicate with each other and in which the heap memory is not shared. A garbage collection coordination mechanism (a coordinator implemented by a dedicated process on a single node or distributed across the nodes) may obtain or receive state information from each of the nodes and apply one of multiple supported garbage collection coordination policies to reduce the impact of garbage collection pauses, dependent on that information. For example, if the information indicates that a node is about to collect, the coordinator may trigger a collection on all of the other nodes (e.g., synchronizing collection pauses for batch-mode applications where throughput is important) or may steer requests to other nodes (e.g., for interactive applications where request latencies are important).
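The difference between the two example policies in the abstract can be sketched as a single selection function. This is an illustration under stated assumptions, not the patented mechanism: `Policy`, `apply_policy`, and the node names are hypothetical, and node state is reduced to one boolean ("about to collect").

```python
from enum import Enum


class Policy(Enum):
    SYNCHRONIZE_PAUSES = "synchronize"   # batch-mode applications: throughput matters
    STEER_REQUESTS = "steer"             # interactive applications: request latency matters


def apply_policy(policy, about_to_collect):
    """Return (nodes_to_trigger, nodes_to_route_requests_to).

    `about_to_collect` maps a node name to True when that node reports
    that a collection is imminent (a simplifying assumption).
    """
    nodes = sorted(about_to_collect)
    if not any(about_to_collect.values()):
        return [], nodes  # nothing imminent: no triggers, route everywhere
    not_imminent = [n for n in nodes if not about_to_collect[n]]
    if policy is Policy.SYNCHRONIZE_PAUSES:
        # Trigger collection on all other nodes so the pauses coincide.
        return not_imminent, nodes
    # STEER_REQUESTS: trigger nothing, send requests only to nodes not about to pause.
    return [], not_imminent


state = {"n1": False, "n2": True, "n3": False}
sync_result = apply_policy(Policy.SYNCHRONIZE_PAUSES, state)
steer_result = apply_policy(Policy.STEER_REQUESTS, state)
```

With `n2` about to collect, the synchronizing policy selects `n1` and `n3` for collection, while the steering policy leaves them alone and routes requests to them instead.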
38 Citations
20 Claims
1. A system, comprising:
a plurality of computing nodes, each of which comprises at least one processor and a memory, and each of which hosts one or more virtual machine instances, wherein each virtual machine instance has its own heap memory that is not shared with other virtual machine instances; and
a garbage collection coordinator;
wherein each of the virtual machine instances executes a respective process of a distributed application that communicates with one or more other ones of the processes of the distributed application executing on respective other virtual machine instances;
wherein the garbage collection coordinator is configured to:
determine that a collection should be performed on one of the virtual machine instances dependent, at least in part, on whether a collection is being performed, or is about to be performed, on one or more other ones of the virtual machine instances; and
initiate collection on the one of the virtual machine instances.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
10. A method, comprising:
performing, by a plurality of computing nodes:
beginning execution of a distributed application, wherein executing the distributed application comprises performing one or more particular operations that when performed on one of the plurality of computing nodes delay operations on one or more other ones of the plurality of computing nodes until it is complete;
determining, during execution of the distributed application, that a given one of the plurality of computing nodes should perform one of the particular operations; and
performing, by the given one of the plurality of computing nodes, the one of the particular operations;
wherein said determining comprises applying a coordination policy that is dependent, at least in part, on whether another one of the plurality of computing nodes is, or is about to, execute one of the particular operations.
Dependent claims: 11, 12, 13, 14, 15, 16.
17. A non-transitory, computer-readable storage medium storing program instructions that when executed on one or more computing nodes cause the one or more computing nodes to implement a garbage collection coordinator and an application programming interface for communication with the garbage collection coordinator;
wherein the garbage collection coordinator is configured to:
receive information from each of one or more of a plurality of computing nodes on which a respective process of a distributed application is executing, wherein the respective process of the distributed application executing on each of the plurality of computing nodes communicates with one or more other ones of the respective processes executing on other ones of the computing nodes, and wherein the respective processes of the distributed application executing on each of the plurality of computing nodes do not share heap memory; and
initiate a garbage collection on at least one of the plurality of computing nodes dependent on the information;
wherein the information is usable to determine whether or not the computing node is ready to receive communications from other ones of the processes of the distributed application;
wherein the information is received via an operation defined by the application programming interface; and
wherein the garbage collection coordinator is configured to initiate the garbage collection via an operation defined by the application programming interface.
Dependent claims: 18, 19, 20.
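Claim 17 recites an application programming interface with one operation for delivering node information to the coordinator and one for initiating a collection. A minimal sketch of such an interface might look as follows; every name here (`CoordinatorApi`, `register_node`, `report_state`, `initiate_collection`, `collect_with_peers`) is hypothetical, and the readiness information is assumed to be a single boolean per node.

```python
from typing import Callable, Dict, List

# Hypothetical policy type: maps readiness reports to the nodes that should collect.
CollectionPolicy = Callable[[Dict[str, bool]], List[str]]


class CoordinatorApi:
    """Illustrative API surface: one operation to report state, one to initiate GC."""

    def __init__(self, policy: CollectionPolicy) -> None:
        self._policy = policy
        self._ready: Dict[str, bool] = {}
        self._triggers: Dict[str, Callable[[], None]] = {}

    def register_node(self, node_id: str, trigger_gc: Callable[[], None]) -> None:
        # Each node supplies a callback that starts a collection on it.
        self._triggers[node_id] = trigger_gc
        self._ready[node_id] = True

    def report_state(self, node_id: str, ready_to_receive: bool) -> List[str]:
        """Operation through which a node sends its information to the coordinator."""
        self._ready[node_id] = ready_to_receive
        targets = self._policy(dict(self._ready))
        for target in targets:
            self.initiate_collection(target)
        return targets

    def initiate_collection(self, node_id: str) -> None:
        """Operation through which the coordinator starts a collection on a node."""
        self._triggers[node_id]()


def collect_with_peers(ready: Dict[str, bool]) -> List[str]:
    # Example policy: once any node stops accepting messages, collect on the rest too.
    if all(ready.values()):
        return []
    return sorted(node for node, is_ready in ready.items() if is_ready)


collected: List[str] = []
api = CoordinatorApi(collect_with_peers)
for name in ("n1", "n2", "n3"):
    # The default argument binds the current name to each callback.
    api.register_node(name, lambda name=name: collected.append(name))

api.report_state("n2", ready_to_receive=False)
print(collected)  # ['n1', 'n3']
```

Keeping the policy as a plain callable separates the interface operations from the decision logic, so a different coordination policy can be substituted without changing how nodes report state or receive triggers.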
Specification