MAINTAINING HIGH AVAILABILITY OF A GROUP OF VIRTUAL MACHINES USING HEARTBEAT MESSAGES
First Claim
1. A system for maintaining high availability of a plurality of virtual machines in a fault domain, the system comprising:
a network communication interface configured to receive heartbeat messages from a plurality of hosts executing the virtual machines; and
a processor coupled to the memory and programmed to:
associate a datastore with a host, within the plurality of hosts, based on at least one of:
1) quantity of hosts that have access to the datastore,
2) whether the datastore is associated with the same storage device as another datastore, and
3) the file system type of the datastore;
determine whether the host is an unreachable host or an inoperative host based on the received heartbeat messages and heartbeat data stored in the datastore; and
restart a virtual machine that ceases executing on the unreachable host or executed by the inoperative host based on determining that the host is the unreachable host or the inoperative host.
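The datastore association step recited above can be illustrated with a minimal sketch. The claim names three criteria but does not say how they are weighed, so everything here is an assumption for illustration only: the `Datastore` record, the `select_heartbeat_datastores` helper, the file system labels in `FS_PREFERENCE`, and the policy itself (prefer datastores reachable by more hosts, prefer certain file system types, and skip a datastore backed by the same storage device as one already chosen).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datastore:
    """Hypothetical record describing a candidate heartbeat datastore."""
    name: str
    accessible_hosts: frozenset  # identifiers of hosts that can access it
    storage_device: str          # identifier of the backing storage device
    fs_type: str                 # file system type label (illustrative)


# Assumed preference order for file system types; lower is preferred.
FS_PREFERENCE = {"vmfs": 0, "nfs": 1}


def select_heartbeat_datastores(host, datastores, count=2):
    """Pick up to `count` datastores to associate with `host`.

    Assumed policy: rank by number of hosts with access (more is better),
    then by file system preference, and avoid picking two datastores that
    share a storage device so one device failure does not hide all
    heartbeat data.
    """
    candidates = [d for d in datastores if host in d.accessible_hosts]
    candidates.sort(
        key=lambda d: (
            -len(d.accessible_hosts),
            FS_PREFERENCE.get(d.fs_type, 99),
            d.name,  # tie-breaker for a deterministic result
        )
    )

    chosen = []
    used_devices = set()
    for d in candidates:
        if d.storage_device in used_devices:
            continue  # same storage device as an already chosen datastore
        chosen.append(d)
        used_devices.add(d.storage_device)
        if len(chosen) == count:
            break
    return chosen


example_datastores = [
    Datastore("a", frozenset({"h1", "h2", "h3"}), "dev1", "vmfs"),
    Datastore("b", frozenset({"h1", "h2", "h3"}), "dev1", "nfs"),
    Datastore("c", frozenset({"h1", "h2"}), "dev2", "nfs"),
    Datastore("d", frozenset({"h2"}), "dev3", "vmfs"),
]
selected = [d.name for d in select_heartbeat_datastores("h1", example_datastores)]
```

In this example, datastore "b" is skipped because it shares a storage device with "a", and "d" is never considered because host "h1" cannot access it.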
Abstract
Embodiments maintain high availability of software application instances in a fault domain. Subordinate hosts are monitored by a master host. The subordinate hosts publish heartbeats via a network and datastores. Based at least in part on the published heartbeats, the master host determines the status of each subordinate host, distinguishing between subordinate hosts that are entirely inoperative and subordinate hosts that are operative but partitioned (e.g., unreachable via the network). The master host may restart software application instances, such as virtual machines, that are executed by inoperative subordinate hosts or that cease executing on partitioned subordinate hosts.
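The status determination described in the abstract can be sketched as follows. This is an illustrative reading, not the patented implementation: the `HostStatus` names, the `classify_host` and `vms_to_restart` helpers, the timestamp-based freshness check, and the 15-second timeout are all assumptions. The idea shown is that a subordinate host with fresh network heartbeats is live, one with stale network heartbeats but fresh datastore heartbeats is operative but partitioned (unreachable), and one with neither is inoperative.

```python
from enum import Enum


class HostStatus(Enum):
    LIVE = "live"
    UNREACHABLE = "unreachable"  # operative but partitioned from the network
    INOPERATIVE = "inoperative"  # no sign of life on network or datastore


def classify_host(now, last_network_hb, last_datastore_hb, timeout=15.0):
    """Classify a subordinate host from its two heartbeat channels.

    `last_network_hb` and `last_datastore_hb` are timestamps in seconds
    (or None if no heartbeat was ever seen). The timeout is hypothetical.
    """
    def fresh(ts):
        return ts is not None and now - ts <= timeout

    if fresh(last_network_hb):
        return HostStatus.LIVE
    if fresh(last_datastore_hb):
        return HostStatus.UNREACHABLE
    return HostStatus.INOPERATIVE


def vms_to_restart(status, host_vms, still_running):
    """Return the virtual machines the master host may restart.

    For an inoperative host, every virtual machine it was executing is a
    candidate. For an unreachable host, only virtual machines that have
    ceased executing are candidates; `still_running` is assumed to come
    from data the host keeps publishing to the datastore.
    """
    if status is HostStatus.INOPERATIVE:
        return sorted(host_vms)
    if status is HostStatus.UNREACHABLE:
        return sorted(set(host_vms) - set(still_running))
    return []


status = classify_host(now=100.0, last_network_hb=60.0, last_datastore_hb=95.0)
restart_list = vms_to_restart(status, ["vm1", "vm2", "vm3"], still_running=["vm2"])
```

Here the host stopped sending network heartbeats 40 seconds ago but wrote datastore heartbeat data 5 seconds ago, so it is treated as unreachable, and only the virtual machines no longer running on it are restart candidates.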
6 Citations
27 Claims
1. A system for maintaining high availability of a plurality of virtual machines in a fault domain, the system comprising:
a network communication interface configured to receive heartbeat messages from a plurality of hosts executing the virtual machines; and
a processor coupled to the memory and programmed to:
associate a datastore with a host, within the plurality of hosts, based on at least one of:
1) quantity of hosts that have access to the datastore,
2) whether the datastore is associated with the same storage device as another datastore, and
3) the file system type of the datastore;
determine whether the host is an unreachable host or an inoperative host based on the received heartbeat messages and heartbeat data stored in the datastore; and
restart a virtual machine that ceases executing on the unreachable host or executed by the inoperative host based on determining that the host is the unreachable host or the inoperative host.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method comprising:
associating a datastore with a host based on at least one of:
1) quantity of hosts that have access to the datastore,
2) whether the datastore is associated with the same storage device as another datastore, and
3) the file system type of the datastore;
determining whether the host is an unreachable host or an inoperative host based on a received heartbeat message and heartbeat data stored in the datastore; and
restarting a virtual machine that ceases executing on the unreachable host or executed by the inoperative host based on determining that the host is the unreachable host or the inoperative host.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A computer readable non-transitory storage medium storing instructions which when executed by a computer cause the computer to perform a method, the method comprising:
associating a datastore with a host based on at least one of:
1) quantity of hosts that have access to the datastore,
2) whether the datastore is associated with the same storage device as another datastore, and
3) the file system type of the datastore;
determining whether the host is an unreachable host or an inoperative host based on a received heartbeat message and heartbeat data stored in the datastore; and
restarting a virtual machine that ceases executing on the unreachable host or executed by the inoperative host based on determining that the host is the unreachable host or the inoperative host.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27
Specification