6.7

Solution

If DRS does not place or evacuate FT VMs in the cluster, check the VMs for a VM override that is

disabling DRS. If you find one, remove it.

Note: For more information on how to edit or delete VM overrides, see vSphere Resource Management.

Fault Tolerant Virtual Machine Failovers

A Primary or Secondary VM can fail over even though its ESXi host has not crashed. In such cases,

virtual machine execution is not interrupted, but redundancy is temporarily lost. To avoid this type of

failover, be aware of the situations in which it can occur and take steps to prevent them.

Partial Hardware Failure Related to Storage

This problem can arise when access to storage is slow or down for one of the hosts. When this occurs,

many storage errors are listed in the VMkernel log. To resolve this problem, address the underlying

storage-related problems.

Partial Hardware Failure Related to Network

If the logging NIC is not functioning or connections to other hosts through that NIC are down, this can

trigger a fault tolerant virtual machine to be failed over so that redundancy can be reestablished. To avoid

this problem, dedicate a separate NIC to each of vMotion and FT logging traffic, and perform vMotion

migrations only when the virtual machines are less active.

Insucient Bandwidth on the Logging NIC Network

This can happen when too many fault tolerant virtual machines are on a host. To resolve this

problem, distribute pairs of fault tolerant virtual machines more broadly across different hosts.

Use a 10-Gbit logging network for FT and verify that the network is low latency.
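
To make the distribution advice concrete, the following is a minimal sketch in Python, not a VMware tool. It counts how many FT VMs (Primary or Secondary) land on each host and lists the hosts that exceed a chosen limit. The inventory tuples, the host names, and the max_per_host value are hypothetical; in practice the placement data would come from your own inventory export.

```python
from collections import Counter


def ft_vms_per_host(placements):
    """Count FT VMs (Primary or Secondary) running on each host.

    placements: iterable of (vm_name, primary_host, secondary_host) tuples.
    """
    counts = Counter()
    for _vm_name, primary_host, secondary_host in placements:
        counts[primary_host] += 1
        counts[secondary_host] += 1
    return counts


def crowded_hosts(placements, max_per_host):
    """Return the sorted names of hosts carrying more than max_per_host FT VMs."""
    counts = ft_vms_per_host(placements)
    return sorted(host for host, n in counts.items() if n > max_per_host)


# Hypothetical inventory: three FT pairs whose Primaries all share esx01.
placements = [
    ("vm-a", "esx01", "esx02"),
    ("vm-b", "esx01", "esx03"),
    ("vm-c", "esx01", "esx02"),
]

# max_per_host=2 is an illustrative limit, not a documented value.
print(crowded_hosts(placements, max_per_host=2))
```

Hosts reported by crowded_hosts are candidates for moving one side of an FT pair elsewhere, so that logging traffic is spread over more NICs.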

vMotion Failures Due to Virtual Machine Activity Level

If the vMotion migration of a fault tolerant virtual machine fails, the virtual machine might need to be failed

over. Usually, this occurs when the virtual machine is too active for the migration to be completed with

only minimal disruption to the activity. To avoid this problem, perform vMotion migrations only when the

virtual machines are less active.

Too Much Activity on VMFS Volume Can Lead to Virtual Machine Failovers

When a large number of file system locking operations, virtual machine power ons, power offs, or vMotion

migrations occur on a single VMFS volume, this can trigger fault tolerant virtual machines to be failed

over. A symptom that this might be occurring is receiving many warnings about SCSI reservations in the

VMkernel log. To resolve this problem, reduce the number of file system operations or ensure that the

fault tolerant virtual machine is on a VMFS volume that does not have an abundance of other virtual

machines that are regularly being powered on, powered off, or migrated using vMotion.
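
As a rough way to spot the symptom described above, the sketch below counts log lines that mention reservations and compares the count with a threshold. It is an illustration under stated assumptions: the keyword match on "reservation", the threshold, and the sample lines are all made up for this example and do not reproduce real VMkernel log output.

```python
import re

# Assumption: lines of interest contain the word "reservation" (any case).
RESERVATION_RE = re.compile(r"reservation", re.IGNORECASE)


def count_reservation_lines(lines):
    """Count the lines that mention a reservation."""
    return sum(1 for line in lines if RESERVATION_RE.search(line))


def exceeds_threshold(lines, threshold):
    """Return True when more than `threshold` lines mention a reservation."""
    return count_reservation_lines(lines) > threshold


# Placeholder lines for illustration only; not real VMkernel log output.
sample_lines = [
    "example-line-1 SCSI reservation conflict on placeholder-device",
    "example-line-2 unrelated message",
    "example-line-3 Reservation warning on placeholder-device",
]

print(count_reservation_lines(sample_lines))
```

Because the functions accept any iterable of strings, an open file handle for a copy of the VMkernel log can be passed in directly. A count that climbs during periods of heavy power-on, power-off, or vMotion activity suggests moving the fault tolerant virtual machine to a quieter VMFS volume.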

vSphere Availability

VMware, Inc.