6.7
The Primary and Secondary VMs continuously monitor the status of one another to ensure that Fault
Tolerance is maintained. A transparent failover occurs if the host running the Primary VM fails, in which
case the Secondary VM is immediately activated to replace the Primary VM. A new Secondary VM is
started and Fault Tolerance redundancy is reestablished automatically. If the host running the Secondary
VM fails, the Secondary VM is also immediately replaced. In either case, users experience no interruption in service and no
loss of data.
A fault tolerant virtual machine and its secondary copy are not allowed to run on the same host. This
restriction ensures that a host failure cannot result in the loss of both VMs.
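The failover and placement behavior described above can be illustrated with a small Python sketch. This is a simplified model, not the vSphere implementation: the FTPair class, the function names, and the host names are hypothetical, and real placement also depends on host compatibility and available resources.

```python
from dataclasses import dataclass


@dataclass
class FTPair:
    """Hypothetical model of an FT pair: which host runs each copy."""
    primary_host: str
    secondary_host: str


def pick_secondary_host(hosts, primary_host):
    """Return the first host that is not running the Primary VM."""
    for host in hosts:
        if host != primary_host:
            return host
    raise RuntimeError("no compatible host available for the Secondary VM")


def handle_host_failure(pair, failed_host, healthy_hosts):
    """Return the pair's state after failed_host goes down.

    healthy_hosts must not include failed_host.
    """
    if failed_host == pair.primary_host:
        # The Secondary VM takes over as Primary, and a new Secondary
        # is started on a different host.
        new_primary = pair.secondary_host
        return FTPair(new_primary, pick_secondary_host(healthy_hosts, new_primary))
    if failed_host == pair.secondary_host:
        # The Primary VM keeps running; only the Secondary is replaced.
        return FTPair(
            pair.primary_host,
            pick_secondary_host(healthy_hosts, pair.primary_host),
        )
    return pair  # the failure did not affect this pair
```

For example, calling handle_host_failure(FTPair("esx-a", "esx-b"), "esx-a", ["esx-b", "esx-c"]) yields a pair whose Primary VM runs on esx-b and whose new Secondary VM runs on esx-c, so the two copies never share a host.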
Note: You can also use VM-Host affinity rules to dictate which hosts designated virtual machines can run
on. If you use these rules, be aware that for any Primary VM that is affected by such a rule, its associated
Secondary VM is also affected by that rule. For more information about affinity rules, see the vSphere
Resource Management documentation.
Fault Tolerance avoids "split-brain" situations, which can lead to two active copies of a virtual machine
after recovery from a failure. Atomic file locking on shared storage is used to coordinate failover so that
only one side continues running as the Primary VM and a new Secondary VM is respawned automatically.
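The general idea of using an atomic file operation to pick a single winner can be sketched in Python as follows. This is a generic illustration on an ordinary filesystem, not the locking mechanism vSphere uses on shared storage. The function name, lock file name, and owner labels are hypothetical.

```python
import os
import tempfile


def try_become_primary(lock_path, owner):
    """Try to claim the Primary role by creating a lock file atomically.

    O_CREAT | O_EXCL makes creation fail if the file already exists,
    so only one caller can succeed. Returns True for the winner.
    """
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        # Another copy already holds the lock and continues as Primary.
        return False
    with os.fdopen(fd, "w") as lock_file:
        lock_file.write(owner)  # record which copy won
    return True


if __name__ == "__main__":
    lock = os.path.join(tempfile.mkdtemp(), "ft-primary.lock")
    print(try_become_primary(lock, "vm-copy-1"))  # True
    print(try_become_primary(lock, "vm-copy-2"))  # False
```

Because the create-if-absent step is a single atomic operation, two copies that lose contact with each other cannot both conclude that they are the Primary VM.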
vSphere Fault Tolerance can accommodate symmetric multiprocessor (SMP) virtual machines with up to
four vCPUs.
Fault Tolerance Use Cases
Several typical situations can benefit from the use of vSphere Fault Tolerance.
Fault Tolerance provides a higher level of business continuity than vSphere HA. When a Secondary VM is
called upon to replace its Primary VM counterpart, the Secondary VM immediately takes over the Primary
VM’s role with the entire state of the virtual machine preserved. Applications are already running, and
data stored in memory does not need to be reentered or reloaded. By contrast, failover provided by vSphere HA
restarts the virtual machines affected by a failure.
This higher level of continuity and the added protection of state information and data inform the
scenarios in which you might want to deploy Fault Tolerance.
- Applications which must always be available, especially applications that have long-lasting client
  connections that users want to maintain during hardware failure.
- Custom applications that have no other way of doing clustering.
- Cases where high availability might be provided through custom clustering solutions, which are too
  complicated to configure and maintain.
Another key use case for protecting a virtual machine with Fault Tolerance can be described as On-
Demand Fault Tolerance. In this case, a virtual machine is adequately protected with vSphere HA during
normal operation. During certain critical periods, you might want to enhance the protection of the virtual
machine. For example, you might be running a quarter-end report which, if interrupted, might delay the
vSphere Availability
VMware, Inc. 49