Table Of Contents
- 1 Introduction
- 2 VMware SRM terminology
- 3 Overview and prerequisites
- 4 Configuring array based replication
- 5 Installation and configuration of VMware SRM
- 6 SRM protection groups
- 7 Recovery plans
- 8 Testing
- 9 Recovery
- 10 Failback
- 11 Considerations for guest iSCSI connected volumes
- 12 Summary
- A Technical support and resources

Disaster Recovery with Dell PS Series SANs and VMware vSphere Site Recovery Manager | TR1073
10 Failback
Failback is the process that brings the recovered VMs at the DR site back to the original protected site after a
full recovery plan has been run. There can be multiple reasons for enacting the full recovery plan and moving
production VMs from the protected site to the recovery site, ranging from a power outage, an equipment outage,
or a planned migration to a true disaster. In each of these cases, careful consideration must be given to bringing
the existing environment back onto the original protected site.
Regardless of the reason that the recovery site is now servicing production VMs, there are two basic
scenarios for failback: either the original SAN on the protected site is still in service and holds some subset of
the production data from before the failover, or the SAN is effectively new because the hardware has been
replaced or re-initialized. Depending on the reason for failover, a single failback may even combine both
approaches. Site Recovery Manager provides the ability to fail back to the original
protected site using a process called reprotect. Reprotect is only available when the original protected site
and its associated data are still available.
With careful planning, bringing the recovery site virtual environment back into production on the original
protected site can happen with very little downtime.
During planning, the roles of protected site and recovery site may change. This section refers to the original
protected site, which held the production data, as site A, and to the original recovery site, which served as the
failover destination, as site B.
10.1 Recovery scenario 1: Reprotect and failback
The first scenario is where the original protected site A still has a functioning SAN with some subset of data.
The failover could have been invoked due to a planned hardware outage or an unplanned power failure, but
not by anything that affected the underlying server and storage environment. While disaster recovery failovers
are seldom planned, the failback process can be planned and controlled to ensure that there is no loss of data
and minimal disruption.
A controlled failback is one in which the administrator has the time and ability to schedule downtime and
prepare for failing back from site B to site A. Administrators can take their time devising a strategy to migrate
back to site A with all of the data that was written since the failover occurred. Because failback is done at the
volume and datastore layer, administrators need to ensure that all of the VMs that reside on the volume are
shut down to ensure data consistency. Also, if there are VMs that span multiple volumes, all of these volumes
need to be failed back at the same time to guarantee that the VMs operate correctly back at site A.
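The two planning constraints above (every VM on a volume must be shut down, and volumes shared by a spanning VM must move together) can be checked before scheduling downtime. The following is a minimal planning sketch only, not part of SRM or the PS Series tooling: the function names, the `vm_volumes` mapping, and the sample inventory are all hypothetical, and in practice the inventory would come from whatever record of VM-to-volume placement the administrator keeps.

```python
from collections import defaultdict


def failback_groups(vm_volumes):
    """Group volumes that must be failed back at the same time.

    vm_volumes (hypothetical input) maps a VM name to the set of volumes
    its disks reside on. Two volumes end up in the same group if some VM
    spans both, directly or through a chain of other spanning VMs.
    """
    parent = {}

    def find(v):
        # Union-find lookup with path halving.
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for volumes in vm_volumes.values():
        vols = sorted(volumes)
        for v in vols:
            find(v)  # register every volume, even single-volume VMs
        for other in vols[1:]:
            union(vols[0], other)

    groups = defaultdict(set)
    for v in list(parent):
        groups[find(v)].add(v)
    return sorted(sorted(g) for g in groups.values())


def vms_blocking_failback(group, vm_volumes, powered_on):
    """Return the VMs that use any volume in the group and are still running."""
    group = set(group)
    return sorted(
        vm for vm, vols in vm_volumes.items()
        if vm in powered_on and group & set(vols)
    )


# Hypothetical inventory: db01 spans vol1 and vol2, so they form one group.
inventory = {
    "web01": {"vol1"},
    "db01": {"vol1", "vol2"},
    "file01": {"vol3"},
}
groups = failback_groups(inventory)
blocking = vms_blocking_failback(groups[0], inventory, {"db01", "file01"})
```

With this sample inventory, `vol1` and `vol2` form a single failback group because `db01` spans both, and `db01` is reported as still needing to be shut down before that group can be failed back.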
10.1.1 Reprotect
SRM provides the ability to fail back to the original protected site using a process called reprotect. Reprotect
automates the process of re-establishing replication from the array at site B (the recovery site) back
to the array at site A (the protected site). Reprotect does not itself fail back; it configures everything so that
you can test going back to the original protected site A and, if testing proves successful, perform a planned
migration. During the reprotect, fast failback can shorten the time required for the replication sync back. At a
high level, the process is as follows:
1. Demote the protected site A volume to an inbound replica set.