Specifications

29 Deploying VMware vCenter Site Recovery Manager with NetApp FAS/V-Series Storage Systems

9. If necessary and possible, isolate the primary site to prevent conflicts in case infrastructure services are reestablished without warning.

9 RESYNC AFTER RECOVERY

9.1 REQUIREMENTS AND ASSUMPTIONS

After a disaster has been overcome, it is usually necessary to return operations to the primary site. From a storage standpoint, there are three high-level scenarios to consider when resyncing the environment back to the primary site:

A. Resyncing the environment if the primary storage has been recovered. This involves the use of the

SnapMirror resync process, which requires that the primary and DR NetApp volumes have a common

NetApp Snapshot copy that the SnapMirror software will recognize as a consistent point from which to

resync.

B. Resyncing the environment if the primary site storage was lost. In this case the primary storage might

have been destroyed, or a common Snapshot copy no longer exists between the primary and DR

volumes. This requires that the entire volume be initialized (a complete retransfer) back to the primary

site using the SnapMirror initialize function.

C. The third scenario is some combination of scenarios A and B, where loss of data might have occurred

for some of the volumes at the primary site, but not all of them. In this case either a SnapMirror resync

or SnapMirror initialize process would be done for each volume as appropriate.
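The per-volume choice described in scenarios A through C can be sketched as a small planning helper. This is a minimal illustration, not a NetApp tool: the `VolumePair` type, the function names, and the filer and volume names are all hypothetical, and comparing Snapshot names is a simplification of what the SnapMirror software actually checks. The command text assumes the Data ONTAP 7-Mode form `snapmirror <action> -S <source> <destination>` and should be verified against the installed release. Nothing is executed; the helper only builds strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumePair:
    """One replicated volume pair. All names are illustrative assumptions."""
    primary: str                  # e.g. "siteA-filer:vm_vol1" (hypothetical)
    dr: str                       # e.g. "siteB-filer:vm_vol1_dr" (hypothetical)
    primary_snapshots: frozenset  # Snapshot names on the primary volume (empty if lost)
    dr_snapshots: frozenset       # Snapshot names on the DR volume


def reverse_sync_action(pair: VolumePair) -> str:
    """Return 'resync' when a common Snapshot copy exists, else 'initialize'."""
    common = pair.primary_snapshots & pair.dr_snapshots
    return "resync" if common else "initialize"


def reverse_sync_command(pair: VolumePair) -> str:
    """Build the command text for the reversed relationship (DR to primary)."""
    action = reverse_sync_action(pair)
    # After failover the DR volume is the source and the primary is the destination.
    return f"snapmirror {action} -S {pair.dr} {pair.primary}"


def classify_scenario(pairs) -> str:
    """Map a set of volume pairs onto scenario A, B, or C from the text."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one volume pair is required")
    actions = {reverse_sync_action(p) for p in pairs}
    if actions == {"resync"}:
        return "A"  # every volume still shares a Snapshot copy
    if actions == {"initialize"}:
        return "B"  # every volume needs a complete retransfer
    return "C"      # mixed: decide per volume


if __name__ == "__main__":
    pairs = [
        VolumePair("siteA:vol1", "siteB:vol1_dr",
                   frozenset({"snap1", "snap2"}), frozenset({"snap2", "snap3"})),
        VolumePair("siteA:vol2", "siteB:vol2_dr",
                   frozenset(), frozenset({"snap3"})),
    ]
    print("scenario", classify_scenario(pairs))
    for p in pairs:
        print(reverse_sync_command(p))
```

Treating each volume independently keeps scenario C from needing special handling: it is simply the case where the per-volume answers differ.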

The vSphere release of SRM provides no additional capability for automatic reversal and failback of storage

replication or virtual machines.

DR failback requires the same processes as DR failover; the primary difference is that it occurs at a preplanned time. To perform a proper, controlled failback, use SRM by building protection groups and recovery plans in the reverse direction.

This method of failback provides the following advantages over manual failback or scripted failback and brute-force startup of virtual machines, all of which are as critical to a successful failback as they were to a successful failover:

• Is supported by VMware.

• Provides failback testing.

• Supports all storage protocols on all platforms.

• Provides virtual machine reconfiguration on failback, such as changes in IP addresses.

• Supports VMs using multiple datastores.

Note: VMFS datastores are resignatured by SRM on failover. To avoid broken links to virtual disks in multiple datastores, VMs must be properly reconfigured by SRM on failback.

• Allows configuration of startup order of VMs so dependencies can be honored.

Resyncing after a disaster recovery requires the following processes:

1. Isolating the original environment prior to powering it on. This might be necessary to prevent outdated virtual machines from starting on the network and to avoid any other conflicts with the DR environment (required only for scenario A).

2. Recovering and/or replacing the infrastructure at the primary site.

3. Reestablishing network connectivity between the sites.

4. Reversing the SnapMirror relationships (resync or initialize).

5. Building reverse SRM relationships.

6. Scheduling an outage to perform a controlled failover.

7. Running an SRM recovery plan in the direction opposite to the one used in the disaster recovery.

8. Reestablishing normal operations. This involves reversing the SnapMirror and SRM relationships again

to establish the original primary-to-DR site replication and protection.
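The eight processes above can also be kept as an ordered checklist that is filtered by scenario. The sketch below is only an illustration of that idea: the `FAILBACK_STEPS` table and the `failback_checklist` function are hypothetical names, the step wording is condensed from the list above, and the rule that step 1 applies only to scenario A is taken literally from the text.

```python
# Each entry: (step description, scenarios in which the step applies).
ALL_SCENARIOS = frozenset({"A", "B", "C"})

FAILBACK_STEPS = [
    ("Isolate the original environment before powering it on", frozenset({"A"})),
    ("Recover and/or replace the infrastructure at the primary site", ALL_SCENARIOS),
    ("Reestablish network connectivity between the sites", ALL_SCENARIOS),
    ("Reverse the SnapMirror relationships (resync or initialize)", ALL_SCENARIOS),
    ("Build reverse SRM relationships", ALL_SCENARIOS),
    ("Schedule an outage to perform a controlled failover", ALL_SCENARIOS),
    ("Run an SRM recovery plan in the opposite direction", ALL_SCENARIOS),
    ("Reestablish normal operations (reverse SnapMirror and SRM again)", ALL_SCENARIOS),
]


def failback_checklist(scenario: str) -> list:
    """Return the numbered steps that apply to scenario 'A', 'B', or 'C'."""
    if scenario not in ALL_SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario!r}")
    steps = [text for text, applies_to in FAILBACK_STEPS if scenario in applies_to]
    # Renumber so the printed checklist has no gaps when a step is skipped.
    return [f"{number}. {text}" for number, text in enumerate(steps, start=1)]


if __name__ == "__main__":
    for line in failback_checklist("B"):
        print(line)
```

Keeping the order in a single table makes it harder to run a recovery plan before the reversed SnapMirror relationships exist, which is the dependency the list above implies.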